Report No. UIUCDCS-R-83-1145
UILU-ENG 83 1723

The Delay/Re-Read Protocol for Concurrency Control in Databases

M. Dennis Mickunas
Pankaj Jalote

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois

March 28, 1983

Research supported in part by Highly Available Database Project, IBM Research Labs.

ABSTRACT

We present a new protocol, called the Delay/Re-Read Protocol, for controlling concurrent access to a database. The protocol uses a combination of preventative and corrective measures for maintaining consistency. The Delay/Re-Read Protocol is deadlock-free, requires no backup data, and does not abort either transactions or database writes.

1. INTRODUCTION

The problem of concurrency in databases has received a good deal of attention in recent years. Errors resulting from unrestricted concurrency were first observed by Eswaran et al. [ESW]. They showed that unrestricted concurrency can result in an inconsistent database. The solution proposed was that each transaction lock the entity it is going to access. Moreover, such a locking protocol typically adopts so-called "two-phase locking", which consists of a "growing phase" and a "shrinking phase". In the growing phase a transaction collects the locks that it requires, and in the shrinking phase it releases them. A transaction cannot request any further locks once it has released any lock. A disadvantage of two-phase locking is that it is not deadlock-free.
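The two-phase rule described above can be illustrated with a minimal Python sketch. The class name `TwoPhaseTransaction` and its methods are hypothetical and do not come from the report; the sketch models only the rule that no lock may be requested after any lock has been released, not conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()        # entities currently locked
        self.shrinking = False   # becomes True after the first release

    def lock(self, entity):
        # Growing phase only: refuse any request made after a release.
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {entity!r} after releasing a lock"
            )
        self.held.add(entity)

    def unlock(self, entity):
        # The first release moves the transaction into its shrinking phase.
        self.held.remove(entity)
        self.shrinking = True
```

A transaction that locks `x` and `y`, releases `x`, and then asks for `z` is rejected by `lock`, which is exactly the restriction stated in the text.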
Since then many variations on locking have been proposed [BAYb, ELL], and it has been demonstrated that locking achieves somewhat better results when the database is structured as a hierarchy [KED, SIL]. However, the use of locking to maintain consistency is an entirely preventive measure, i.e., it tries to prevent any view of the database from becoming inconsistent. For this reason locking assumes the worst case and is often overly restrictive. In an effort to relieve the tight restrictions of locking protocols, Kung [KUN] proposed a corrective measure for concurrency control in which a transaction is permitted to see an inconsistent state, but its view will later be "corrected" and made consistent.

In the present paper we present a new protocol which utilizes both preventative and corrective techniques. The protocol, which we call the Delay/Re-Read Protocol, acts, on the one hand, in a corrective fashion by sometimes forcing a transaction to re-read some data; it does so upon recognizing that a transaction has read an inconsistent set of data. The protocol acts, on the other hand, in a preventative fashion by sometimes imposing a delay before permitting a transaction to write to the database; it does so upon recognizing that such a write might, at the present time, jeopardize the integrity of the database.

This paper is organized as follows. In Section 2 we present our model of a database system. In Section 3 we define our notion of consistency and present some results relating consistency to the ordering of basic actions of transactions. In Section 4 we present the Delay/Re-Read Protocol and prove that it is both consistent and deadlock-free. In Section 5 we discuss some aspects of the Delay/Re-Read Protocol, including its application to distributed databases.

2. SYSTEM MODEL

We consider the database to be a collection of distinct named objects, called entities, together with assertions about the values of the entities.
These assertions, called integrity constraints, are often not explicitly stated; indeed, the space required for full specification of all integrity constraints might exceed that required for the database itself. Nonetheless, whether explicitly stated or not, consistency constraints are present and they govern the interactions of operations upon entities. A database which satisfies all of the integrity constraints is said to be in a consistent state.

In order to formalize our model, we present some definitions. We denote the set of entities in the database by $E$. Such an entity may be read or written indivisibly.

Definition 2.1. A transaction with unique transaction number $k$, denoted $T^k$, is a set of actions $T^k = \{$[...]$\}$ together with a linear ordering [...] on $T^k$. As a notational convenience, we use the subscripts on the actions to reflect this linear ordering, viz. [...]

[...] implies $(i,W,x) <_S (j,R,x)$ which, by anti-symmetry of $<_S$, disallows this case.

(2) $(j,W,x) <_S (i,W,x)$. Since $T^j$ is well-formed, we have $(j,R,x) <_S (j,W,x)$, whence as in case 1 $(i,W,x) <_S (j,R,x)$ and $(i,W,x) <_S (j,W,x)$ which, by anti-symmetry of $<_S$, disallows this case.

(3) $(j,W,x) <_S (i,R,x)$. Since $T^i$ is well-formed, $(i,R,x) <_S (i,W_c,a)$. By definition of critical read, $(j,W_c,b) <_S (j,W,x)$, whence $(j,W_c,b) <_S (i,W_c,a)$. However, by hypothesis (since [...]

[...] $(j,R,z) \in S$ implies either $z \neq x$ or $(i,W,x) <_S (j,R,z)$.

Proof. The proof is by contradiction. Suppose that $G_S$ has a cycle involving nodes $T^{i_1}, \ldots, T^{i_k}$ ($k > 1$). Let $i = \min\{i_1, \ldots, i_k\}$. Now, for every $j \in \{i_1, \ldots, i_k\}$ ($j \neq i$) we have $j > i$, so Lemma 3.1 applies, and there can be no arc in $G_S$ from $T^j$ to $T^i$. Therefore, the presumed cycle involving $T^i$ is contradicted.

Corollary 3.3. Let $T^1, \ldots, T^n$ and $S$ be as in Theorem 3.2. Then $S$ is consistent.

Proof. The proof is immediate since $G_S$ is acyclic.
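The counting argument in the proof of Theorem 3.2 can be checked mechanically with a small Python sketch. The definition of the arcs of $G_S$ does not survive in this copy, so the sketch assumes the arcs are simply handed in as `(source, destination)` pairs of transaction numbers; the function names are hypothetical. It shows only that a graph with no arc from a higher-numbered to a lower-numbered transaction cannot contain a cycle.

```python
from collections import defaultdict, deque


def arcs_respect_numbering(arcs):
    """True if no arc runs from T^j to T^i with j >= i."""
    return all(src < dst for src, dst in arcs)


def has_cycle(nodes, arcs):
    """Detect a cycle by repeatedly removing nodes with no incoming arc."""
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for src, dst in arcs:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while ready:
        n = ready.popleft()
        removed += 1
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    # Any node that could never be removed lies on, or behind, a cycle.
    return removed != len(nodes)
```

For example, the arcs `[(1, 2), (2, 3), (1, 3)]` respect the numbering and `has_cycle` reports no cycle, whereas adding the arc `(3, 1)` violates the numbering and creates one.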
Informally, the theorem lays down the condition to be satisfied by the schedule such that every transaction sees a consistent state, i.e., the set of values returned by the Reads of the transaction is the same as the set of values of these entities in some consistent database state. This does not imply that all the Reads must be performed on the same consistent state. A Read can be performed on any database state, possibly transitory and inconsistent, but the set of values read by all Reads must be such that all the values can co-exist in some consistent database state. Theorem 3.2 specifies the condition under which this is satisfied. This theorem is the basis of the Delay/Re-Read Protocol.

In the following sections $W_i(x)$ and $R_i(x)$ mean the same as $(i,W,x)$ and $(i,R,x)$, respectively.

4. DELAY/RE-READ PROTOCOL

Not all schedules satisfy the condition of Theorem 3.2. The purpose of the Delay/Re-Read Protocol is to take any schedule and "correct" it such that the schedule satisfies the condition of the theorem. Each transaction is submitted to a Transaction Manager, which assigns a Transaction Process (TP) to each transaction. Each TP submits its Read and Write requests to a process called the Concurrency Control Process (CCP); the TP awaits permission from the CCP to proceed with the requested Read or Write. There is also a History File which only the CCP reads or writes. This is different from a "log file", which might record all activity as well as "backup" data. The History File records only a finite window of activity. The CCP, on receiving a Read request, merely records it in the History File (since we exert no control over the Reads). However, a Write can be delayed by the CCP according to the following scheme.

The Delay/Re-Read Protocol is used by the CCP to ensure that any schedule remains consistent. This is accomplished by a combination of preventative and corrective measures.
On one hand, the Delay/Re-Read Protocol sometimes causes the CCP to delay the Critical Write of a TP, thereby delaying all of the Writes of the associated transaction (a preventative action); on the other hand, the Delay/Re-Read Protocol sometimes causes the CCP to instruct a TP to re-read some entities prior to proceeding with a Write, thereby assuring that the Use Set for the Write is consistent (a corrective action).

Since we do not allow any transactions to be "backed up," the Delay/Re-Read Protocol must ensure that whenever any transaction performs a Write, the transaction will be able to complete. For this reason, the initial Write of any transaction is critical. We assume that when the transaction is submitted, its read and write sets are known (actually the read set can be computed while processing the transaction, because the method does not use this information until all the Reads are done, but we need to know the write set). This assumption has been made in SDD-1 [BER], and is implicit in many locking protocols in order to determine whether to request a shared or exclusive lock. This does not place any restrictions on what transactions can read or write, but when the transaction is submitted, its read and write sets should be known, possibly by doing a prepass over the transaction before starting to process it.

At the initial Write, the write set of the transaction is also recorded in the History File. If the write set of $T^i$ is $\{x,y,z\}$ and the first write by $T^i$ is on $x$, then it will be recorded in the History File as $w_i(x)\,w_i(y)\,w_i(z)\,W_i(x)$. All other Reads or Writes are simply recorded as $R_i$(entity-name) or $W_i$(entity-name).

A transaction which has performed at least one Write but has not yet terminated will be called Active and Writing (AW), and the set of all active and writing transactions will be referred to as the Active and Writing set (AWS).

Let us now present the Delay/Re-Read Protocol formally.
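Before the formal statement, the recording rules just described can be sketched in Python. The class `HistoryFile` and its method names are hypothetical, entries are kept as plain tuples rather than as a string, the write set is declared in sorted order as an arbitrary choice, the caller supplies the transaction number, and the "finite window" of the History File is not modeled. The sketch covers only the bookkeeping, not the delay or re-read decisions.

```python
class HistoryFile:
    """Records Reads, Writes and write-set declarations, and tracks the AWS."""

    def __init__(self):
        self.entries = []   # tuples such as ("R", 1, "x"), ("w", 1, "y"), ("W", 1, "x")
        self.aws = set()    # numbers of Active and Writing transactions

    def record_read(self, i, entity):
        # Reads are never controlled; they are only recorded.
        self.entries.append(("R", i, entity))

    def record_write(self, i, entity, write_set):
        if i not in self.aws:
            # Initial write: declare the whole write set as w_i entries first.
            for declared in sorted(write_set):
                self.entries.append(("w", i, declared))
            self.aws.add(i)
        self.entries.append(("W", i, entity))

    def terminate(self, i):
        # A terminated transaction is no longer Active and Writing.
        self.aws.discard(i)

    def as_string(self):
        return " ".join(f"{op}{i}({entity})" for op, i, entity in self.entries)
```

With write set `{x, y, z}` and a first write on `x`, transaction 1 is recorded as `w1(x) w1(y) w1(z) W1(x)`, matching the example in the text, and it stays in `aws` until `terminate(1)` is called.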
Let $x, y, z \in E$. The History File, $H$, is maintained as a string over the alphabet $\Sigma$ [...]

(Footnote 2: Actually, the "transaction number" $i \in [1,n]$ is not assigned by the CCP until the transaction's crit[...])

Let an ellipsis ($\cdots$) denote an arbitrary string over $\Sigma$ (possibly of length zero). Let AWS($j$) be the AWS just before $T^j$ performs its critical write, and TP($j$) the transaction process of $T^j$.

The Delay/Re-Read Protocol is as follows:

Given a request for $W_j(x(U))$,

1. for every $i \in$ AWS($j$) do
2. { for every $y \in U$ do
3.   { if $H = \cdots w_i(y) \cdots$
4.     then { if $H \neq \cdots W_i(y) \cdots$
5.              then await $W_i(y)$
6.            if $H \neq \cdots W_i(y) \cdots R_j(y) \cdots$
7.              then { instruct TP($j$) to:
                         a) re-read $y$
                         b) re-compute $x(U)$
                         c) re-request $W_j(x(U))$
                       halt } } } }
9. authorize $W_j(x(U))$

Informally, the Delay/Re-Read Protocol ensures that there is no arc in $G_S$ from $T^j$ to $T^i$ (where $(i,W_c,a) <_S (j,W_c,b)$). This is done by noticing the situation that a) $T^j$ tries to use some $y \in E$ which b) $T^i$ expects to have written, and by ensuring that c) $T^i$ has indeed performed that Write, and that d) $T^j$'s Read (or Re-Read) of $y$ occurs after $T^i$'s Write. Condition b) is detected in line 3 of the Delay/Re-Read Protocol; the need for the preventative delay of step c) is detected in line 4 of the Delay/Re-Read Protocol; and condition d) is either verified in line 6 of the Delay/Re-Read Protocol or is ensured via a corrective re-read by line 8 of the Delay/Re-Read Protocol.

Claim 5.1. The Delay/Re-Read Protocol is consistent.

Sketch of Proof. The above discussion illustrates that any $R_j(y)$ occurs after all $W_i(y)$ that may occur (for i