 

 UIUCDCS-R-79-980 
 
 UILU-ENG 79 1732 
 
 July 1979 
 
 Subset Dependencies as an Alternative to 
 Embedded Multivalued Dependencies 
 
 by 
 
 Yehoshua Sagiv 
 
 Scott Walecka 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 61801 
 
 July 1979 
 
 (+) The work of this author was supported in part by the National Sci- 
 ence Foundation under grant MCS-77-22830 
 
ABSTRACT 
 
 We show that the inference rules for multivalued dependencies can- 
 not be extended to a complete set of inference rules for embedded mul- 
 tivalued dependencies. A new type of dependencies, called subset depen- 
 dencies, is introduced. Subset dependencies are a generalization of 
 embedded multivalued dependencies. We give a set of inference rules for 
 subset dependencies and investigate their properties. 
 
 CR categories: 4.33 
 
 Key words and phrases: multivalued dependency, embedded multivalued 
 dependency, subset dependency, inference rule, relational database. 
 
1. Introduction
 
 The relational database model [Cod] uses dependencies as a semantic 
tool for expressing properties of the data. Functional [Arm,Cod] and
multivalued dependencies [BFH,Fa1,Zan] are the most common types of
dependencies, and they have been investigated thoroughly (e.g.,
[Bel,BB,Bi1,Bi2,Fa2,HIT,Mak,Men,Nic,Sag,SaF]).
 
 A complete utilization of multivalued dependencies requires that we 
 deal also with embedded multivalued dependencies, i.e., those mul- 
 tivalued dependencies that hold in a projection of a relation but not 
 necessarily in the relation itself. In contrast to functional and mul- 
 tivalued dependencies, the properties of embedded multivalued dependen- 
 cies are substantially unknown. Attempts have been made to extend the 
 inference rules of [BFH] for multivalued dependencies to a complete set 
 of inference rules for embedded multivalued dependencies [TK1,TK2]. 
However, in this paper we show that no such extension exists. The proof
is carried out by showing that for every positive integer n, there is a
set $\Sigma$ of n embedded multivalued dependencies that implies another
embedded multivalued dependency $\sigma$, but the only embedded
multivalued dependencies implied by any proper subset of $\Sigma$ are
those obtained by augmentation and projection.
 
We also introduce a new type of dependencies, called subset
dependencies, that is a generalization of embedded multivalued
dependencies. A set of inference rules for subset dependencies is
presented. This set of rules is not known to be complete. However, it is
superior to the rules of [BFH] in the following sense. We show the
existence of a subset of embedded multivalued dependencies for which one
cannot obtain a complete set of inference rules by extending the rules
of [BFH], and yet our rules are complete for this subset. Our rules also
imply the rules of [BFH] for multivalued dependencies, and the known
rules for embedded multivalued dependencies [ABU,Fa1,TK1].
 
2. Basic Definitions and Results

2.1. The Relational Model for Databases
 
The relational model for databases assumes that the data is stored
in tables called relations. The columns of a table correspond to
attributes, and the rows to records or tuples. Each attribute has an
associated domain of values. It is convenient to regard a tuple as a
mapping from the attributes to their domains, since in this way no
canonical ordering of the attributes is needed. A relation scheme is a
set of attributes labeling the columns of a table. We often use the
relation scheme itself as the name of the table. A relation can be
viewed as the "current value" of a relation scheme.
 
Suppose that r is a relation defined on a relation scheme X. Let $\mu$
be a tuple of r and A an attribute in X. The tuple $\mu$ maps the
attribute A to $\mu(A)$, and $\mu(A)$ is called the A-value of $\mu$.
If Y is a subset of X, then $\mu(Y)$ is a tuple defined only on the
attributes of Y; the tuple $\mu(Y)$ maps each attribute A of Y to
$\mu(A)$. We call $\mu(Y)$ a Y-value in r and usually denote it by y.
If tuples $\mu$ and $\nu$ agree on all the attributes of the set X, then
we write $\mu(X) = \nu(X)$. The projection of the relation r onto Y is
obtained by removing the columns of r not corresponding to attributes of
Y and identifying common tuples, i.e.,

    $r(Y) = \{\mu(Y) \mid \mu \text{ is a tuple of } r\}$

We use the letters A,B,C,... to denote attributes, and the letters
...,X,Y,Z to denote sets of attributes. A string of attributes (e.g.,
ABCD) denotes the set containing these attributes, and the union of two
sets X and Y is written as XY.
 
2.2. Dependencies
 
In many cases the data must satisfy certain constraints. Functional
[Arm,Cod] and multivalued [BFH,Fa1,Zan] dependencies are the most
common types of dependencies. In this paper we consider only
multivalued dependencies. A multivalued dependency (abbr. MVD) is a
statement of the form $X \twoheadrightarrow Y$, where both X and Y are
sets of attributes. Suppose that r is a relation on a relation scheme U.
Let Z be the set of all the attributes in U that are neither in X nor in
Y. The relation r satisfies the MVD $X \twoheadrightarrow Y$ (or
$X \twoheadrightarrow Y$ holds in r) if and only if for all tuples
$\mu_1$ and $\mu_2$ in r, if $\mu_1(X) = \mu_2(X)$, then there are
tuples $\mu_3$ and $\mu_4$ in r such that

(i)  $\mu_3(X) = \mu_1(X)$, $\mu_3(Y) = \mu_1(Y)$, and $\mu_3(Z) = \mu_2(Z)$
(ii) $\mu_4(X) = \mu_2(X)$, $\mu_4(Y) = \mu_2(Y)$, and $\mu_4(Z) = \mu_1(Z)$.

In other words, $X \twoheadrightarrow Y$ means that the set of Y-values
associated with a particular X-value must be independent of the values
of the rest of the attributes.
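The definition above can be checked mechanically on a small relation. The following is a minimal sketch, assuming a relation is represented as a non-empty list of Python dicts (one dict per tuple); the function name `satisfies_mvd` and the course/teacher/book data are illustrative assumptions, not part of the original report.

```python
from itertools import product

def satisfies_mvd(rows, X, Y):
    """Check the MVD X ->> Y on a relation given as a non-empty list of dicts."""
    attrs = sorted(rows[0])                      # the relation scheme U
    both = set(X) | set(Y)
    present = {tuple(t[a] for a in attrs) for t in rows}
    for t1, t2 in product(rows, repeat=2):
        if any(t1[a] != t2[a] for a in X):
            continue                             # the two tuples disagree on X
        # Candidate mu_3: X- and Y-values of t1, Z-values of t2.
        # mu_4 is covered when the same pair is visited in the other order.
        t3 = tuple(t1[a] if a in both else t2[a] for a in attrs)
        if t3 not in present:
            return False
    return True

def rel(*triples):
    """Hypothetical helper: build tuples over (course, teacher, book)."""
    return [dict(zip(("course", "teacher", "book"), t)) for t in triples]

full = rel(("db", "ann", "codd"), ("db", "ann", "date"),
           ("db", "bob", "codd"), ("db", "bob", "date"))

print(satisfies_mvd(full, {"course"}, {"teacher"}))       # True
print(satisfies_mvd(full[:-1], {"course"}, {"teacher"}))  # False
```

Dropping the last tuple breaks the dependency because the teacher "bob" is then no longer paired with every book of the course.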
 
If we consider relations over the relation scheme U, then the MVD
$X \twoheadrightarrow Y$ is written as $X \twoheadrightarrow Y \mid Z$,
where Z = U - X - Y. Suppose that STV is a proper subset of U. The
statement $S \twoheadrightarrow T \mid V$ is called an embedded
multivalued dependency (abbr. EMVD). The EMVD
$S \twoheadrightarrow T \mid V$ holds in a relation r over U if the MVD
$S \twoheadrightarrow T \mid V$ holds in the relation r(STV), i.e., in
the projection of r onto STV. The EMVD $S \twoheadrightarrow T \mid V$
is said to be defined on STV. Note that an MVD is also an EMVD, but the
converse is not necessarily true.
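To illustrate the difference between an MVD and an EMVD, the sketch below tests an EMVD by first projecting the relation and then applying the MVD test to the projection. The representation (list of dicts), the function names, and the sample relation over A, B, C, D are assumptions made for this illustration only.

```python
from itertools import product

def project(rows, attrs):
    """r(Y): keep only the columns in attrs and merge duplicate tuples."""
    attrs = sorted(attrs)
    distinct = {tuple(t[a] for a in attrs) for t in rows}
    return [dict(zip(attrs, values)) for values in distinct]

def satisfies_mvd(rows, X, Y):
    """Check the MVD X ->> Y on a non-empty relation (list of dicts)."""
    attrs = sorted(rows[0])
    both = set(X) | set(Y)
    present = {tuple(t[a] for a in attrs) for t in rows}
    for t1, t2 in product(rows, repeat=2):
        if any(t1[a] != t2[a] for a in X):
            continue
        t3 = tuple(t1[a] if a in both else t2[a] for a in attrs)
        if t3 not in present:
            return False
    return True

def satisfies_emvd(rows, S, T, V):
    """S ->> T | V holds in r iff the MVD S ->> T holds in r(STV)."""
    return satisfies_mvd(project(rows, set(S) | set(T) | set(V)), S, T)

# Every (B, C) combination occurs, but each tuple has its own D-value.
r = [dict(A="a", B=b, C=c, D=d)
     for d, (b, c) in enumerate(product(("b1", "b2"), ("c1", "c2")))]

print(satisfies_emvd(r, {"A"}, {"B"}, {"C"}))  # True: holds in r(ABC)
print(satisfies_mvd(r, {"A"}, {"B"}))          # False: fails in r itself
```

The D column is what prevents the dependency from holding in the full relation; projecting it away makes the four tuples a full product of B-values and C-values.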
 
A dependency $\sigma$ is a consequence of a set of dependencies
$\Sigma$ (or $\Sigma$ implies $\sigma$) if for all relations r, $\sigma$
holds in r if all the dependencies of $\Sigma$ hold in r. If $\Sigma$ is
a set of MVD's (and/or functional dependencies) and $\sigma$ is an MVD
(or a functional dependency), then there are inference rules that can be
used to infer $\sigma$ from $\Sigma$ if and only if $\sigma$ is a
consequence of $\Sigma$ (i.e., the rules are complete) [Arm,BFH]. In
this case there are also efficient algorithms for deciding whether
$\Sigma$ implies $\sigma$ [Bel,BB,HIT,Sag]. The properties of EMVD's,
however, are substantially unknown. (For example, the problem of
deciding whether an EMVD $\sigma$ is implied by a set of EMVD's $\Sigma$
is not even known to be decidable.)

A dependency $\sigma$ is trivial if for all relations r, $\sigma$ holds
in r. A trivial dependency is implied by any other set of dependencies.
An EMVD $X \twoheadrightarrow Y \mid Z$ is trivial if either XY or XZ is
equal to XYZ.
 
The following is a complete set of inference rules for MVD's (U is
assumed to be the set of all the attributes).

MVD0 (Complementation):

    Let X, Y, and Z be sets of attributes such that their union
    is U and $Y \cap Z \subseteq X$. Then $X \twoheadrightarrow Y$ if
    and only if $X \twoheadrightarrow Z$.

MVD1 (Reflexivity):

    If $Y \subseteq X$, then $X \twoheadrightarrow Y$.

MVD2 (Augmentation):

    If $V \subseteq W$ and $X \twoheadrightarrow Y$, then
    $XW \twoheadrightarrow YV$.

MVD3 (Transitivity):

    If $X \twoheadrightarrow Y$ and $Y \twoheadrightarrow Z$, then
    $X \twoheadrightarrow Z - Y$.

These rules can be used to infer EMVD's from other EMVD's as long as all
the EMVD's involved are defined on the same set of attributes. (For
example, an EMVD $X \twoheadrightarrow Y \mid Z$ can be augmented with a
set W only if W is contained in XYZ.) The following is a rule for
inferring EMVD's [ABU,Fa1].

EMVD1 (Projection):

    If $X \twoheadrightarrow Y \mid Z$, $Y' \subseteq Y$ and
    $Z' \subseteq Z$, then $X \twoheadrightarrow Y' \mid Z'$.
 
3. Subset and Equivalence Dependencies
 
Let Z and X be sets of attributes and let x be an X-value in a
relation r. $Z_r(x)$ is the set of all Z-values associated with the
X-value x, i.e.,

    $Z_r(x) = \{z \mid \text{there is a tuple } \mu \text{ in } r \text{ such that } \mu(Z) = z \text{ and } \mu(X) = x\}$

Proposition 1: $Z_r(xy)$ is a subset of $Z_r(x)$ for all X-values x and
Y-values y in r.

Proof: If xy is not an XY-value in r, then $Z_r(xy)$ is empty and,
hence, $Z_r(xy) \subseteq Z_r(x)$. Suppose that xy is an XY-value in r.
For all tuples $\mu$ of r, if $\mu(XY) = xy$ then $\mu(X) = x$ and, by
definition, $Z_r(xy)$ must be a subset of $Z_r(x)$. []
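As a small illustration, the set $Z_r(x)$ can be computed directly from its definition. This is only a sketch under the assumption that a relation is a list of dicts and that an X-value is given as a dict from attributes to values; the name `z_values` and the three-tuple relation are made up for the example.

```python
def z_values(rows, Z, fixed):
    """Z_r(x): Z-values of all tuples of r that agree with the X-value `fixed`."""
    Z = sorted(Z)
    return {tuple(t[a] for a in Z)
            for t in rows
            if all(t[a] == v for a, v in fixed.items())}

# Hypothetical relation over attributes A, B, C.
r = [dict(A=0, B=0, C=0), dict(A=0, B=1, C=1), dict(A=1, B=0, C=1)]

# Proposition 1 on this relation: C_r(ab) is a subset of C_r(a).
for t in r:
    a = {"A": t["A"]}
    ab = {"A": t["A"], "B": t["B"]}
    assert z_values(r, {"C"}, ab) <= z_values(r, {"C"}, a)

print(z_values(r, {"C"}, {"A": 0}))
print(z_values(r, {"C"}, {"A": 0, "B": 1}))
```

A value combination that does not occur in r yields the empty set, which matches the first case of the proof.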
 
Lemma 2: If $Z_r(x) \subseteq Z_r(y)$ for all XY-values xy in a relation
r, and $Z_r(y) \subseteq Z_r(w)$ for all YW-values yw in r, then
$Z_r(x) \subseteq Z_r(w)$ for all XW-values xw in r.

Proof: Suppose that $\mu$ is a tuple in r such that $\mu(XW) = xw$. Let
$\mu(Y) = y$. Since xy is an XY-value in r, $Z_r(x) \subseteq Z_r(y)$.
Similarly, $Z_r(y) \subseteq Z_r(w)$. Hence, $Z_r(x) \subseteq Z_r(w)$
for all XW-values xw in r. []
 
Lemma 3: The EMVD $X \twoheadrightarrow Y \mid Z$ holds in a relation r
if and only if $Z_r(x) = Z_r(xy)$ for all XY-values xy in r.

Proof: [Fa1] Only if. By Proposition 1, $Z_r(xy) \subseteq Z_r(x)$ and,
therefore, it suffices to prove that $Z_r(x) \subseteq Z_r(xy)$. Let
$z \in Z_r(x)$, i.e., there is a tuple $\mu_1$ in r such that
$\mu_1(X) = x$ and $\mu_1(Z) = z$. Let $\mu_1(Y) = y'$. Since xy is an
XY-value in r, there is a tuple $\mu_2$ in r such that $\mu_2(X) = x$
and $\mu_2(Y) = y$. Let $\mu_2(Z) = z'$. By definition of EMVD's, there
is a tuple $\mu_3$ in r such that $\mu_3(X) = x$, $\mu_3(Y) = y$, and
$\mu_3(Z) = z$. Hence, $z \in Z_r(xy)$.

If. Let $\mu_1$ and $\mu_2$ be tuples of r such that

(1) $\mu_1(X) = x$, $\mu_1(Y) = y$, and $\mu_1(Z) = z$

(2) $\mu_2(X) = x$, $\mu_2(Y) = y'$, and $\mu_2(Z) = z'$

Obviously, both z and z' are in $Z_r(x)$. Since
$Z_r(x) \subseteq Z_r(xy)$, z and z' are also in $Z_r(xy)$. Therefore,
there must be a tuple $\mu_3$ in r such that

    $\mu_3(X) = x$, $\mu_3(Y) = y$, and $\mu_3(Z) = z'$

Similarly, since $Z_r(x) \subseteq Z_r(xy')$, there is a tuple $\mu_4$
in r such that

    $\mu_4(X) = x$, $\mu_4(Y) = y'$, and $\mu_4(Z) = z$

Thus, $X \twoheadrightarrow Y \mid Z$ holds in r. []
 
A subset dependency (abbr. SD) is a statement of the form
$Z(X) \subseteq Z(Y)$, where X, Y, and Z are sets of attributes and both
X and Y are disjoint from Z. The SD $Z(X) \subseteq Z(Y)$ holds in a
relation r if and only if $Z_r(x) \subseteq Z_r(y)$ for all XY-values xy
in r.

An equivalence dependency (abbr. ED) is a statement of the form
$Z(X) = Z(Y)$, where X, Y, and Z are sets of attributes and both X and Y
are disjoint from Z. The dependency $Z(X) = Z(Y)$ holds in a relation r
if and only if $Z_r(x) = Z_r(y)$ for all XY-values xy in r.
 
Example 1: Consider the relation of Figure 1. The ED $Z(X) = Z(Y)$
holds in this relation. Note that the EMVD
$X \twoheadrightarrow Y \mid Z$ does not hold in this relation. []
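The two definitions can be tried out on a concrete relation. The sketch below is illustrative only: the four-tuple relation is a hypothetical one chosen so that the ED holds while the EMVD fails, and it is not claimed to reproduce Figure 1. The EMVD test uses Lemma 3 together with Proposition 1, i.e. it checks $Z_r(x) \subseteq Z_r(xy)$ for every XY-value in r. Function names are assumptions.

```python
def z_values(rows, Z, fixed):
    """Z_r(x): Z-values of all tuples that agree with the value `fixed`."""
    Z = sorted(Z)
    return {tuple(t[a] for a in Z) for t in rows
            if all(t[a] == v for a, v in fixed.items())}

def holds_sd(rows, Z, X, Y):
    """SD Z(X) <= Z(Y): Z_r(x) is a subset of Z_r(y) for every XY-value in r."""
    for t in rows:
        x = {a: t[a] for a in X}
        y = {a: t[a] for a in Y}
        if not z_values(rows, Z, x) <= z_values(rows, Z, y):
            return False
    return True

def holds_ed(rows, Z, X, Y):
    """ED Z(X) = Z(Y): both subset dependencies hold."""
    return holds_sd(rows, Z, X, Y) and holds_sd(rows, Z, Y, X)

# Hypothetical relation over X, Y, Z (an assumption for this example).
r = [dict(X="x",  Y="y",  Z="z"),
     dict(X="x",  Y="y'", Z="z'"),
     dict(X="x'", Y="y",  Z="z'"),
     dict(X="x'", Y="y'", Z="z")]

print(holds_ed(r, {"Z"}, {"X"}, {"Y"}))        # True
print(holds_sd(r, {"Z"}, {"X"}, {"X", "Y"}))   # False: the EMVD X ->> Y | Z fails
```

Every X-value and every Y-value is associated with both Z-values, while each XY-value is associated with only one of them.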
 
Proposition 4: $Z(X) = Z(Y)$ holds in a relation r if and only if
$Z(X) \subseteq Z(Y)$ and $Z(Y) \subseteq Z(X)$ hold in r.

Proof: Immediate from the definitions. []
 
Lemma 5: If $W \subseteq V$, then $Z(V) \subseteq Z(W)$ is a trivial SD.

Proof: Let wv be a WV-value in a relation r. By Proposition 1,
$Z_r(v) \subseteq Z_r(w)$. Thus, $Z(V) \subseteq Z(W)$ holds in r. []
 
    X     Y     Z

    x     y     z
    x     y'    z
    x'    y     z
    x'    y'    z

    Figure 1
 
Lemma 6: If $Z(X) \subseteq Z(Y)$ and $Z(Y) \subseteq Z(W)$ hold in a
relation r, then $Z(X) \subseteq Z(W)$ also holds in r.

Proof: Follows directly from the definitions and Lemma 2. []

Lemma 7: The EMVD $X \twoheadrightarrow Y \mid Z$ holds in a relation r
if and only if $Z(X) = Z(XY)$ holds in the relation r.

Proof: Follows from the definitions and Lemma 3. []

Corollary 8: The EMVD $X \twoheadrightarrow Y \mid Z$ holds in a
relation r if and only if $Z(X) \subseteq Z(XY)$ holds in the relation r.

Proof: Follows from Lemma 7, Lemma 5, and Proposition 4. []

Corollary 8 implies that a set of EMVD's is equivalent to a set of
SD's. Consequently, from now on EMVD's are treated as SD's.
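The equivalence stated in Corollary 8 can be spot-checked empirically. The following sketch compares the definitional EMVD test (project, then test the MVD) with the SD test $Z(X) \subseteq Z(XY)$ on random relations over {0,1}. The attribute names, the sampling scheme, and all function names are assumptions of this illustration; agreement on random samples is of course not a proof.

```python
import random
from itertools import product

def project(rows, attrs):
    attrs = sorted(attrs)
    distinct = {tuple(t[a] for a in attrs) for t in rows}
    return [dict(zip(attrs, values)) for values in distinct]

def satisfies_mvd(rows, X, Y):
    """MVD X ->> Y by the tuple-based definition (rows must be non-empty)."""
    attrs = sorted(rows[0])
    both = set(X) | set(Y)
    present = {tuple(t[a] for a in attrs) for t in rows}
    for t1, t2 in product(rows, repeat=2):
        if any(t1[a] != t2[a] for a in X):
            continue
        if tuple(t1[a] if a in both else t2[a] for a in attrs) not in present:
            return False
    return True

def z_values(rows, Z, fixed):
    Z = sorted(Z)
    return {tuple(t[a] for a in Z) for t in rows
            if all(t[a] == v for a, v in fixed.items())}

def holds_sd(rows, Z, X, Y):
    """SD Z(X) <= Z(Y), checked on every XY-value occurring in r."""
    return all(z_values(rows, Z, {a: t[a] for a in X})
               <= z_values(rows, Z, {a: t[a] for a in Y}) for t in rows)

random.seed(0)
universe = [dict(zip("ABCD", bits)) for bits in product((0, 1), repeat=4)]
agree = True
for _ in range(300):
    r = random.sample(universe, random.randint(1, 8))
    by_definition = satisfies_mvd(project(r, "ABC"), "A", "B")  # A ->> B | C
    by_corollary_8 = holds_sd(r, "C", "A", "AB")                # C(A) <= C(AB)
    agree = agree and (by_definition == by_corollary_8)
print(agree)  # True
```

Single-letter attribute names are used so that a string such as "AB" can stand for the attribute set {A, B}, following the paper's convention.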
 
4. Z-Graphs
 
An SD of the form $Z(X) \subseteq Z(Y)$, where Z is a fixed set of
attributes, is called a Z subset dependency (abbr. ZSD). In this section
we consider a set $\Sigma$ of ZSD's (i.e., a set of SD's with a fixed
Z). In particular, note that a set of EMVD's of the form
$X \twoheadrightarrow Y \mid Z$, for a fixed Z, is a set of ZSD's. The
kernel of $\Sigma$, written KER($\Sigma$), is the set

    $\{X \mid \text{there is a } Y \text{ such that either } Z(X) \subseteq Z(Y) \text{ or } Z(Y) \subseteq Z(X) \text{ is in } \Sigma\}$

A Z-graph for $\Sigma$ is a directed graph defined as follows. The nodes
of a Z-graph correspond to sets of attributes that are disjoint from Z.
A node corresponding to a set of attributes X is denoted by [X]. A
Z-graph has a node for each set in KER($\Sigma$) and possibly additional
nodes that correspond to other sets. The following rules imply all the
(directed) edges of a Z-graph $G_\Sigma$.

Rule 1: If [X] and [Y] are nodes of $G_\Sigma$ and $X \subseteq Y$, then
there is an edge from [X] to [Y].

Rule 2: For each ZSD $Z(X) \subseteq Z(Y)$ in $\Sigma$, there is an edge
in $G_\Sigma$ from [Y] to [X].

The minimal Z-graph for $\Sigma$ is the Z-graph containing only nodes
that correspond to sets of KER($\Sigma$). By reflexivity (Lemma 5) and
transitivity (Lemma 6) of SD's, we obtain the following lemma.

Lemma 9: If there is a directed path in a Z-graph $G_\Sigma$ from [Y] to
[X], then $Z(X) \subseteq Z(Y)$ is a consequence of $\Sigma$.

We now prove that the converse of Lemma 9 is also true.
 
Lemma 10: If there is no path in a Z-graph $G_\Sigma$ from a node [Y] to
a node [X], then there is a relation r in which all the dependencies of
$\Sigma$ hold, but $Z(X) \subseteq Z(Y)$ fails.

Proof: We construct a relation r over the domain {0,1} as follows.
Consider the set

    $CON(X) = \{W \mid \text{there is a path from } [W] \text{ to } [X] \text{ in } G_\Sigma\}$

For all sets W in CON(X), the relation r has a pair of tuples as
follows. Both tuples map all the attributes of W to 0, and all the
attributes that are neither in W nor in Z to 1. One tuple in the pair
maps all the attributes of Z to 0, and the other tuple maps all the
attributes of Z to 1. The relation r has one more tuple, denoted by
$\mu$, that maps all the attributes to 0.

Conventionally, the X-value of a set of attributes X consisting only of
0's is denoted by $x_0$; the X-value consisting only of 1's is denoted
by $x_1$.

Claim 1: Let V be disjoint from Z. If v is a V-value occurring in a
tuple of r other than $\mu$, then $Z_r(v) = \{z_0, z_1\}$. If v occurs
in $\mu$ (i.e., $v = v_0$), then $Z_r(v)$ contains $z_0$.

Claim 1 follows immediately from the construction of r.
 
Let V be disjoint from Z. Suppose that $Z_r(v_0) = \{z_0, z_1\}$. (Note
that $v_0$ occurs in $\mu$.) Thus, there is a tuple $\nu$ in r
corresponding to a set W of CON(X) with a V-value $v_0$ and a Z-value
$z_1$. It follows that V must be a subset of W, because W contains all
the attributes (except those in Z) that are mapped to 0 by $\nu$. But in
this case there is an edge from [V] to [W] and, hence, there is a path
from [V] to [X] (because there is a path from [W] to [X]). Thus, we have
proved the following claim.

Claim 2: Let V be disjoint from Z. If $Z_r(v_0) = \{z_0, z_1\}$, then V
is in CON(X) (i.e., there is a path from [V] to [X]).

We now show that all the ZSD's of $\Sigma$ hold in r but
$Z(X) \subseteq Z(Y)$ fails in r. In proof, $Z_r(x_0) = \{z_0, z_1\}$,
because X is in CON(X). Since there is no path from [Y] to [X],
$Z_r(y_0) = \{z_0\}$ by Claim 2. Thus $Z(X) \subseteq Z(Y)$ fails in r.

Let $Z(W) \subseteq Z(V)$ be any ZSD in $\Sigma$. By Claim 1, in order
to prove that $Z(W) \subseteq Z(V)$ holds in r, it is sufficient to show
that $Z_r(w_0) \subseteq Z_r(v_0)$ for the WV-value $w_0 v_0$ occurring
in $\mu$. By Claim 1, if $Z_r(w_0) = \{z_0\}$ we are done. So suppose
that $Z_r(w_0) = \{z_0, z_1\}$. By Claim 2, there is a path from [W] to
[X] and, hence, there is a path from [V] to [X] (because
$Z(W) \subseteq Z(V)$ implies an edge from [V] to [W]). Hence,
$Z_r(v_0) = \{z_0, z_1\}$. This completes the proof. []
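The construction in the proof of Lemma 10 is explicit enough to be carried out by a short program. The sketch below builds the relation for a tiny example and verifies the two properties the proof establishes. It assumes single-letter attribute names, represents each ZSD $Z(X) \subseteq Z(Y)$ as a pair (X, Y), and uses hypothetical function names (`z_graph`, `con`, `counterexample`); it also assumes that the caller has already ensured that no path from [Y] to [X] exists.

```python
def z_graph(sds, extra_nodes=()):
    """Edges of a Z-graph; each SD is a pair (X, Y) meaning Z(X) <= Z(Y)."""
    nodes = {frozenset(s) for pair in sds for s in pair}
    nodes |= {frozenset(s) for s in extra_nodes}
    edges = {n: {m for m in nodes if n < m} for n in nodes}   # Rule 1
    for X, Y in sds:
        edges[frozenset(Y)].add(frozenset(X))                 # Rule 2
    return edges

def con(edges, target):
    """CON(X): nodes W with a (possibly empty) path from [W] to [target]."""
    found, changed = {target}, True
    while changed:
        changed = False
        for n, succs in edges.items():
            if n not in found and succs & found:
                found.add(n)
                changed = True
    return found

def z_values(rows, Z, fixed):
    Z = sorted(Z)
    return {tuple(t[a] for a in Z) for t in rows
            if all(t[a] == v for a, v in fixed.items())}

def holds_sd(rows, Z, X, Y):
    return all(z_values(rows, Z, {a: t[a] for a in X})
               <= z_values(rows, Z, {a: t[a] for a in Y}) for t in rows)

def counterexample(sds, Z, X, Y):
    """Relation of Lemma 10; assumes there is no path from [Y] to [X]."""
    edges = z_graph(sds, [X, Y])
    others = set().union(*edges)                     # attributes outside Z
    rows = [{a: 0 for a in others | set(Z)}]         # the all-zero tuple mu
    for W in con(edges, frozenset(X)):
        for zbit in (0, 1):                          # one pair of tuples per W
            row = {a: (0 if a in W else 1) for a in others}
            row.update({a: zbit for a in Z})
            rows.append(row)
    return rows

sigma = [("A", "AB")]                                # the EMVD A ->> B | Z
r = counterexample(sigma, "Z", "B", "AB")            # target: Z(B) <= Z(AB)
print(all(holds_sd(r, "Z", X, Y) for X, Y in sigma)) # True
print(holds_sd(r, "Z", "B", "AB"))                   # False
```

In this example CON(B) contains only B itself, so r has three tuples: the all-zero tuple and the pair generated for W = B.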
 
Lemma 9 and Lemma 10 provide a method for deciding whether a set of
ZSD's $\Sigma$ implies another ZSD $Z(X) \subseteq Z(Y)$. In order to do
so, construct a Z-graph $G_\Sigma$ with nodes corresponding to X and Y,
and check whether there is a path from [Y] to [X].
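The decision method just described amounts to building the graph from Rule 1 and Rule 2 and running a reachability search. A minimal sketch follows, assuming single-letter attribute names (so that a string such as "AB" denotes the set {A, B}) and a ZSD $Z(X) \subseteq Z(Y)$ encoded as the pair (X, Y); the function names are illustrative and no attempt is made to match the complexity bound discussed below.

```python
from collections import deque

def z_graph(sds, extra_nodes=()):
    """Build a Z-graph as an adjacency dict over frozensets of attributes."""
    nodes = {frozenset(s) for pair in sds for s in pair}
    nodes |= {frozenset(s) for s in extra_nodes}
    # Rule 1: a proper subset relation X < Y gives an edge [X] -> [Y].
    edges = {n: {m for m in nodes if n < m} for n in nodes}
    # Rule 2: each ZSD Z(X) <= Z(Y) gives an edge [Y] -> [X].
    for X, Y in sds:
        edges[frozenset(Y)].add(frozenset(X))
    return edges

def has_path(edges, src, dst):
    """Breadth-first search for a directed path from src to dst."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in edges[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

def implies(sds, X, Y):
    """Decide whether the ZSD's in `sds` imply Z(X) <= Z(Y)."""
    edges = z_graph(sds, [X, Y])
    return has_path(edges, frozenset(Y), frozenset(X))

sigma = [("A", "AB"), ("AB", "ABC")]   # A ->> B | Z and AB ->> C | Z
print(implies(sigma, "A", "ABC"))      # True:  A ->> BC | Z
print(implies(sigma, "A", "AC"))       # True:  A ->> C | Z
print(implies(sigma, "B", "A"))        # False
```

The second query succeeds because Rule 1 supplies the edge from [AC] to [ABC], after which the two Rule 2 edges lead down to [A].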
 
Theorem 11: Testing whether a ZSD $Z(X) \subseteq Z(Y)$ is a consequence
of a set of ZSD's $\Sigma$ can be done in $O(n^2)$ time, where n is the
size of the input.

Proof: Assuming that the attributes in the input are represented by the
numbers 1,...,k, a Z-graph containing nodes for X and Y can be
constructed in $O(n^2)$ time. Testing whether there is a path from [Y]
to [X] requires only linear time (in the size of the graph). []
 
A Z embedded multivalued dependency (abbr. Z-EMVD) is an EMVD of the
form $X \twoheadrightarrow Y \mid Z$, where Z is a fixed set of
attributes.

Corollary 12: Testing whether a Z-EMVD $X \twoheadrightarrow Y \mid Z$
is a consequence of a set of Z-EMVD's $\Sigma$ can be done in $O(n^2)$
time, where n is the size of the input.
 
 5. ZSD's and EMVD's 
 
In this section we investigate the EMVD's implied by a set of ZSD's
$\Sigma$. Let $MG_\Sigma$ be the minimal Z-graph for $\Sigma$. The
Z-EMVD cover of $\Sigma$, written Z-EMVD$^c(\Sigma)$, is the set

    $\{X \twoheadrightarrow Y \mid Z \;:\; \text{there is a path from } [XY] \text{ to } [X] \text{ in } MG_\Sigma\}$

We will show that an EMVD $\tau$ is implied by $\Sigma$ only if there is
a Z-EMVD $\sigma$ in Z-EMVD$^c(\Sigma)$ such that $\tau$ is obtained
from $\sigma$ by augmentation and projection.
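For a small $\Sigma$ the cover can be enumerated directly from the minimal Z-graph. In the sketch below each member $X \twoheadrightarrow Y \mid Z$ of the cover is reported as a pair (X, Y) with Y taken disjoint from X, and trivial members with empty Y are omitted; these choices, the single-letter attribute names, and the function names are assumptions made for the illustration.

```python
from collections import deque

def z_graph(sds):
    """Minimal Z-graph: nodes are exactly the sets mentioned in the ZSD's."""
    nodes = {frozenset(s) for pair in sds for s in pair}
    edges = {n: {m for m in nodes if n < m} for n in nodes}   # Rule 1
    for X, Y in sds:
        edges[frozenset(Y)].add(frozenset(X))                 # Rule 2
    return edges

def has_path(edges, src, dst):
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in edges[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

def cover(sds):
    """Nontrivial members X ->> Y | Z of the Z-EMVD cover, as pairs (X, Y)."""
    edges = z_graph(sds)
    return {(X, XY - X)
            for XY in edges for X in edges
            if X < XY and has_path(edges, XY, X)}

sigma = [("A", "AB"), ("AB", "ABC")]   # A ->> B | Z and AB ->> C | Z
for X, Y in sorted(cover(sigma), key=lambda p: (sorted(p[0]), sorted(p[1]))):
    print("".join(sorted(X)), "->>", "".join(sorted(Y)), "| Z")
```

Besides the two given dependencies, the cover of this example also contains A ->> BC | Z, obtained from the path [ABC], [AB], [A].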
 
Lemma 13: If a Z-EMVD $X \twoheadrightarrow Y \mid Z$ is a consequence
of a set of ZSD's $\Sigma$, then $X \twoheadrightarrow Y \mid Z$ can be
obtained from a Z-EMVD in Z-EMVD$^c(\Sigma)$ by augmentation and
projection.

Proof: If both X and XY are in KER($\Sigma$), then
$X \twoheadrightarrow Y \mid Z$ is in Z-EMVD$^c(\Sigma)$ and we are
done. Assume that neither X nor XY is in KER($\Sigma$) (the other two
cases, in which either X or XY is in KER($\Sigma$), are proved
similarly). Let $G_\Sigma$ be a Z-graph in which all the nodes
correspond to members of the set KER($\Sigma$) $\cup$ {X, XY}. Since
$X \twoheadrightarrow Y \mid Z$ is a consequence of $\Sigma$, there is a
path in $G_\Sigma$ from [XY] to [X]. Let the first edge in this path be
from [XY] to [S], and the last edge be from [T] to [X]. An edge from
[XY] to [S] can exist only if $XY \subseteq S$. Similarly,
$T \subseteq X$ and, hence, $T \subseteq S$. Let S be written as TS',
where S' is disjoint from T. Thus, Z-EMVD$^c(\Sigma)$ contains the
Z-EMVD $T \twoheadrightarrow S' \mid Z$. It is easy to show that
$X \twoheadrightarrow Y \mid Z$ follows from
$T \twoheadrightarrow S' \mid Z$ by augmentation and projection. []
 
Lemma 14: If $W \twoheadrightarrow V \mid Y$ is a nontrivial EMVD
implied by a set $\Sigma$ of ZSD's, then either $V \subseteq Z$ or
$Y \subseteq Z$. (It is assumed that W, V, and Y are pairwise disjoint.)

Proof: Construct a relation r over {0,1} with two tuples that agree
exactly on the attributes of Z and W. Let z be the Z-value of the two
tuples in r. Obviously, for every X-value x in r, $Z_r(x) = \{z\}$. Thus
all the ZSD's of $\Sigma$ hold in r and, hence,
$W \twoheadrightarrow V \mid Y$ holds in r. By Lemma 3 in [SaF], either
$V \subseteq Z$ or $Y \subseteq Z$. []
 
Suppose that $\sigma$ is a nontrivial EMVD implied by a set of ZSD's
$\Sigma$. By Lemma 14, $\sigma$ can be written as
$W' \twoheadrightarrow V' \mid Z'$, where $Z' \subseteq Z$. (It is
assumed that W', V', and Z' are pairwise disjoint.) We now prove the
following lemma.
 
Lemma 15: There exists a Z-EMVD $W \twoheadrightarrow V \mid Z$ implied
by $\Sigma$ such that $W' \twoheadrightarrow V' \mid Z'$ can be obtained
from $W \twoheadrightarrow V \mid Z$ by augmentation and projection.

Proof: We use the same method as in the proof of Lemma 10. In that
proof we built a relation r having two Z-values, $z_0$ and $z_1$, that
disagreed on all the columns of Z. It is sufficient, however, that $z_0$
and $z_1$ would not be the same. Thus $z_1$ is replaced with $\bar{z}$,
where $\bar{z}$ has 0's exactly in the columns of $W' \cap Z$ and 1's in
all the other columns of Z. Since $W' \twoheadrightarrow V' \mid Z'$ is
a nontrivial EMVD, Z' - W' is nonempty (i.e., some columns of $\bar{z}$
are indeed 1 and $\bar{z}$ is different from $z_0$).

Let W = W' - Z, and let $G_\Sigma$ be a Z-graph for $\Sigma$ containing
the nodes [W] and [WV']. Construct a relation r as in the proof of Lemma
10 using the Z-values $z_0$ and $\bar{z}$ (instead of $z_1$), and the
set CON(W). Recall that $\mu$ is the tuple of r that maps all the
attributes to 0. Since W is in CON(W), the relation r has a tuple $\nu$
such that $\nu$ maps all the attributes of W to 0, all the attributes of
Z to $\bar{z}$, and all the other attributes to 1. Note that the tuples
$\mu$ and $\nu$ agree exactly on the columns of W'.

All the ZSD's of $\Sigma$ hold in r and, hence,
$W' \twoheadrightarrow V' \mid Z'$ also holds in r. Therefore, there is
a tuple $\tau$ in r such that $\tau(W') = \mu(W')$,
$\tau(V') = \mu(V')$, and $\tau(Z') = \nu(Z')$. The Z-value of $\tau$
must be $\bar{z}$, because $\nu$ maps some attributes of Z' to 1.
Therefore, $\tau$ and $\nu$ agree on all the columns of Z. Since they
should disagree on all the columns of V', it follows that V' is disjoint
from Z.

By the construction of r, WV' must be in CON(W), because WV' contains
all the columns (except those in Z) in which $\tau$ has 0's. Hence,
there is a path in $G_\Sigma$ from [WV'] to [W]. Therefore,
$W \twoheadrightarrow V' \mid Z$ is a consequence of $\Sigma$.
Obviously, $W' \twoheadrightarrow V' \mid Z'$ can be obtained from
$W \twoheadrightarrow V' \mid Z$ by augmentation and projection. []
 
6. The Nonextendibility of the MVD Inference Rules to EMVD's

In this section we show that for any positive integer n, one can find a
set $\Sigma$ of n EMVD's that implies another EMVD $\sigma$, but any n-1
EMVD's of $\Sigma$ imply only those EMVD's that can be obtained by
projection and augmentation. This result indicates that the inference
rules of [BFH] for MVD's cannot be extended in any meaningful way to a
set of inference rules for EMVD's.
 
Given a positive integer n, let $X_0, X_1, \ldots, X_{n-1}, Z$ be
pairwise disjoint sets of attributes. Let $\Sigma$ consist of the
following Z-EMVD's.

    $X_0 \twoheadrightarrow X_1 \mid Z$
    $X_1 \twoheadrightarrow X_2 \mid Z$
    ...
    $X_{n-2} \twoheadrightarrow X_{n-1} \mid Z$
    $X_{n-1} \twoheadrightarrow X_0 \mid Z$

That is, $\Sigma$ contains the Z-EMVD
$X_i \twoheadrightarrow X_{i+1} \mid Z$ for all $0 \le i \le n-2$, and
the Z-EMVD $X_{n-1} \twoheadrightarrow X_0 \mid Z$. It is convenient to
assume that addition and subtraction of indices is done modulo n. For
example, $X_n$ is $X_0$ and $X_{-1}$ is $X_{n-1}$.
 
Lemma 16: $X_0 \twoheadrightarrow X_{n-1} \mid Z$ is a consequence of
$\Sigma$.

Proof: To prove Lemma 16, construct the minimal Z-graph $MG_\Sigma$ for
$\Sigma$. The graph $MG_\Sigma$ has the following nodes:

(1) $[X_i]$ for $0 \le i \le n-1$, and

(2) $[X_i X_{i+1}]$ for $0 \le i \le n-1$.

The edges of $MG_\Sigma$ can be classified in the following groups:

(1) for $0 \le i \le n-1$, an edge from $[X_i]$ to $[X_i X_{i+1}]$, and

(2) for $0 \le i \le n-1$, an edge from $[X_{i+1}]$ to $[X_i X_{i+1}]$.

The edges in the above groups are implied by Rule 1. The following
group of edges is implied by Rule 2.

(3) for $0 \le i \le n-1$, an edge from $[X_i X_{i+1}]$ to $[X_i]$.

Figure 2 describes the graph $MG_\Sigma$ for n=4. The edges in group (3)
are denoted by broken lines. It is easy to see that there is a path from
$[X_{n-1} X_0]$ to $[X_0]$. Hence, $X_0 \twoheadrightarrow X_{n-1} \mid Z$
is a consequence of $\Sigma$. []
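The path used in this proof can be found mechanically for concrete values of n. The sketch below models each set $X_i$ by the single attribute i (an assumption that keeps the sets pairwise disjoint), encodes $X_i \twoheadrightarrow X_{i+1} \mid Z$ as the ZSD $Z(X_i) \subseteq Z(X_i X_{i+1})$, and searches the Z-graph for a path from $[X_{n-1} X_0]$ to $[X_0]$. All function names are illustrative.

```python
from collections import deque

def z_graph(sds, extra_nodes=()):
    nodes = {frozenset(s) for pair in sds for s in pair}
    nodes |= {frozenset(s) for s in extra_nodes}
    edges = {n: {m for m in nodes if n < m} for n in nodes}   # Rule 1
    for X, Y in sds:
        edges[frozenset(Y)].add(frozenset(X))                 # Rule 2
    return edges

def has_path(edges, src, dst):
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in edges[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

def implies(sds, X, Y):
    """Do the ZSD's imply Z(X) <= Z(Y)?  (path from [Y] to [X])"""
    return has_path(z_graph(sds, [X, Y]), frozenset(Y), frozenset(X))

def cyclic_sigma(n):
    """ZSD's for X_i ->> X_{i+1} | Z, indices modulo n, X_i modeled as {i}."""
    return [({i}, {i, (i + 1) % n}) for i in range(n)]

for n in range(3, 8):
    # X_0 ->> X_{n-1} | Z corresponds to Z(X_0) <= Z(X_0 X_{n-1}).
    print(n, implies(cyclic_sigma(n), {0}, {0, n - 1}))   # True for each n
```

The path found alternates between group (3) edges and Rule 1 edges, walking once around the cycle from $[X_{n-1} X_0]$ down to $[X_0]$.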
 
Lemma 17: Let $\Sigma'$ be a set of n-1 dependencies from $\Sigma$. If
$\sigma'$ is implied by $\Sigma'$, then there is a Z-EMVD $\sigma$ in
$\Sigma'$ such that $\sigma'$ is obtained from $\sigma$ by augmentation
and projection.

Proof: Consider the graph $MG_\Sigma$. Obviously, a path in $MG_\Sigma$
that corresponds to a Z-EMVD implied by $\Sigma$ must start in a node
$[X_i X_{i+1}]$ (for some $0 \le i \le n-1$) and terminate in either
$[X_i]$ or $[X_{i+1}]$. For all i ($0 \le i \le n-1$), there is an edge
from $[X_i X_{i+1}]$ to $[X_i]$, because
$X_i \twoheadrightarrow X_{i+1} \mid Z$ is in $\Sigma$. It is easy to
see that there is a path from $[X_i X_{i+1}]$ to $[X_{i+1}]$ for all i.
That is, $X_{i+1} \twoheadrightarrow X_i \mid Z$ is implied by $\Sigma$
($0 \le i \le n-1$).

Figure 2
 
Claim 3: For all $0 \le i \le n-1$, every path from $[X_i X_{i+1}]$ to
$[X_{i+1}]$ uses all the edges in group (3) (i.e., all the edges implied
by the EMVD's in $\Sigma$).

Proof: Suppose not. Among all the paths from $[X_i X_{i+1}]$ to
$[X_{i+1}]$ that do not use all the edges in group (3), consider a path
p that has a minimum number of edges. For all nodes $[X_i]$
($0 \le i \le n-1$), there is only one edge directed to $[X_i]$ and this
edge is in group (3). Since p does not use all the edges in group (3),
there is a j ($0 \le j \le n-1$) such that p visits $[X_{j+1}]$ but does
not visit $[X_j]$ (because $[X_{i+1}]$ is in the path, and some $[X_k]$
is not in the path).

Case 1: Suppose that j=i. The only edge out of $[X_i X_{i+1}]$ is
directed toward $[X_i]$. If $[X_i]$ is not visited, then p cannot be a
path from $[X_i X_{i+1}]$ to $[X_{i+1}]$.

Case 2: $j \ne i$. Note that p has a minimum number of edges and, hence,
no node is visited more than once. Since $j \ne i$, $[X_{j+1}]$ cannot
be the last node in p. There are two edges out of $[X_{j+1}]$; one edge
is directed to $[X_j X_{j+1}]$ and the other edge is directed to
$[X_{j+1} X_{j+2}]$. The only edge out of $[X_j X_{j+1}]$ is to
$[X_j]$. The path p does not use this edge, and so p cannot visit
$[X_j X_{j+1}]$. Thus, after entering $[X_{j+1}]$, p visits
$[X_{j+1} X_{j+2}]$. But the only way to move into $[X_{j+1}]$ is from
$[X_{j+1} X_{j+2}]$. Consequently, $[X_{j+1} X_{j+2}]$ is visited twice.
This contradiction completes the proof of the claim.
 
Now suppose that $X_k \twoheadrightarrow X_{k+1} \mid Z$ (for some
$0 \le k \le n-1$) is the Z-EMVD in $\Sigma$ but not in $\Sigma'$. A
Z-graph for $\Sigma'$ is obtained from $MG_\Sigma$ by removing the edge
from $[X_k X_{k+1}]$ to $[X_k]$. Now no edge is directed toward $[X_k]$
and, therefore, $X_k \twoheadrightarrow X_{k-1} \mid Z$ cannot be a
consequence of $\Sigma'$. By Claim 3, none of the Z-EMVD's
$X_{i+1} \twoheadrightarrow X_i \mid Z$ ($0 \le i \le n-1$) is in
Z-EMVD$^c(\Sigma')$. Therefore, the only Z-EMVD's in Z-EMVD$^c(\Sigma')$
are those in $\Sigma'$. Thus, the lemma follows from Lemma 13 and
Lemma 15. []
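The effect of dropping a single dependency, as argued above via Claim 3, can be observed on small instances. The following sketch again models $X_i$ by the single attribute i (an assumption), and for n = 3,...,6 checks that all the "reverse" Z-EMVD's $X_{i+1} \twoheadrightarrow X_i \mid Z$ follow from the full set $\Sigma$, while none of them follows once any one dependency is removed. The function names are illustrative, and the check covers only these small values of n.

```python
from collections import deque

def z_graph(sds, extra_nodes=()):
    nodes = {frozenset(s) for pair in sds for s in pair}
    nodes |= {frozenset(s) for s in extra_nodes}
    edges = {n: {m for m in nodes if n < m} for n in nodes}   # Rule 1
    for X, Y in sds:
        edges[frozenset(Y)].add(frozenset(X))                 # Rule 2
    return edges

def has_path(edges, src, dst):
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in edges[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

def implies(sds, X, Y):
    return has_path(z_graph(sds, [X, Y]), frozenset(Y), frozenset(X))

def cyclic_sigma(n):
    """ZSD's for X_i ->> X_{i+1} | Z, indices modulo n, X_i modeled as {i}."""
    return [({i}, {i, (i + 1) % n}) for i in range(n)]

def reverse_emvds_implied(sds, n):
    """Indices i for which X_{i+1} ->> X_i | Z follows from `sds`."""
    return [i for i in range(n)
            if implies(sds, {(i + 1) % n}, {i, (i + 1) % n})]

for n in range(3, 7):
    sigma = cyclic_sigma(n)
    assert reverse_emvds_implied(sigma, n) == list(range(n))
    for k in range(n):
        sigma_prime = sigma[:k] + sigma[k + 1:]      # drop the k-th EMVD
        assert reverse_emvds_implied(sigma_prime, n) == []
print("checked n = 3..6")
```

The case n = 2 is deliberately left out of the loop: with single attributes the nodes $[X_0 X_1]$ and $[X_1 X_0]$ coincide, so it does not exhibit the cycle structure used in the proof.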
 
7. Inference Rules for Subset Dependencies
 
 Although the inference rules of [BFH] cannot be extended to a set 
 of inference rules for EMVD's or even for Z-EMVD's, it might be possible 
 to find a complete set of inference rules for SD's. In fact it should 
 be clear from the results obtained so far that the reflexivity rule 
 (Lemma 5) and the transitivity rule (Lemma 6) for SD's along with aug- 
 mentation and projection are complete for ZSD's and, hence, for Z- 
 EMVD's. In this section we give a set of inference rules for SD's. We 
 have no proof of completeness for these rules. However, these rules 
 imply the inference rules for MVD's, and the rule of Projection for 
 EMVD's. Furthermore, these rules are complete for ZSD's and Z-EMVD's. 
 
 Following is a set of inference rules for SD's. (Note that Z is no 
 longer assumed to be a fixed set of attributes.) 
SD1 (Reflexivity):

    $Z(X) \subseteq Z(Y)$ for all $Y \subseteq X$.

SD2 (Augmentation):

    If $ZW(X) \subseteq ZW(Y)$, then $Z(WX) \subseteq Z(WY)$.

SD3 (Transitivity):

    If $Z(X) \subseteq Z(Y)$ and $Z(Y) \subseteq Z(W)$, then
    $Z(X) \subseteq Z(W)$.

SD4 (Complementation):

    Let X be the intersection of VX and XY.
    If $Z(VX) \subseteq Z(XY)$, then $Y(VX) \subseteq Y(XZ)$.

SD5 (Projection):

    If $Z(X) \subseteq Z(Y)$, then $Z'(X) \subseteq Z'(Y)$ for all
    $Z' \subseteq Z$.
 
Lemma 18: The above SD rules are sound.

Proof: The rule of Reflexivity is sound by Lemma 5, and the rule of
Transitivity is sound by Lemma 6. We now prove that the other rules are
also sound.

Case 1: (SD2 - Augmentation) Suppose that $ZW(X) \subseteq ZW(Y)$ holds
in a relation r. We have to show that $Z(WX) \subseteq Z(WY)$ also holds
in r. Let wxy be a WXY-value in r. Suppose that z is in $Z_r(wx)$. That
is, zwx is a ZWX-value in r. Therefore, zw is in $ZW_r(x)$ and, hence,
in $ZW_r(y)$. Thus, zwy is a ZWY-value in r, and z is in $Z_r(wy)$.

Case 2: (SD4 - Complementation) Suppose that $Z(VX) \subseteq Z(XY)$
holds in a relation r. Let vxz be a VXZ-value in r, and suppose that
$y \in Y_r(vx)$. Therefore, yvx is a YVX-value in r. But z is in
$Z_r(vx)$ and, hence, it is in $Z_r(xy)$. It follows that yxz is a
YXZ-value in r and y is in $Y_r(xz)$. Thus, $Y(VX) \subseteq Y(XZ)$ also
holds in r.

Case 3: (SD5 - Projection) It is easy to obtain a direct proof for this
case. However, note that the rule of Projection follows from the rules
of Reflexivity, Transitivity, and Complementation. []
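Soundness of SD2 and SD4 can also be spot-checked on random relations, which is a convenient sanity test when implementing the rules. The sketch below uses five single-letter attributes V, W, X, Y, Z, each standing for a one-attribute set, and counts relations in which the premise of a rule holds but its conclusion fails. The sampling scheme and function names are assumptions of this illustration, and such a test complements the proof rather than replacing it.

```python
import random
from itertools import product

def z_values(rows, Z, fixed):
    """Z_r(x): Z-values of all tuples that agree with the value `fixed`."""
    Z = sorted(Z)
    return {tuple(t[a] for a in Z) for t in rows
            if all(t[a] == v for a, v in fixed.items())}

def holds_sd(rows, Z, X, Y):
    """SD Z(X) <= Z(Y), checked on every XY-value occurring in r."""
    return all(z_values(rows, Z, {a: t[a] for a in X})
               <= z_values(rows, Z, {a: t[a] for a in Y}) for t in rows)

random.seed(1)
attrs = "VWXYZ"
universe = [dict(zip(attrs, bits))
            for bits in product((0, 1), repeat=len(attrs))]

violations = 0
for _ in range(500):
    r = random.sample(universe, random.randint(1, 6))
    # SD2: ZW(X) <= ZW(Y) implies Z(WX) <= Z(WY).
    if holds_sd(r, "ZW", "X", "Y") and not holds_sd(r, "Z", "WX", "WY"):
        violations += 1
    # SD4: Z(VX) <= Z(XY) implies Y(VX) <= Y(XZ); V and Y are disjoint here.
    if holds_sd(r, "Z", "VX", "XY") and not holds_sd(r, "Y", "VX", "XZ"):
        violations += 1
print(violations)  # 0
```

Small relations (at most six tuples) are sampled so that the premises hold reasonably often; with larger random relations the premises rarely hold and the test becomes nearly vacuous.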
 
Lemma 19: The rules of Reflexivity, Augmentation, Transitivity, and
Complementation are complete for MVD's.

Proof: We will show that the SD rules imply the MVD rules of Section 2.

Case 1: (MVD0 - Complementation) Let V be the attributes which are not
in X or Y (i.e., V = U - X - Y), and let W be the attributes which are
not in X or Z. We have to show that $V(X) \subseteq V(XY)$ implies
$W(X) \subseteq W(XZ)$. By SD4, $V(X) \subseteq V(XY)$ implies
$Y'(X) \subseteq Y'(XV)$, where Y' = Y - X. But XV = XZ and Y' = W, and
so we are done.

Case 2: (MVD1 - Reflexivity) Let Z be the attributes that are not in X.
By SD1 (Reflexivity), $Z(X) \subseteq Z(X)$. By Corollary 8,
$Z(X) \subseteq Z(X)$ is equivalent to $X \twoheadrightarrow Y$, where
$Y \subseteq X$.

Case 3: (MVD2 - Augmentation) Let Z be the attributes that are not in X
or Y, so that $X \twoheadrightarrow Y$ is equivalent to
$Z(X) \subseteq Z(XY)$. Let $\bar{W}$ be the intersection of Z and W,
and Z' = Z - W. By SD2 (Augmentation),
$Z'(X\bar{W}) \subseteq Z'(XY\bar{W})$. By reflexivity,
$Z'(XW) \subseteq Z'(X\bar{W})$ and so by transitivity,
$Z'(XW) \subseteq Z'(XY\bar{W})$. Since all the attributes are contained
in XYZ, $XY\bar{W}$ must equal XYW. Thus $Z'(XW) \subseteq Z'(XYW)$.
Since XYWZ' are all the attributes, by Corollary 8,
$Z'(XW) \subseteq Z'(XYW)$ is equivalent to $XW \twoheadrightarrow YV$,
where $V \subseteq W$.
 
Case 4: (MVD3 - Transitivity) Let V = U - X - Y and W = U - Y - Z. We
write Z as $\bar{Z} Z_x Z_y$, where $\bar{Z} = Z - X - Y$,
$Z_x = Z \cap X$, and $Z_y = Z \cap Y$. Let $T = U - X - \bar{Z}$ (note
that T is the complement of X and Z - Y). The MVD
$X \twoheadrightarrow Y$ is equivalent to $V(X) \subseteq V(XY)$. Since
$\bar{Z}$ is contained in V, by projection,

(1) $\bar{Z}(X) \subseteq \bar{Z}(XY)$

The MVD $Y \twoheadrightarrow Z$ is equivalent to
$W(Y) \subseteq W(YZ)$ and, by complementation, it implies
$\bar{Z}Z_x(Y) \subseteq \bar{Z}Z_x(YW)$. By augmentation,

(2) $\bar{Z}(Z_x Y) \subseteq \bar{Z}(YWZ_x)$

But $Z_x Y$ is contained in XY and, by reflexivity and transitivity, (2)
implies

(3) $\bar{Z}(XY) \subseteq \bar{Z}(YWZ_x)$

By applying transitivity to (1) and (3),

(4) $\bar{Z}(X) \subseteq \bar{Z}(YWZ_x)$

But $YWZ_x\bar{Z}$ contains all the attributes, and by complementation
(4) implies

(5) $T(X) \subseteq T(X\bar{Z})$

Since (5) is equivalent to the MVD $X \twoheadrightarrow Z - Y$, we are
done. []
 
 References 
 
[Arm] Armstrong, W. W., "Dependency Structures of Database
      Relationships," Proc. IFIP 74, North Holland, 1974, pp. 580-583.

[ABU] Aho, A. V., C. Beeri, and J. D. Ullman, "The Theory of Joins in
      Relational Databases," to appear in ACM Trans. on Database
      Systems.

[Bel] Beeri, C., "On the Membership Problem for Multivalued
      Dependencies in Relational Databases," to appear in ACM Trans. on
      Database Systems.

[BB]  Beeri, C., and P. A. Bernstein, "Computational Problems Related
      to the Design of Normal Form Relational Schemas," ACM Trans. on
      Database Systems, Vol. 4, No. 1 (March, 1979), pp. 30-59.

[BFH] Beeri, C., R. Fagin, and J. H. Howard, "A Complete Axiomatization
      for Functional and Multivalued Dependencies in Database
      Relations," Proc. ACM SIGMOD Int. Conf. on Management of Data,
      Toronto, Aug., 1977, pp. 47-61.

[Bi1] Biskup, J., "On the Complementation Rule for Multivalued
      Dependencies in Database Relations," Acta Informatica, Vol. 10,
      No. 3 (1978), pp. 297-305.

[Bi2] Biskup, J., "A New Completeness Result for Multivalued
      Dependencies," to appear in Theoretical Computer Science.

[Cod] Codd, E. F., "A Relational Model for Large Shared Data Banks,"
      Comm. ACM, Vol. 13, No. 6 (June, 1970), pp. 377-387.

[Fa1] Fagin, R., "Multivalued Dependencies and a New Normal Form for
      Relational Databases," ACM Trans. on Database Systems, Vol. 2,
      No. 3 (Sept., 1977), pp. 262-278.

[Fa2] Fagin, R., "Functional Dependencies in a Relational Database and
      Propositional Logic," IBM J. of Res. and Dev., Vol. 21, No. 6
      (Nov., 1977), pp. 534-544.

[HIT] Hagihara, K., M. Ito, K. Taniguchi, and T. Kasami, "Decision
      Problems for Multivalued Dependencies in Relational Databases,"
      SIAM J. Computing, Vol. 8, No. 2 (May 1979), pp. 247-264.

[Mak] Makinouchi, A., "A Consideration on Normal Form of
      Not-Necessarily-Normalized Relation in the Relational Database
      Model," Proc. Third Inter. Conf. on Very Large Data Bases, Tokyo,
      Japan, Oct. 1977, pp. 447-453.

[Men] Mendelzon, A. O., "On Axiomatizing Multivalued Dependencies in
      Relational Databases," J. ACM, Vol. 26, No. 1 (Jan. 1979), pp.
      37-44.

[Nic] Nicolas, J. M., "First Order Logic Formalization for Functional,
      Multivalued and Mutual Dependencies," Proc. ACM-SIGMOD Inter.
      Conf. on Management of Data, Austin, Texas, June 1978, pp. 40-46.

[Sag] Sagiv, Y., "An Algorithm for Inferring Multivalued Dependencies
      that Works Also for a Subclass of Propositional Logic,"
      UIUCDCS-R-79-954, Dept. of Comp. Sci., University of Illinois at
      Urbana-Champaign, Urbana, Illinois, Jan., 1979.

[SaF] Sagiv, Y., and R. Fagin, "An Equivalence Between Relational
      Database Dependencies and a Subclass of Propositional Logic," IBM
      Research Report RJ2500, March, 1979.

[TK1] Tanaka, K., Y. Kambayashi, and S. Yajima, "Properties of Embedded
      Multivalued Dependencies in Relational Databases," Research
      Report ER78-03, Dept. of Information Science, Kyoto University,
      Kyoto, Japan, Dec., 1978.

[TK2] Tanaka, K., Y. Kambayashi, and S. Yajima, "On the
      Representability of Decompositional Scheme Design with
      Multivalued Dependencies," Research Report ER79-01, Dept. of
      Information Science, Kyoto University, Kyoto, Japan, Jan., 1979.

[Zan] Zaniolo, C., "Analysis and Design of Relational Schemata for
      Database Systems," Tech. Rep. UCLA-ENG-7769, Dept. of Comp. Sci.,
      UCLA, July, 1976.
 