UIUCDCS-R-79-975
UILU-ENG 79 1720

STUDIES IN GRAPH ALGORITHMS: GENERATION AND LABELING PROBLEMS

by Shmuel Zaks

August 1979

Digitized by the Internet Archive in 2013
http://archive.org/details/studiesingraphal975zaks

BY SHMUEL ZAKS
B.Sc., Technion - Israel Institute of Technology, 1971
M.Sc., Technion - Israel Institute of Technology, 1972

THESIS

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 1979

Urbana, Illinois

To WeeA and Sky, my two little devils, a hard proof that Daddy could also study.

Shmuel Zaks, Ph.D.
Department of Computer Science
University of Illinois at Urbana-Champaign, 1979

It is the purpose of this thesis to study graph-theoretic problems that arise in certain practical situations. These problems deal with classes of ordered trees, with classes of undirected graphs appearing in scheduling problems, and with algorithms that label free trees.

First we study problems concerning ordered trees. We show how to generate trees in certain classes in order, how to determine the position of a given tree and how to find a tree given its position. We also show several enumeration results for the class of ordered trees with n edges.

Next we investigate a class of undirected graphs that arise in certain scheduling problems. This class is fully characterized, and several of its extensions are studied.

Last we study algorithms that label edges of a given tree with given labels, optimizing certain objective functions. For some of these functions we show polynomial-time algorithms, and for others the problem is shown to be NP-complete.

ACKNOWLEDGMENT

My stay in the Department of Computer Science has been both pleasant and educational.
For this I am especially grateful to my thesis advisor, Professor C. L. Liu, whose guidance and advice have been of major help. I am also indebted to Professor Klaus Ecker, Professor Yehoshua Perl, Dana Richards and Professor Nachum Dershowitz for their contribution to the development of this thesis, and to Professor David Muller for interesting discussions.

I would like to thank the Department of Computer Science in the University of Illinois and the National Science Foundation under grants NSF MCS 73-03408 and 77-22830 for supporting this research.

I am also thankful for having Kin-Man Chung, Mei Chung, Don Friesen, Brian Hansche, Art Liestman and Dana Richards as my officemates.

Finally, many thanks to my parents, and special thanks to Irith, my wife.

TABLE OF CONTENTS

CHAPTER                                                        PAGE

1  INTRODUCTION                                                   1
   1.1 AN OVERVIEW                                                1
   1.2 SUMMARY OF RESULTS                                         3
2  ORDERED TREES                                                  6
   2.1 INTRODUCTION                                               6
   2.2 PRELIMINARIES                                              8
   2.3 CORRESPONDENCES                                           13
   2.4 LINEAR ORDERING OF TREES                                  22
   2.5 LATTICE PATHS AND THE CYCLE LEMMA                         24
   2.6 ORDERED TREES WITH N EDGES                                28
   2.7 BIBLIOGRAPHICAL NOTES                                     32
3  GENERATING TREES: PART I                                      34
   3.1 INTRODUCTION                                              34
   3.2 RELATIONS TO EXISTING ALGORITHMS                          35
   3.3 THE GENERATING ALGORITHM                                  37
   3.4 THE RANKING ALGORITHM                                     39
   3.5 THE UNRANKING ALGORITHM                                   45
   3.6 GENERATING K-ARY TREES                                    47
   3.7 ANALYSIS OF THE GENERATING ALGORITHM                      50
   3.8 AN EXAMPLE                                                55
4  GENERATING TREES: PART II                                     59
   4.1 INTRODUCTION                                              59
   4.2 K-ARY TREES IN 6-ORDER: GENERATION                        59
   4.3 K-ARY TREES IN e-ORDER: RANKING AND UNRANKING             61
   4.4 GENERAL CLASSES OF TREES: GENERATION                      66
   4.5 GENERAL CLASSES OF TREES: RANKING AND UNRANKING           72
5  A GRAPH LABELING PROBLEM                                      77
   5.1 INTRODUCTION                                              77
   5.2 PRELIMINARIES                                             80
   5.3 1-LABELED GRAPHS                                          83
   5.4 S-LABELED GRAPHS                                          94
6  EDGE LABELINGS FOR TREES                                     105
   6.1 INTRODUCTION                                             105
   6.2 PRELIMINARIES                                            106
   6.3 DIAMETER AND RADIUS: NP-COMPLETE RESULTS                 108
   6.4 RADIUS: POLYNOMIAL RESULTS                               113
   6.5 AVERAGE MEASUREMENTS: POLYNOMIAL RESULTS                 117
   6.6 OPEN PROBLEMS                                            124
LIST OF REFERENCES                                              125
VITA                                                            129

CHAPTER 1

INTRODUCTION

1.1 AN OVERVIEW

It is the purpose of this thesis to study graph-theoretic problems which arise in certain practical situations. These problems deal with classes of ordered trees, with classes of undirected graphs appearing in scheduling problems, and with algorithms that label free trees.

First we study problems concerning ordered trees. For example, suppose we have an algorithm that operates on a certain class S of ordered trees, and we want to measure its performance empirically. For this we might need to run the algorithm on all the trees in S, or (in case S contains too many trees) to run it on a random sampling of trees from S. In both cases we first have to define a linear ordering in S, and then either generate all the trees, one by one, according to this ordering, or generate the r-th tree in S for some random number r. As another example, suppose we deal with a function f, defined on trees in S, and suppose f depends on some property P of the tree (say, the number of leaves). To compute the average of f over S we need to study the average of P over S (the expected number of leaves in the above example).

These problems are studied in chapters 2, 3 and 4. The first problem is studied when S is the set of regular binary trees with n internal nodes, the set of k-ary trees with n internal nodes, and the set of ordered trees with n_i internal nodes of degree k_i, i = 1, 2, ..., t, for some t; all of this is treated in chapters 3 and 4. The second problem is studied in chapter 2, where S is the set of ordered trees with n edges.
The discussion of this is based on [Zaks, 1977a], [Zaks, 1977b], [Zaks and Richards, 1977] and [Dershowitz and Zaks, 1979].

Next we study classes of undirected graphs which arise in certain scheduling problems. Suppose we have n tasks t_1, t_2, ..., t_n to be executed on two identical processors that share the same memory, and that task t_i requires a fraction c_i of the memory for its execution. Two distinct tasks t_i and t_j can be executed simultaneously if their total memory requirement does not exceed the available amount of memory, namely if c_i + c_j ≤ 1. This situation can be described by a graph in which the vertex n_i is associated with task t_i, and there is an edge between n_i and n_j if c_i + c_j ≤ 1, where c_i is the label of n_i. The class of graphs that can be obtained in this way is investigated. Extending the scheduling problem to the case when the two processors share more than one resource gives rise to new classes of graphs. All of this is the subject of chapter 5, based on [Ecker and Zaks, 1977].

Last we study algorithms that label edges of free trees. Suppose we have n+1 users and a tree-like communication network (with n connections) connecting them, and we are given n communication lines of different 'weights' w_1, w_2, ..., w_n. Assign the weights to the edges, and define the communication cost c(i,j) between users i and j as the sum of the weights on the (unique) path connecting them. We want to determine an assignment that minimizes (or maximizes) the largest of the c(i,j)'s, or one that minimizes (or maximizes) the average of the c(i,j)'s. These problems, and other related ones, are discussed in chapter 6, based on [Zaks and Perl, 1978].

1.2 SUMMARY OF RESULTS

1. Ordered trees (chapters 2-4)

We show one-to-one correspondences among the regular binary trees with n internal nodes, the ordered trees with n edges, the 0,1 sequences of n 1's and n 0's in which the number of 1's in each prefix is not smaller than the number of 0's, the lattice paths from (0,0) to (n,n) which do not go below the diagonal y=x, and other classes of combinatorial objects. Linear ordering of trees is discussed, and relations to lexicographic ordering of the corresponding sequences are studied.

Using these correspondences we find several enumeration results concerning the class of ordered trees with n edges. For example, we show that $\frac{r}{n}\binom{2n-r-1}{n-1}$ of these trees have root degree r, that the expected root degree for this class is $\frac{3n}{n+2}$, and that the number of nodes of degree d in this class is $\binom{2n-d-1}{n-1}$. All of this is the subject of chapter 2.

In chapter 3 we develop an algorithm that generates, in lexicographic order, the sequences corresponding to regular binary trees with n internal nodes; we show how to determine the position of a sequence and how to determine the sequence given its position in this ordering. The generating algorithm is then generalized to k-ary trees, and is compared to existing related algorithms. The complexity of the generating algorithm is measured by the number of comparisons it makes. We show that the algorithm that generates sequences corresponding to k-ary trees with n internal nodes uses, on the average, less than two comparisons per sequence (independent of n and k).

In chapter 4 these results are extended as follows: First the generating, ranking and unranking algorithms are studied for k-ary trees, using a different ordering of these sequences, and then all the discussion of chapter 3 (generating, ranking and unranking algorithms for binary trees) is extended to the classes of ordered trees with n_i internal nodes of degree k_i, i = 1, 2, ..., t, for some t.

2. A graph labeling problem (chapter 5)

The classes GR^k, k > 0, of undirected graphs are defined as follows: GR^k contains all the graphs G=(V,E) for which there exists a labeling of the vertices in V with vectors in [0,1]^k such that there is an edge connecting two vertices if and only if the sum of their labels does not exceed (1,1,...,1). Such a labeling is called a k-labeling.

The class GR^1 is fully characterized, and is shown to coincide with the class of threshold graphs. It is shown that there are 2^(n-1) non-isomorphic graphs in GR^1 with n vertices, and the proof provides a simple and elegant way to generate all those graphs and, given a graph G, to decide whether it is in GR^1.

The classes GR^k, k > 1, are then studied and several results and examples are explored. It is shown that GR^i is properly included in GR^(i+1) for each i. It is known that, given a graph G and an integer k, the problem of determining whether G is in GR^k is NP-complete. Therefore it seems that a simple characterization of the classes GR^k, k > 1, is not likely to exist. For example, although the class GR^1 can be characterized by a finite number of forbidden induced subgraphs, this is not the case for GR^2.

3. Optimal labelings for trees (chapter 6)

Given a free tree T with n edges and a set W of n weights, we study algorithms for labeling the edges of T with weights from W that optimize certain objective functions. Some of these problems are shown to be NP-complete, and for others we present polynomial-time algorithms. For example, we prove that determining whether a given tree can be labeled with a given set of weights such that its diameter will not be larger than a given number is an NP-complete problem; the same problem remains NP-complete when 'diameter' is replaced by 'radius'. Other related NP-complete results are also derived.
The problem of minimizing or maximizing the average distance between the vertices of T, given T and W as above, is solvable in polynomial time, as are other related problems.

CHAPTER 2

ORDERED TREES

2.1 INTRODUCTION

Ordered trees play a key role in theoretical computer science. For example, derivation trees are used quite often. In the next three chapters we discuss several problems concerning these trees.

In certain practical situations we are concerned with trees from a certain class S. Suppose we have an algorithm that operates on trees in S, and we want to measure its performance. For this we might like to run the algorithm on all trees in S, or (if S is too large) to run it on some randomly chosen trees. In both cases we first might want to define a linear ordering on S, and then either generate all trees or generate the r-th tree, for a random number r, according to this ordering.

As another related situation, suppose we are looking for trees with given properties in S. If we don't have a better way, we might proceed as follows: define a linear ordering on S, and then generate the trees, one by one, according to this ordering, until we either find all the trees we are looking for or reach the last tree.

As yet another situation, we might like to study certain properties of a tree chosen randomly from S; for example, we might want to know the expected number of leaves in such a tree, if we are dealing with a cost function that depends on this quantity.

In this chapter we present ordered trees and discuss several one-to-one correspondences among certain classes of these trees, integer sequences, lattice paths and other combinatorial objects. Most of this chapter reviews results of a combinatorial flavor. We bring it here in order to keep the thesis self-contained. In the following two chapters we will discuss the problem of generating trees.
We discuss one-to-one correspondences among the regular binary trees with n internal nodes, the ordered trees with n edges, the 0,1 sequences with n 1's and n 0's in which in any prefix the number of 1's is not smaller than the number of 0's, the set of lattice paths from (0,0) to (n,n) which do not go below the diagonal y=x, and other combinatorial objects. Linear orderings of trees are discussed, and the connection between them and the lexicographic ordering of the corresponding integer sequences is investigated. We then study the Cycle Lemma, and use it to enumerate certain lattice paths.

At the end of the chapter we give enumeration results concerning the class of trees with n edges. For example, we show the following: the number of such trees which have a root of degree r is $\frac{r}{n}\binom{2n-r-1}{n-1}$, the expected root degree in this class is $\frac{3n}{n+2}$, and the total number of nodes of degree d in these trees is $\binom{2n-d-1}{n-1}$. We also present two new proofs for the number $\frac{1}{n}\binom{n}{k}\binom{n}{k-1}$ of trees with n edges and k leaves.

Basic notions are given in section 2.2. A detailed discussion of the one-to-one correspondences is the subject of section 2.3. Section 2.4 deals with the ordering of trees, and section 2.5 discusses the Cycle Lemma and the enumeration of lattice paths. The enumeration results about ordered trees are presented in section 2.6. As mentioned before, most of this chapter is a review; several references on these subjects are cited in section 2.7.

2.2 PRELIMINARIES

We deal with the combinatorial structures known as ordered trees. We state here definitions, notations and properties that we will use in the sequel. Our terminology mainly follows [Knuth, 1968] and [Liu, 1977].

An ordered tree has a distinguished node r, called the root of the tree, and m subtrees t_1, t_2, ..., t_m, m ≥ 0, of the root r, where each t_i is also an ordered tree. It should be emphasized that the relative order of the subtrees is important.
The roots of the subtrees are sons of the root r, and r is their father. The degree of a node in a tree is the number of its sons, and it resides on level ℓ, where ℓ is its distance (number of edges separating it) from the root. The height of a tree is the maximal level of any of its nodes. A node of degree 0 is called a leaf, otherwise it is an internal node. A k-ary tree is an ordered tree in which each node has at most k sons; in a binary tree (k=2) each node has 0, 1 or 2 sons. A k-ary tree is called regular if each internal node has exactly k sons; in a regular binary tree each internal node has exactly two sons, a left son and a right son.

Figure 2.1 presents an ordered tree with 4 internal nodes and 5 leaves; it is a non-regular ternary (k=3) tree of height 3. The root a has degree 3, and the rest of the internal nodes b, e and f have degrees 2, 2 and 1, respectively. The leaves are c, d, g, h and i, and the levels of the nodes are shown in that figure.

T(k,n) denotes the set of regular k-ary trees with n internal nodes, and t(k,n) is the number of these trees. B_n = T(2,n) and b_n = t(2,n) stand for the corresponding entities in the binary case. T_n denotes the set of ordered trees with n edges. The regular binary trees with n ≤ 3 internal nodes, and the ordered trees with n ≤ 3 edges, are shown in figures 2.2 and 2.3, respectively; these trees for n=4 are shown in section 3.8.

[Figure 2.1: An ordered tree with eight edges (levels 0 to 3 marked)]

The sizes of the sets of trees T_n, or B_n, form the series 1, 1, 2, 5, 14, 42, ... These are the well-known Catalan numbers C_n:

$$C_n = |B_n| = |T_n| = \frac{1}{n+1}\binom{2n}{n}.$$

There are many sets of combinatorial objects whose cardinality is C_n, and some of them are mentioned in section 2.3. It is also known that the number of regular k-ary trees with n internal nodes, t(k,n), is given by

$$t(k,n) = \frac{1}{(k-1)n+1}\binom{kn}{n},$$

of which the Catalan numbers are a special case (see [Knuth, 1968, p. 584]).

[Figure 2.2: Regular binary trees with n ≤ 3 internal nodes]

[Figure 2.3: Ordered trees with n ≤ 3 edges]

The following sequence a(T), associated with a tree T, will be often used: label each node with its degree, and read these labels in preorder (root-left-right), omitting the last leaf. An example is shown in figure 2.4.

[Figure 2.4: An ordered tree T and the corresponding sequence a(T) = 40203000 1]

2.3 CORRESPONDENCES

We present now several one-to-one correspondences between certain sets of ordered trees, lattice paths, integer sequences and parenthetic expressions. A list of these sets follows:

B_n   the set of regular binary trees with n internal nodes,

B'_n  the set of binary trees with n nodes,

T_n   the set of ordered trees with n edges,

X_n   the set of 0,1 sequences with n 1's and n 0's, such that in each prefix the number of 1's is not smaller than the number of 0's; this property of a 0,1 sequence is called the dominating property,

Z_n   the set of integer sequences z = {z_i}, i = 1, ..., n, such that 0 < z_1 < z_2 < ... < z_n and z_i ≤ 2i-1 for each i,

A_n   the set of integer sequences a = {a_i}, i = 1, ..., n, such that $\sum_{j=1}^{n} a_j = n$ and $\sum_{j=1}^{i} a_j \ge i$ for each i. Note that A_n is a subset of the set of all partitions (p_1, p_2, ..., p_n) of an integer n (where each p_i ≥ 0 and $\sum_i p_i = n$),

L_n   the set of lattice paths from the point (0,0) to (n,n), which do not pass below the diagonal y=x (in a lattice path all steps are either up or to the right); a lattice path is called legal if it does not go below the diagonal y=x,

P_n   the set of parenthetic expressions consisting of n open and n close parentheses, in which each open parenthesis has a matching close parenthesis after it; such an expression is called legal.

We now illustrate these correspondences in a series of figures and short explanations; more references can be found in section 2.7.
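The definitions of X_n and Z_n above can be checked mechanically for small n. The following is a minimal brute-force sketch, not part of the thesis; the function names `is_dominating`, `sequences_X`, `in_Z` and `catalan` are hypothetical, and the code simply tests the defining conditions and compares the size of X_n with the Catalan numbers.

```python
from itertools import product
from math import comb


def is_dominating(x):
    """True if every prefix of the 0/1 sequence x has at least as many 1's as 0's."""
    balance = 0
    for bit in x:
        balance += 1 if bit == 1 else -1
        if balance < 0:
            return False
    return True


def sequences_X(n):
    """All sequences of n 1's and n 0's with the dominating property (brute force)."""
    return [x for x in product((0, 1), repeat=2 * n)
            if sum(x) == n and is_dominating(x)]


def in_Z(z):
    """Membership test for Z_n: 0 < z_1 < ... < z_n and z_i <= 2i - 1 for each i."""
    increasing = all(a < b for a, b in zip(z, z[1:]))
    bounded = all(1 <= z_i <= 2 * i - 1 for i, z_i in enumerate(z, start=1))
    return increasing and bounded


def catalan(n):
    """C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


# The sizes of X_0, ..., X_5 agree with the Catalan numbers.
sizes = [len(sequences_X(n)) for n in range(6)]
```

The brute-force enumeration examines all 2^(2n) sequences, so it is only meant for checking small cases.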
We start with a tree T in B_8 (figure 2.5).

[Figure 2.5: A regular binary tree T in B_8]

If we remove all the leaves and the edges incident with them, we get a tree T' in B'_8 (figure 2.6). This correspondence between B_n and B'_n is one-to-one (see [Knuth, 1968, p. 559]).

If we label each internal node of T with 1 and each leaf with 0, and read these labels in preorder, except for the last leaf, we get a sequence of 8 1's and 8 0's, x(T), in X_8. (See figure 2.7.) The correspondence between B_n and X_n is discussed in various references (see, for example, [DeBruijn and Morselt, 1967] or [Knuth, 1973, p. 63]).

[Figure 2.6: A binary tree T' in B'_8]

[Figure 2.7: A 0,1 sequence in X_8: x(T) = 1101001100111000]

If we replace 1 and 0 in x(T) by open and close parentheses, we obtain a legal parenthetic expression p(T) in P_8 (figure 2.8). The one-to-one correspondence between X_n and P_n is immediate.

[Figure 2.8: A parenthetic expression in P_8: p(T) = [[][]][[]][[[]]]]

From the sequence x(T) we build the sequence z(T), where z_i is the position of the i-th 1 in x(T); this sequence is in Z_8 (see figure 2.9). The one-to-one correspondence between X_n and Z_n is discussed in [Zaks, 1977a] and follows from the discussion in [Knuth, 1973, p. 63]. When no confusion occurs, we use x and z for x(T) and z(T), respectively.

[Figure 2.9: An integer sequence in Z_8: z(T) = 1 2 4 7 8 11 12 13]

This 0,1 sequence x(T) can also be obtained by traversing the ordered tree t(T) from T_8, shown in figure 2.10, when 1 means 'going down' and 0 means 'going up'. It is clear that such a sequence has the dominating property, since in each step the total number of 'going down's must be not smaller than the total number of 'going up's; it is also clear that the number of steps down is equal to the number of steps up. That this correspondence is one-to-one follows from the above observation (see, for example, [DeBruijn and Morselt, 1967]).
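The conversions described so far are simple enough to sketch in code. The following is an illustrative sketch only; the helper names (`x_to_z`, `z_to_x`, `x_to_parens`, `x_to_tree`, `tree_to_x`) and the nested-list representation of an ordered tree are assumptions made here, not notation from the thesis.

```python
def x_to_z(x):
    """z_i is the (1-based) position of the i-th 1 in the 0/1 string x."""
    return [pos for pos, bit in enumerate(x, start=1) if bit == "1"]


def z_to_x(z, n):
    """Inverse of x_to_z for a sequence z with n entries."""
    ones = set(z)
    return "".join("1" if pos in ones else "0" for pos in range(1, 2 * n + 1))


def x_to_parens(x):
    """Replace 1 by an open bracket and 0 by a close bracket."""
    return x.replace("1", "[").replace("0", "]")


def x_to_tree(x):
    """Build an ordered tree (a node is the list of its sons) by walking x:
    1 = go down to a new rightmost son, 0 = go back up to the father."""
    root = []
    path = [root]
    for bit in x:
        if bit == "1":
            son = []
            path[-1].append(son)
            path.append(son)
        else:
            path.pop()
    return root


def tree_to_x(tree):
    """Traverse the ordered tree, writing 1 on the way down and 0 on the way up."""
    return "".join("1" + tree_to_x(son) + "0" for son in tree)


x = "1101001100111000"          # x(T) from figure 2.7
z = x_to_z(x)                   # the sequence of figure 2.9
p = x_to_parens(x)              # the expression of figure 2.8
t = x_to_tree(x)                # the ordered tree t(T); its root has 3 sons
```

Each conversion runs in time linear in the length of x, and `tree_to_x(x_to_tree(x))` returns x unchanged, which illustrates that the traversal correspondence is one-to-one.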
[Figure 2.10: An ordered tree t(T) in T_8]

Another way of interpreting this 0,1 sequence x(T) is via lattice paths: x(T) corresponds to a lattice path ℓ(T) from (0,0) to (8,8), when 1 means 'one step up' and 0 means 'one step to the right' (see figure 2.11). This path has the property that it does not pass below the diagonal y=x; in other words, ℓ(T) is in L_8. The one-to-one correspondence between X_n and L_n follows immediately.

[Figure 2.11: A lattice path ℓ(T) in L_8, from (0,0) to (8,8)]

The sequence a(T) was previously defined for any tree T. Applying it to the tree t(T), we obtain an integer sequence a(t(T)) in A_8. That this correspondence is one-to-one is the subject of theorem 4.4 in chapter 4; see also [Chorneyko and Mohanty, 1975]. See figure 2.12.

[Figure 2.12: A sequence in A_8: a(t(T)) = 3 2 0 0 1 0 1 1]

This sequence a(t(T)) = 32001011 corresponds also to another lattice path ℓ'(T): the one in which we go up 3 steps on the first column, then 2 steps on the second column, then 0 steps on the third column, etc.

[Figure 2.13: Another lattice path ℓ'(T) in L_8, from (0,0) to (8,8)]

Since the sum of the first i elements in a(t(T)) is not less than i, it follows that ℓ'(T) does not go below the diagonal y=x, hence it is in L_8. This correspondence is clearly one-to-one. (See figure 2.13.)

Notes:

1. The correspondence between T_n and B'_n is, in fact, the one described in [Knuth, 1968, p. 333].

2. The correspondence between B_n and L_n, as discussed above, is later extended to a correspondence between T(k,n) (regular k-ary trees with n internal nodes) and L(k,n) ('legal' lattice paths from (0,0) to ((k-1)n, n)).

The following properties follow directly from these correspondences:

Lemma 2.1: Let T ∈ B_n. Then

(1) The number of leaves in the tree t(T) = the number of 10 patterns in x(T) = the number of corners (i.e., path segments consisting of a step up followed by a step to the right) in ℓ(T) = the number of left leaves in T.
(2) The number of internal nodes in the tree t(T) = one more than the number of 11 (or equivalently 00) patterns in x(T) = one more than the number of path segments consisting of two consecutive vertical steps (or equivalently two consecutive horizontal steps) in ℓ(T) = the number of right leaves in T.

(3) The number of nodes of degree d in the tree t(T) = the number of occurrences of d in a(t(T)) = the number of vertical path segments of length exactly d in ℓ'(T).

(4) The number of nodes of degree d in all the trees in T_n = the number of occurrences of runs of exactly d 1's (or equivalently d 0's) in the sequences in X_n.

The following encoding of a tree will be used in discussing ordering of trees in section 2.4: label each node with the size (number of nodes) of the tree rooted at this node, and then read these labels in preorder. For T (figure 2.5) we get the sequence s(T), shown in figure 2.14.

[Figure 2.14: A sequence of subtree sizes: s(T) = 17 5 1 3 1 1 11 3 1 1 7 5 3 1 1 1 1]

The following two additional encodings are also used in the literature: [Ruskey and Hu, 1977] represent a tree in B_n by the sequence of the levels of its leaves (see figure 2.15). In [Knuth, 1968, 2.2.1], [Trojanowski, 1977a,b] and [Knott, 1977] the numbers 1, 2, ..., n are used to label the internal nodes of T in some order (say, preorder), and then these labels are read in a different order (say, inorder: left-root-right). We get in this way a permutation of the numbers 1, 2, ..., n (known as a stack permutation; see also [Rotem and Varol, 1977]). See the example in figure 2.16.

[Figure 2.15: Ruskey and Hu's level numbers; the sequence: 2 3 3 3 3 5 5 4 3]

[Figure 2.16: A stack permutation (preorder labeling, inorder reading); the permutation: 2 3 1 5 4 8 7 6]

2.4 LINEAR ORDERING OF TREES

Our goal in the next two chapters is to generate trees in order.
In order to define a linear ordering for a certain class of trees, we can first establish an encoding of these trees into certain sequences, and then use the lexicographic ordering of sequences to define the ordering for the trees.

The lexicographic ordering for integer sequences is defined as follows: let u = {u_i} and v = {v_i}, i = 0, 1, ..., n; we say that u < v if u_0 < v_0 or if there exists j, 1 ≤ j ≤ n, such that u_i = v_i for i = 0, 1, ..., j-1 and u_j < v_j. Regarding u and v as English words, u < v means that u precedes v in an English dictionary.

If instead of comparing u and v from left to right (starting with u_0 and v_0) we compare them from right to left (starting with u_n and v_n), we get a different lexicographic ordering. For obvious reasons we call this the Hebrew lexicographic ordering (if one prefers, he might replace 'Hebrew' with 'Arabic' ...). For example, 1256 < 1346 according to the English ordering, but 1256 > 1346 according to the Hebrew ordering.

We now define recursively a-order and a'-order for trees; r_T denotes the degree of the root of T, and the T_i's are the subtrees of the root.

Definition (a-order): Given ordered trees T and T', we say that T <_a T' if
(1) r_T < r_T', or
(2) r_T = r_T', and for some i, 1 ≤ i ≤ r_T, we have
    (a) T_j = T'_j for j = 1, 2, ..., i-1, and
    (b) T_i <_a T'_i.

Definition (a'-order): Given ordered trees T and T', we say that T <_a' T' if
(1) |T| < |T'| (|T| is the number of nodes in T), or
(2) |T| = |T'|, and for some i, 1 ≤ i ≤ r_T, we have
    (a) T_j = T'_j for j = 1, 2, ..., i-1, and
    (b) T_i <_a' T'_i.

Both these orderings are special cases of the ordering in [Knuth, 1968, p. 331].
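The two lexicographic orderings and the recursive a-order can be illustrated in a few lines. This is a hedged sketch rather than anything from the thesis: the names `english_less`, `hebrew_less` and `a_less` are hypothetical, sequences are assumed to have equal length, and a tree is represented as a tuple of its subtrees (a leaf is the empty tuple).

```python
def english_less(u, v):
    """u < v when the sequences are compared from left to right."""
    return list(u) < list(v)


def hebrew_less(u, v):
    """u < v when the sequences are compared from right to left."""
    return list(reversed(u)) < list(reversed(v))


def a_less(t, u):
    """a-order on trees given as tuples of subtrees: compare the root degrees
    first; if they agree, the first pair of differing subtrees decides."""
    if len(t) != len(u):
        return len(t) < len(u)
    for t_i, u_i in zip(t, u):
        if t_i != u_i:
            return a_less(t_i, u_i)
    return False  # the two trees are identical


u, v = (1, 2, 5, 6), (1, 3, 4, 6)   # the example 1256 versus 1346

path = (((),),)      # a path with two edges: the root has one son
cherry = ((), ())    # a root with two leaf sons
```

With these definitions `english_less(u, v)` and `hebrew_less(v, u)` both hold, matching the 1256 versus 1346 example, and `a_less(path, cherry)` holds because the root of the path has the smaller degree.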
The relation between the orderings of T and T' and their corresponding sequences a(T), a(T'), s(T) and s(T') is the following:

Ordering Lemma: For ordered trees T and T'
(1) T <_a T' if and only if a(T) < a(T'), and
(2) T <_a' T' if and only if s(T) < s(T').

[Figure 2.17: Ordering of trees; two trees with T >_a' T' (note: s(T) > s(T')) and T <_a T' (note: a(T) < a(T'))]

A third ordering will be used in the first half of chapter 4:

Definition (e-order): Given ordered trees T and T', we say that T <_e T' if
(1) r_T = r_T', and for some i, 1 ≤ i ≤ r_T, we have
    (a) T_j = T'_j for j = r_T, r_T - 1, ..., r_T - i + 2, and
    (b) T_(r_T - i + 1) <_e T'_(r_T - i + 1), or
(2) r_T < r_T'.

It follows from the above discussion that t(T) < t(T') if and only if x(T) < x(T'), or equivalently z(T) < z(T'), by the Hebrew lexicographic ordering, for T, T' in B_n.

2.5 LATTICE PATHS AND THE CYCLE LEMMA

The following lemma turns out to be very useful; it goes back to [Motzkin, 1948], and has since been rediscovered a number of times and generalized (see [Raney, 1960], [Takács, 1967], [Bergman, 1978] and [Singmaster, 1978]).

Cycle Lemma: For any 0,1 sequence a_1 a_2 ... a_(m+n) of n 1's and m 0's, n > m, there exist exactly n-m cyclic permutations a_j a_(j+1) ... a_(m+n) a_1 ... a_(j-1) that have the strong dominating property. (A 0,1 sequence is said to have the strong dominating property if in each prefix the number of 1's is greater than the number of 0's.)

Proof: Arrange the given sequence a_1 a_2 ... a_(m+n) around a circle. It is clear that removing a 10 pattern does not affect the number of cyclic permutations that have the strong dominating property. We can always remove m such pairs. The remaining n-m 1's are the positions at which a cyclic permutation that has this property can start. □

This proof is similar to the one given in [Singmaster, 1978]; for more proofs see [Zaks and Dershowitz, 1979].

For example, given the sequence 0010110111, we arrange it around a circle (see figure 2.18).
After removing the 10 pairs (shown in double arrows) we have two 1's left (pointed to by an arrow), and the two cyclic permutations that start at these two 1's and have the strong dominating property are 1101110010 and 1110010110.

[Figure 2.18: Cyclic permutations]

This lemma is now used to count the number a(i,j) of lattice paths from (0,0) to (i,j), j ≥ i, that do not pass below the diagonal y=x. (See, for example, [Yaglom and Yaglom, 1964, problem 83].) As was shown before, these paths correspond to sequences of j 1's and i 0's that have the dominating property. If we attach a 1 in front of each such sequence we get a sequence that has the strong dominating property. There are $\binom{j+i+1}{i}$ possible linear arrangements of the j+1 1's and i 0's. Out of the j+i+1 cyclic permutations of each sequence exactly j+1-i have the strong dominating property (by the Cycle Lemma), hence the number of legal sequences, or equivalently the number of legal lattice paths, is given by

$$a(i,j) = \frac{j-i+1}{j+i+1}\binom{j+i+1}{i}. \qquad (*)$$

For i=j=n we get the number of legal paths from (0,0) to (n,n), namely the Catalan numbers (see p. 9). Figure 2.19 demonstrates these numbers.

    j=6:   1   6  20  48  90 132 132
    j=5:   1   5  14  28  42  42
    j=4:   1   4   9  14  14
    j=3:   1   3   5   5
    j=2:   1   2   2
    j=1:   1   1
    j=0:   1
         i=0   1   2   3   4   5   6

[Figure 2.19: The lattice numbers a(i,j)]

This counting can also be done using the reflection principle (see, for example, [Yaglom and Yaglom, 1964, problem 83] or [Gnedenko, 1962, p. 36]), from which it follows that

$$a(i,j) = \binom{i+j}{i} - \binom{i+j}{i-1};$$

the first term stands for the total number of paths from (0,0) to (i,j), and the second term stands for those paths that do pass below the diagonal, and one can show that these forbidden paths are in a one-to-one correspondence with the paths from (1,-1) to (i,j) (using a reflection with respect to y=x-1).

It is clear that the sequence a(i,j) can also be defined recursively as follows:

$$a(i,j) = \begin{cases} 1 & \text{for } i = 0,\ j \ge 0, \\ 0 & \text{for } j < i, \\ a(i,j-1) + a(i-1,j) & \text{otherwise.} \end{cases} \qquad (**)$$

Figure 2.19 demonstrates this fact.
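The Cycle Lemma example and the three descriptions of a(i,j) can be cross-checked numerically. The sketch below is illustrative only; the function names are hypothetical, and the closed form and the reflection formula are coded as reconstructed here from the surrounding derivation (an assumption, since the printed formulas are damaged in the source).

```python
from functools import lru_cache
from math import comb


def strongly_dominating_rotations(s):
    """All cyclic permutations of the 0/1 string s in which every prefix
    contains more 1's than 0's."""
    found = []
    for start in range(len(s)):
        rotation = s[start:] + s[:start]
        balance = 0
        for bit in rotation:
            balance += 1 if bit == "1" else -1
            if balance <= 0:
                break
        else:
            found.append(rotation)
    return found


@lru_cache(maxsize=None)
def a_rec(i, j):
    """Recursive definition (**) of the number of legal paths from (0,0) to (i,j)."""
    if j < i:
        return 0          # the point lies below the diagonal
    if i == 0:
        return 1          # only the straight path up the first column
    return a_rec(i, j - 1) + a_rec(i - 1, j)


def a_closed(i, j):
    """Closed form (*), as reconstructed: (j-i+1)/(j+i+1) * binom(j+i+1, i), for j >= i."""
    return (j - i + 1) * comb(i + j + 1, i) // (i + j + 1)


def a_reflect(i, j):
    """Reflection principle: all paths minus the paths from (1,-1) to (i,j)."""
    return comb(i + j, i) - (comb(i + j, i - 1) if i > 0 else 0)


rotations = strongly_dominating_rotations("0010110111")
top_row = [a_rec(i, 6) for i in range(7)]
```

For the example sequence the two rotations found are the ones quoted in the text, and `top_row` reproduces the row j=6 of figure 2.19.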
It is also clear that the number of legal paths from (i,j) to (n,n) is equal to the number of those from (0,0) to (n-j,n-j), and is given by a f n _i n _i) _ j-i + 1 ftn-i-j+A j-i+1 /2n-i-j\ (ieieie) a(nj,ni) 2 n-i-j+l V "-j / n-i+1 \ n-j / " l ' These numbers will play a key role in the ranking algorithm in the next chapter. We summarize the above discussion in the following lemma: Lattice Lemma : (1) a(i,j), the number of legal paths from (0,0) to (i,j), satisfies (**) (2) A closed-form expression for a(i,j) is given in (*). (3) The number of legal paths from (i,j) to (n,n) is given in (***). 28 2.6 ORDERED TREES WITH N EDGES We present now several statistical results about the class T of n ordered trees with n edges; the discussion follows [Dershowitz and Zaks, 1979] The orem 2. 1 : The number R (r) of trees in T with root of degree r is n n J ".w-S-ft-T 1 )- Proof : If a tree T has a root of degree r, then the corresponding i x lattice- path (see p. 18) starts with (0,0) -> (0,r) -* (l,r) . But the number of legal path from (l,r) to (n,n) is r-1+1 /2n-r-A _ r /2n-r-l\ n-1+1 ' \n-r ) " n \ n-1 / by the Lattice Lemma. O Corollary : The expected root degree of trees in T is —pr . (Note that this number is less than 3 for any n . ) Proof : This expected number is given by n n 2 /2n-r-l\ E r«R (r) 1 • E r »l n-1 ) r=0 n = n r=0 V 7 IT I C 1 n 1 n But z r 2 - ( 2 nl^" 1 )= -nT2*(n+l) ; this follows from the identity ) need the following lemma: Lemma 2.2 : The numbers g. (x,y,z) and g r (x,y,z) of ways to arrange x l's and y O's on a line and on a circle, respectively, with exactly z occurrences of the pattern 10, are given by 9 L (x,y,z) - (*}(*) and ^(x.y.z) - \^{t}) ■ Proof : Put the 10's first; the rest of the O's and l's must be put between these 10 's in only one way: an arbitrary number of O's followed by an arbitrary number of l's. The rest follows immediately using ordinary techniques. 
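The linear-arrangement count in lemma 2.2 can be checked by brute force for small parameters. This sketch assumes that the damaged formula for the line case reads $\binom{x}{z}\binom{y}{z}$, which is what the argument in the proof (place the z patterns 10 first, then distribute the remaining 0's and 1's) yields; the function name `count_linear` is hypothetical, and the circular case is not covered.

```python
from itertools import combinations
from math import comb


def count_linear(x, y, z):
    """Brute force: the number of 0/1 strings with x 1's and y 0's that
    contain exactly z occurrences of the pattern 10."""
    total = 0
    for one_positions in combinations(range(x + y), x):
        symbols = ["0"] * (x + y)
        for pos in one_positions:
            symbols[pos] = "1"
        if "".join(symbols).count("10") == z:
            total += 1
    return total


# Compare with binom(x, z) * binom(y, z) for all small parameter values.
mismatches = [(x, y, z)
              for x in range(6) for y in range(6) for z in range(6)
              if count_linear(x, y, z) != comb(x, z) * comb(y, z)]
```

An empty `mismatches` list for these ranges is consistent with the assumed formula, though it is of course not a proof.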
□

First proof of theorem 2.3: From the Cycle Lemma and lemma 2.1 it follows that L_n(k) is equal to the number of ways to arrange n+1 1's and n 0's on a circle, with exactly k occurrences of the pattern 10, hence L_n(k) = g_C(n+1, n, k), and the result follows. □

Second proof of theorem 2.3: We have to count the number of sequences with n 1's, n 0's and exactly k 10's, which also have the dominating property (p. 13). Without this property we have g_L(n, n, k) such sequences by lemma 2.2. Using lattice-path techniques it can be shown (see [Zaks and Dershowitz, 1979]) that the number of illegal paths is equal to the number of paths from (0,0) to (n+1, n-1) with k corners (see p. 19), and is therefore given by g_L(n+1, n-1, k); hence L_n(k) = g_L(n, n, k) - g_L(n+1, n-1, k), and the result follows. □

Corollary 1: The number of trees in T_n with k leaves is equal to the number of those with n+1-k leaves.

This simply means that L_n(k) = L_n(n+1-k). A direct proof for this corollary follows: (We use notations from section 2.3.) Let T ∈ B_n. If the tree t(T) in T_n has k leaves, then the corresponding 0,1 sequence x(T) has exactly k 10's and in T exactly k leaves are left leaves (lemma 2.1(1)). But if we traverse T in a right preorder (root-right-left), we get a sequence in X_n with n+1-k 10's (since now the right leaves will contribute the 10 patterns, and there are exactly n+1-k such leaves), which corresponds to a tree in T_n with n+1-k leaves. This correspondence between trees in T_n which have k leaves and those with n+1-k leaves is clearly one-to-one. □

Corollary 2: The average number of leaves in trees from T_n is $\frac{n+1}{2}$.

This follows immediately from corollary 1. A direct proof for this corollary follows: Following lemma 2.1(1), we count the number of corners in the lattice paths in L_n (which is equal to the number of leaves in the trees in T_n). Let h(i,j) denote the number of paths in L_n in which the lattice point (i,j) is a corner.
Clearly, $h(i,j)$ is the number of legal paths from $(0,0)$ to $(i,j-1)$ times the number of legal paths from $(i+1,j)$ to $(n,n)$. By the Lattice Lemma we have, therefore,

\[ h(i,j) = \left[\binom{i+j-1}{i} - \binom{i+j-1}{i-1}\right]\left[\binom{2n-i-j-1}{n-j} - \binom{2n-i-j-1}{n-j-1}\right] \]

for $j \ge i$. Therefore the total number of corners, $N$, is given by

\[ N = \frac{1}{n+1}\binom{2n}{n} + \sum_{i=1}^{n-1}\sum_{j=i}^{n} h(i,j) \]

(the first term stands for the number of corners on the first column $x=0$; each path in $L_n$ has one such corner). By partitioning $h(i,j)$ into four parts (by opening the brackets), and for each of these parts using the identity

\[ \sum_{0 \le k \le r}\binom{s+k}{n}\binom{r-k}{m} = \binom{r+s+1}{m+n+1}, \qquad \text{integer } n \ge s \ge 0,\ m \ge 0,\ r \ge 0 \]

(see [Knuth, 1968, p. 58]), it follows that $N = \frac{1}{2}\binom{2n}{n}$. Dividing this total number of corners by the total number of paths in $L_n$ (namely $\frac{1}{n+1}\binom{2n}{n}$) we get the average number $\frac{n+1}{2}$ of corners. □

We conclude with an interesting counting result, the discussion of which is the main part of [Dershowitz and Zaks, 1979]:

Theorem 2.4: The number $N_n(\ell,d)$ of nodes of degree $d$ which reside on level $\ell$ in trees in $T_n$ is

\[ N_n(\ell,d) = \frac{2\ell+d}{2n-d}\binom{2n-d}{n+\ell}. \]

2.7 BIBLIOGRAPHICAL NOTES

Many articles have been written about these trees, sequences, lattice paths and other related combinatorial structures, going back to Leonhard Euler; we deeply apologize for not mentioning all of them. An exhaustive list of 454 references has been compiled in [Gould, 1977]. An extensive discussion about ordered trees is found in [Knuth, 1968]. General references for the combinatorial topics are [Hall, 1967] and [Liu, 1968]. The book of integer sequences [Sloane, 1973] was used quite a lot in our study [Dershowitz and Zaks, 1979], part of which is discussed in section 2.6.

Two general techniques are used many times in studies of lattice paths and integer sequences; they are the reflection principle and the Cycle Lemma. For the reflection principle we gave two references in p. 27. The Cycle Lemma goes back to [Motzkin, 1948], which also discusses generalized ballot problems.
Discussions about this lemma and various generalizations are found in [Raney, 1960] and [Takács, 1967, chapter 1]. [Bergman, 1978], [Sands, 1978] and [Singmaster, 1978] are three recent interesting short articles with new proofs of the lemma.

In manipulating combinatorial summations we found [Knuth, 1968] very helpful (see p. 28 and p. 31, and also p. 63).

We mention a few more references that discuss trees, sequences and/or lattice paths: [Etherington, 1938], [Lyness, 1941], [Erdos and Kaplansky, 1946], [DeBruijn and Morselt, 1967], [Silberger, 1969], [Klarner, 1969 and 1970], [Read, 1972] and [Chorneyko and Mohanty, 1975]. The 'fun with lattice-paths' in [Grossman, 1946 and 1950] and the description in [Gardner, 1976] are very entertaining.

[Carlitz, 1969a] first gave the closed-form expression for $t(k,n)$, which was mentioned earlier (recursively, without a closed form) in [Riordan, 1957]. A few references for the ballot problem are [Whitworth, 1878], [Dvoretzky and Motzkin, 1947], [Mohanty and Narayana, 1961] and [Carlitz, 1969b].

CHAPTER 3
GENERATING TREES: PART I

3.1 INTRODUCTION

We explained previously why one would like to study generating, ranking and unranking algorithms for certain classes of trees (see sections 1.1 and 2.1). We discuss these algorithms in the next two chapters. In this chapter we use the correspondences between the trees in $B_n$, the integer sequences in $X_n$ and $Z_n$, and the lattice paths in $L_n$, all of which were presented in section 2.3. Using the sequences in $Z_n$, we develop an algorithm that generates the trees in $B_n$ in order, we show how to find the position of a given tree, and we show how to find a tree given its position. The linear ordering of trees used throughout this chapter is the $\alpha$-order.

We would like to note that these generating, ranking and unranking algorithms for sequences can be converted into algorithms that operate directly on the trees.
This would add only a few programming details to our discussion, and we preferred not to include them here. We also show (theorem 3.1) that our order of generating these trees (the $\alpha$-order) is the same order in which they are generated by two existing algorithms.

The generating algorithm is generalized to k-ary trees. The average number of comparisons in generating the sequences corresponding to the k-ary trees with $n$ internal nodes is shown to be less than two (independent of $n$, or even of $k$). Fixing $k$ and increasing $n$, this number approaches $\left(1 - \frac{(k-1)^{k-1}}{k^k}\right)^{-1}$, which is 4/3 for binary trees, and is smaller than 1.1 for $k > 4$.

Section 3.2 discusses the connection between our results and other known results on this subject. In section 3.3 the generating algorithm is discussed. Ranking and unranking algorithms are the subjects of sections 3.4 and 3.5, respectively. The extension of the generating algorithm from binary to k-ary trees can be found in section 3.6, and its analysis is the subject of section 3.7. Section 3.8 presents, as an example, the 14 trees in $B_4$, $B'_4$ and $T_4$, together with their various corresponding sequences.

3.2 RELATIONS TO EXISTING ALGORITHMS

Algorithms for generating binary trees and k-ary trees have been extensively studied in the past few years. One approach uses permutations (see pp. 20-21). In [Trojanowski, 1977a] the generating, ranking and unranking steps are discussed, using the $\alpha$-order (see theorem 3.1). [Knott, 1977] discusses the ranking and unranking algorithms using the $\alpha'$-order. This order is also used in [Trojanowski, 1977a] for ranking and unranking in the binary case. See also [Rotem and Varol, 1977] for a related work.

A second approach is to use level numbers (see pp. 20-21). Generating, ranking and unranking in the binary case are discussed in [Ruskey and Hu, 1977] and are extended to k-ary trees in [Ruskey, 1978], both using the $\alpha$-order for trees (see theorems 3.1 and 3.5).
[Colbourn, 1977] contains many references and a brief summary on the subject of graph generation.

Our approach for generating trees makes use of the integer sequences introduced before (section 2.3). More precisely, we use the $x$ and $z$ sequences for generating, ranking and unranking binary trees and, more generally, k-ary trees. Treating these classes of trees using these integer sequences seems to be the simplest and most intuitive approach to problems of the kind discussed here.

The following theorem establishes the relations mentioned above:

Theorem 3.1: Let $T, T' \in B_n$. The following are equivalent:
(1) $T <_\alpha T'$.
(2) $x(T) < x(T')$ (see p. 14).
(3) $z(T) > z(T')$ (see p. 16).
(4) $T < T'$ according to Ruskey-Hu's level sequence (see pp. 20-21).
(5) $T < T'$ according to Trojanowski's permutation (see pp. 20-21).

Proof: (1) $\Leftrightarrow$ (2) follows from the Ordering Lemma. (2) $\Leftrightarrow$ (3) follows immediately from the definitions of the $z$ sequences. (2) $\Leftrightarrow$ (4): $x < x'$ iff for some $\ell$, $1 \le \ell \le 2n$, $x_i = x'_i$ for $i = 1,2,\ldots,\ell-1$, and $x_\ell = 0 < 1 = x'_\ell$. Let us look at those parts of $T$ and $T'$ corresponding to the first $\ell-1$ nodes traversed in preorder. Those parts must be isomorphic, otherwise we could not have $(x_1,\ldots,x_{\ell-1}) = (x'_1,\ldots,x'_{\ell-1})$, by the equivalence (1) $\Leftrightarrow$ (2). Hence, […]

[…] for $j = 0$, […] $b(i,j+1) + b(i-1,j-1)$ otherwise. See figure 3.2, and compare it with the corresponding numbers $a(i,j)$ in p. 26.

  i \ j      0      1     2     3     4    5    6    7
    1        1
    2        2      1
    3        5      3     1
    4       14      9     4     1
    5       42     28    14     5     1
    6      132     90    48    20     6    1
    7      429    297   165    75    27    7    1
    8     1430   1001   572   275   110   35    8    1

Figure 3.2: The sequence $b(i,j)$

Let $p = INIT(z)$ denote the largest $i$ for which $z_i = i$ (note that we always have $z_1 = 1$, hence $INIT(z) \ge 1$); this means that the corresponding lattice path (see p. 18) starts with $p$ steps up. Let $\bar z = \{\bar z_i\}_{i=1}^{n-1}$ be the sequence obtained from $z$ by deleting $z_p$ and setting $\bar z_j \leftarrow z_{j+1} - 2$ for $j \ge p$ (and $\bar z_j \leftarrow z_j$
for $j < p$); this corresponds to 'cutting' the first corner in the corresponding lattice path and concatenating the two pieces. The following is the key to the ranking algorithm:

Theorem 3.3: $INDEX(z)$ can be computed recursively by

\[ INDEX(z) = \begin{cases} 1 & \text{if } p = n, \\ b(n,p) + INDEX(\bar z) & \text{if } p \ne n. \end{cases} \]

Proof: By step 1 of the algorithm GENERATE_BINARY it follows that $INDEX(z) = 1$ when $p = n$. For any other sequence $z$ we have

$INDEX(z)$ = [the number of sequences $z'$ with $INIT(z') \ge p+1$] + [the position of $z$ among all the sequences $z'$ with $INIT(z') = p$]

by the nature of the lexicographic ordering. We show that the first term in this sum is equal to $b(n,p)$ and that the second one is $INDEX(\bar z)$. That the first term is equal to $b(n,p)$ follows from the fact that this number is equal to the number of legal paths that start with more than $p$ steps up, which is precisely $b(n,p)$. As for the second term, all the sequences $z$ for which $INIT(z) = p$ correspond to 0,1 sequences $x = 1^p 0 y$. The sequence $\bar z$ as described above is exactly the one corresponding to $\bar x = 1^{p-1} y$; hence the position of $z$ among all the sequences $z'$ with $INIT(z') = p$ is exactly $INDEX(\bar z)$. (For the last part see also theorem 4.1.) □

Example: For the tree $T$ in figure 3.3 we have

x(T) = 1111000101100100
z(T) = 1 2 3 4 8 10 11 14

Figure 3.3: A tree in $B_8$

INDEX(1,2,3,4,8,10,11,14) = b(8,4) + INDEX(1,2,3,6,8,9,12)
INDEX(1,2,3,6,8,9,12) = b(7,3) + INDEX(1,2,4,6,7,10)
INDEX(1,2,4,6,7,10) = b(6,2) + INDEX(1,2,4,5,8)
INDEX(1,2,4,5,8) = b(5,2) + INDEX(1,2,3,6)
INDEX(1,2,3,6) = b(4,3) + INDEX(1,2,4)
INDEX(1,2,4) = b(3,2) + INDEX(1,2)
INDEX(1,2) = 1

so

INDEX(1,2,3,4,8,10,11,14) = b(8,4) + b(7,3) + b(6,2) + b(5,2) + b(4,3) + b(3,2) + 1
                          = 110 + 75 + 48 + 14 + 1 + 1 + 1 = 250.

Therefore $INDEX(T) = C_8 - 250 + 1 = 1181$.

In figure 3.4 the lattice path $\ell(T)$ is shown. Each lattice point is labeled with the number of legal paths from it to $(n,n)$ ($(8,8)$ in this case).
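The recursion of theorem 3.3 can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `b`, `init`, `index` and `catalan` are ours, and `b(n, p)` is evaluated through the closed form $\frac{p+2}{n+1}\binom{2n-p-1}{n-p-1}$, which is our own rewriting of (***) for legal paths from $(0,p+1)$ to $(n,n)$ rather than a formula quoted from the text. It reproduces the values 110, 75, 48, 14, 1, 1 used in the example.

```python
from math import comb

def b(n, p):
    """Number of legal paths that start with more than p steps up,
    i.e. legal paths from (0, p+1) to (n, n).
    Assumed closed form, derived from (***) with (i, j) = (0, p+1)."""
    return (p + 2) * comb(2 * n - p - 1, n - p - 1) // (n + 1)

def init(z):
    """INIT(z): the largest p such that z_1, ..., z_p equal 1, ..., p."""
    p = 0
    while p < len(z) and z[p] == p + 1:
        p += 1
    return p

def index(z):
    """Position of z in the lexicographic ordering of Z_n (theorem 3.3)."""
    z = list(z)
    total = 0
    while True:
        n, p = len(z), init(z)
        if p == n:
            return total + 1
        total += b(n, p)
        # z-bar: delete z_p and subtract 2 from every later entry.
        z = z[:p - 1] + [v - 2 for v in z[p:]]

def catalan(n):
    """C_n, the number of trees in B_n."""
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    z = (1, 2, 3, 4, 8, 10, 11, 14)
    rank_z = index(z)
    print(rank_z)                       # position of z in Z_8
    print(catalan(8) - rank_z + 1)      # position of the tree T in B_8
```

The loop form replaces the recursion of the theorem one step at a time; each pass removes the first corner of the lattice path, exactly as in the worked example.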
Note: The numbers used in computing $INDEX(z)$ are exactly those directly above the corresponding path, plus an additional 1 (to make the counting start at 1 rather than 0). In this example they are the numbers 110, 75, 48, 14, 1 and 1. This observation is due to D. Richards and has been further extended in [Zaks and Richards, 1977] (see discussion in the next chapter, sections 4.4 and 4.5).

[Figure 3.4: An example for the ranking function]

3.5 THE UNRANKING ALGORITHM

Given a number $i$, $1 \le i \le C_n$, we show in this section how to find the $i$-th tree in $B_n$. As explained, it suffices to do this for the sequences in $Z_n$ (corollary, p. 37). The algorithm that we present follows immediately from the way we calculated $INDEX(z)$ using theorem 3.3, so we skip its proof here. We first illustrate the algorithm by the tree from the previous example.

Suppose we want to find the 1181-st tree $T$ in $B_8$. By the previous discussion, we have to find the sequence $z$ in $Z_8$ such that $INDEX(z) = C_8 - INDEX(T) + 1 = 250$ (in other words, we have to find the 250-th sequence in $Z_8$). By theorem 3.3 we know that $250 = b(8,p) + INDEX(\bar z)$. We choose $p$ such that $b(8,p) < 250 \le b(8,p-1)$; here $p=4$, and we get $250 = 110 + INDEX(\bar z)$, or $INDEX(\bar z) = 140$, where $\bar z$ is a sequence in $Z_7$. Next we get $140 = b(7,3) + INDEX(\bar{\bar z})$, or $INDEX(\bar{\bar z}) = 65$, and so on. At the end we have

250 = b(8,4) + b(7,3) + b(6,2) + b(5,2) + b(4,3) + b(3,2) + b(2,1).

From this decomposition of 250 we reconstruct the sequence $z$: The last 1 tells us that at the very end (of computing $INDEX(z)$) we had {1,2}. Then $b(3,2)$ tells us that a step before we had the sequence {1,2,y}, and that after omitting the 2 and setting $y \leftarrow y-2$ we got {1,2}. This means that we had {1,2,4}.
Continuing in this manner, we get:

b(4,3):  {1,2,3,6}
b(5,2):  {1,2,4,5,8}
b(6,2):  {1,2,4,6,7,10}
b(7,3):  {1,2,3,6,8,9,12}
b(8,4):  {1,2,3,4,8,10,11,14} = z.

Using the geometric interpretation shown above (pp. 43-44), it should be clear intuitively how the unranking algorithm works; tracing the corresponding lattice points, following this unranking algorithm, will make this point clearer. We now turn to a formal description of this algorithm. The following algorithm converts a number $i$ to the $i$-th sequence in $Z_n$, given $i$ and $n$:

Algorithm UNRANK_BINARY
1. $A \leftarrow i$; $j \leftarrow n$;
2. find $m_j$ for which $b(j,m_j) < A \le b(j,m_j-1)$;
3. $A \leftarrow A - b(j,m_j)$; $j \leftarrow j-1$; if $A > 0$ then goto 2;
   (we now have the decomposition $i = \sum_{j=j_0}^{n} b(j,m_j)$)
4. $z \leftarrow \{1,2,\ldots,j_0\}$; […] $s \leftarrow m_q$;
5. (changing $z$) ($z_i$ unchanged for $i < s$) $z_{i+1} \leftarrow z_i + 2$ for $i = q, q-1, \ldots, s$; $z_s \leftarrow s$;
6. $q \leftarrow q+1$; if $q \le n$ then $s \leftarrow m_q$ and goto 5.

3.6 GENERATING K-ARY TREES

We now generalize the generating algorithm to the class $T(k,n)$ of regular k-ary trees with $n$ internal nodes. The proofs in this section are omitted, being similar to the corresponding previous ones.

Given a tree $T \in T(k,n)$, we define the sequences $x(T)$ and $z(T)$ in a manner similar to the binary case; figure 3.5 illustrates these sequences. A sequence related to $x(T)$ is associated with a planted planar tree in [Klarner, 1970].

x(T) = 11000001000100100000
z(T) = 1 2 8 12 15

Figure 3.5: Labeling a k-ary tree

Let $a = \{a_i\}_{i=1}^{kn}$ be a sequence of $n$ 1's and $(k-1)n$ 0's; it has the k-dominating property if in each prefix $\{a_i\}_{i=1}^{\ell}$ the number of 1's is not smaller than $\ell/k$. In other words, if this prefix contains $s$ 1's, then it contains at most $(k-1)s$ 0's. Note that for $k=2$ we get the dominating property used before. The following theorem makes the desired connection between k-ary trees and the corresponding sequences and lattice paths:

Theorem 3.4: The following sets are in one-to-one correspondence with one another:
1.
$T(k,n)$, the regular k-ary trees with $n$ internal nodes,
2. $X(k,n)$, the 0,1 sequences with $n$ 1's and $(k-1)n$ 0's that have the k-dominating property,
3. $Z(k,n)$, the integer sequences $\{z_i\}_{i=1}^{n}$ such that $0 < z_1 < z_2 < \cdots < z_n$ and $z_i \le ki-k+1$ for each $i$, and
4. $L(k,n)$, the lattice paths from $(0,0)$ to $((k-1)n,n)$ that do not pass below the diagonal $(k-1)y = x$.

The following generalization of theorem 3.1 is the basis of our generating algorithm. The proof follows immediately from that of theorem 3.1. Note that in [Trojanowski, 1977a and 1977b] the permutation associated with a tree is not immediately extended from the binary case.

Theorem 3.5: Let $T, T' \in T(k,n)$. The following are equivalent:
(1) $T <_\alpha T'$
(2) $x(T) < x(T')$ (see p. 47)
(3) $z(T) > z(T')$ (see p. 47)
(4) $T < T'$ according to Ruskey-Hu's level sequence (see pp. 20-21)

Corollary: In order to generate the trees in $T(k,n)$ in order, it suffices to generate the sequences in $X(k,n)$ or $Z(k,n)$ in order.

Notes:
1. It should be clear how to convert a sequence to the corresponding tree.
2. Theorem 3.2 applies also here, with 100 and 10 replaced with $10^k$ and $10^{k-1}$, respectively.

Following the above discussion, we get:

Algorithm GENERATE_K-ARY (generating $Z(k,n)$ lexicographically, given $k$ and $n$)
Exactly like algorithm GENERATE_BINARY (p. 38) with one slight modification: in step 2 replace the upper bound '$2j-1$' with '$kj-k+1$'.

For example, the 12 sequences in $Z(3,3)$, corresponding to the regular ternary trees with 3 internal nodes, are generated by this algorithm as follows (we underlined the rightmost $z_j$ discussed before; $z_0 = 0$ is omitted):

Position   Sequence
    1        123
    2        124
    3        125
    4        126
    5        127
    6        134
    7        135
    8        136
    9        137
   10        145
   11        146
   12        147

3.7 ANALYSIS OF THE GENERATING ALGORITHM

We now study the complexity of algorithm GENERATE_K-ARY. It is clear that the work done by the algorithm is proportional to the number of comparisons made in its step 2 (namely, checking whether $z_j$
$< kj-k+1$ for the current $j$); therefore we now study the number of these comparisons.

After we have generated the last sequence (which is $z = 1, 1+k, 1+2k, \ldots, 1+(n-1)k$) in step 3, we come back to step 2, in which we have to scan this sequence from its right end to its very left; and then, having found no $j > 0$ for which $z_j < kj-k+1$, we stop; therefore, in its worst case, the algorithm makes $n$ comparisons for one sequence. As for the average case, we show that the average number of comparisons per sequence is less than two, independent of $n$ or even of $k$.

Let $p_j$ denote the number of sequences in which $j$ comparisons are made by the algorithm. The expected number of comparisons is then given by

\[ \frac{\sum_{j \ge 1} j\,p_j}{t(k,n)}. \qquad (*) \]

Lemma 3.1: $p_1 = t(k,n) - t(k,n-1)$.

Note: For $k = 2$ this is simplified to $\frac{3}{n+1}\binom{2n-2}{n}$, by $t(2,n) = b_n = \frac{1}{n+1}\binom{2n}{n}$.

Proof: $p_1$ counts those sequences $z = \{z_i\}_{i=1}^{n}$ in which $z_n < kn-(k-1)$. This means that the corresponding $x$ sequence ends with $0^k$, or that the corresponding lattice path passes through $C((k-1)n-k,\,n)$. Denote the number of paths from $A(0,0)$ to a point $U$, not going beyond the line $AB$, by $P(U)$. Then we have (see figure 3.6):

\[ P(C) = P(D((k-1)(n-1),\,n)) - P(E((k-1)(n-1),\,n-1)) = P(B) - P(E) = t(k,n) - t(k,n-1). \]

□

[Figure 3.6: The path corresponding to the tree in figure 3.5, with the points $A(0,0)$, $B((k-1)n,\,n)$, $C$, $D$ and $E$]

Denote $\bar k = \frac{(k-1)^{k-1}}{k^k}$.

Lemma 3.2: $\frac{p_1}{t(k,n)} \to 1 - \bar k$ as $n \to \infty$, for any $k \ge 2$.

Note: This limit is 3/4 for $k = 2$.

Proof:

\[ \frac{t(k,n-1)}{t(k,n)} = \frac{\frac{1}{(k-1)(n-1)+1}\binom{k(n-1)}{n-1}}{\frac{1}{(k-1)n+1}\binom{kn}{n}} = \frac{(k-1)n+1}{(k-1)n-(k-2)}\cdot\frac{(k-1)n}{kn}\cdot\frac{(k-1)n-1}{kn-1}\cdots\frac{(k-1)n-(k-2)}{kn-(k-2)}\cdot\frac{n}{kn-(k-1)} \to \frac{(k-1)^{k-1}}{k^k} = \bar k \quad \text{as } n \to \infty. \]

Hence, by lemma 3.1, $\frac{p_1}{t(k,n)} \to 1 - \bar k$ as $n \to \infty$. □

This means that in most cases only one comparison is made; for example, this happens in about 75% of the cases for $k = 2$, 85% of the cases for $k = 3$ and 92% of the cases for $k = 5$.

Lemma 3.3: $p_j = t(k,n-j+1) - t(k,n-j)$ for $1 \le j \le n-1$, and $p_n = 1$, for $k \ge 2$.

Note: For $k = 2$, $p_j$
is simplified to $\frac{3}{n-j+2}\binom{2n-2j}{n-j+1}$.

Proof: Similar to that of lemma 3.1. □

Lemma 3.4: $\frac{p_j}{t(k,n)} \to (1-\bar k)\,\bar k^{\,j-1}$ as $n \to \infty$.

Note: This limit is $3/4^j$ for $k = 2$.

Proof: By lemma 3.2 this holds for $j = 1$. Assume it holds for $j < \ell$; we have by the previous lemma:

$p_\ell = t(k,n-\ell+1) - t(k,n-\ell)$
$p_{\ell-1} = t(k,n-\ell+2) - t(k,n-\ell+1)$

Hence

\[ \frac{p_\ell}{t(k,n)} = \frac{t(k,n-\ell+1) - t(k,n-\ell)}{t(k,n)} = \frac{t(k,(n-1)-\ell+2) - t(k,(n-1)-\ell+1)}{t(k,n-1)} \cdot \frac{t(k,n-1)}{t(k,n)}. \]

But the first term approaches $(1-\bar k)\,\bar k^{\,\ell-2}$ by the induction hypothesis, and the second term approaches $\bar k$ (see proof of lemma 3.2); hence $\frac{p_\ell}{t(k,n)} \to (1-\bar k)\,\bar k^{\,\ell-1}$, and the lemma is thus proved. □

We now return to (*):

Theorem 3.6: Algorithm GENERATE_K-ARY requires $\frac{1}{1-\bar k}$ comparisons per sequence, for $n \to \infty$.

Note: This limit is 4/3 for $k = 2$.

Proof: Following the previous discussion, we get

\[ \frac{\sum_{j=1}^{n} j\,p_j}{t(k,n)} = \frac{\sum_{j=1}^{n-1} j\,p_j}{t(k,n)} + \frac{n}{t(k,n)} = \frac{\sum_{j=1}^{n-1} p_j}{t(k,n)} + \frac{\sum_{j=2}^{n-1} p_j}{t(k,n)} + \cdots + \frac{n}{t(k,n)} = \sum_{a=1}^{n-1}\sum_{j=a}^{n-1}\frac{p_j}{t(k,n)} + \frac{n}{t(k,n)} \]
\[ \to \sum_{a=1}^{\infty}\sum_{j=a}^{\infty}(1-\bar k)\,\bar k^{\,j-1} = (1-\bar k)\sum_{a=1}^{\infty}\frac{\bar k^{\,a-1}}{1-\bar k} = \sum_{a=1}^{\infty}\bar k^{\,a-1} = \frac{1}{1-\bar k}. \]

Note that $\bar k = \frac{1}{k-1}\left(1-\frac{1}{k}\right)^{k} < \frac{1}{(k-1)e}$; hence we can upper-bound $(1-\bar k)^{-1}$ by $\left(1-\frac{1}{(k-1)e}\right)^{-1}$. □

As an illustration of our analysis, consider the case where $k = 2$ and $n = 4$. The algorithm generates the sequences 1234, 1235, 1236, 1237, 1245, 1246, 1247, 1256, 1257, 1345, 1346, 1347, 1356, 1357. In nine of these sequences 1 comparison is made when forming the next sequence, three of them require 2 comparisons, one requires 3 (namely, 1257), and the last one (1357) requires 4 comparisons. Hence, the algorithm uses on the average $22/14 \approx 1.571$ comparisons per sequence.
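The comparison counts in this illustration can be reproduced with a short sketch. The successor step below is an assumption inferred from the description in this section (scan from the right for the first $j$ with $z_j < kj-k+1$, increment it, and fill the tail with consecutive integers); the original statement of GENERATE_BINARY is not reproduced here, and the names `generate_kary` and `average_comparisons` are ours.

```python
def generate_kary(k, n):
    """Yield (z, comparisons) for every z in Z(k, n), in lexicographic order.
    'comparisons' is the number of tests z_j < k*j - k + 1 made while
    scanning z from the right to form its successor (assumed step 2)."""
    z = list(range(1, n + 1))            # first sequence 1, 2, ..., n
    while True:
        j, comparisons = n, 0
        while j >= 1:
            comparisons += 1
            if z[j - 1] < k * j - k + 1:  # z_j can still be incremented
                break
            j -= 1
        yield tuple(z), comparisons
        if j == 0:                        # no entry can grow: last sequence
            return
        z[j - 1] += 1
        for t in range(j, n):             # smallest possible tail
            z[t] = z[t - 1] + 1

def average_comparisons(k, n):
    """Average number of comparisons per generated sequence."""
    counts = [c for _, c in generate_kary(k, n)]
    return sum(counts) / len(counts)

if __name__ == "__main__":
    for z, c in generate_kary(2, 4):
        print(''.join(map(str, z)), c)
    print(average_comparisons(2, 4))
```

For $k=2$, $n=4$ the sketch makes 22 comparisons over 14 sequences, matching the count above; larger parameters can be tried with `average_comparisons(k, n)`.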
More numerical results are shown in the following table (the column for "$n \to \infty$" is the limiting value $\frac{1}{1-\bar k}$):

  k \ n      3       4       5       6       7     n → ∞
    2      1.600   1.571   1.524   1.485   1.457   1.333
    3      1.333   1.291   1.260   1.241   1.229   1.174
    4      1.227   1.193   1.172   1.160   1.153   1.118
    5      1.171   1.144   1.129   1.120   1.115   1.089

Figure 3.7: Average number of comparisons made by algorithm GENERATE_K-ARY

As was pointed out to us by [Paterson, 1978], the average case analysis can also be argued as follows: it is clear that $z_r$ is changed $t(k,r)$ times, hence the total number of changes is $\sum_{r=1}^{n} t(k,r)$. Using Stirling's formula for $n!$ we get

\[ t(k,n) \sim \frac{\bar k^{-n}}{(k-1)\,n\,\sqrt{2\pi n(1-1/k)}} \quad \text{as } n \to \infty, \]

from which we get

\[ \Big\{\sum_{r=1}^{n} t(k,r)\Big\}\Big/ t(k,n) \to 1 + \bar k + \bar k^2 + \cdots = \frac{1}{1-\bar k}. \]

3.8 AN EXAMPLE

We conclude this chapter by showing the 14 regular binary trees in $B_4$ with the corresponding trees and sequences. For each tree $T$ in $B_4$ we show the following:
1. Its position with respect to the $\alpha$-order (p. 22).
2. Its position with respect to the $\alpha'$-order (p. 22).
3. Its position with respect to the $\beta$-order (p. 24).
4. The corresponding tree $T'$ in $B'_4$ (p. 15).
5. The corresponding tree $t(T)$ in $T_4$ (p. 16).
6. The corresponding 0,1 sequence $x(T)$ in $X_4$ (p. 15).
7. The corresponding sequence $z(T)$ in $Z_4$ (p. 16).
8. The corresponding sequence $s(T)$ (p. 20).
9. The corresponding level sequence of Ruskey and Hu (p. 21).
10. The corresponding permutation (p. 21).

The example is shown in the following three pages.
CHAPTER 4
GENERATING TREES: PART II

4.1 INTRODUCTION

In this chapter we extend the results from chapter 3. In the first half of the chapter we study generating, ranking and unranking algorithms for k-ary trees in $\beta$-order using the $x$ and $z$ sequences. In the second half we generalize the discussion of chapter 3 about binary trees to the classes of trees that have $n_i$ nodes of degree $k_i$, $i = 1,2,\ldots,t$ for some $t$. The first half of the chapter is treated in sections 4.2 (generating) and 4.3 (ranking and unranking), and the second half can be found in sections 4.4 (generating) and 4.5 (ranking and unranking).

4.2 K-ARY TREES IN β-ORDER: GENERATION

We use the sequences $x(T)$ and $z(T)$ corresponding to a regular k-ary tree $T$ with $n$ internal nodes (see p.
47); recall that, according to our notations, $T \in T(k,n)$, $x(T) \in X(k,n)$ and $z(T) \in Z(k,n)$. We show how to generate the sequences in $Z(k,n)$ in order according to the Hebrew lexicographic ordering; this corresponds to generating the sequences in $X(k,n)$ according to the same ordering, and to generating the trees in $T(k,n)$ in $\beta$-order (see p. 24).

It is clear how to generate the sequences in $Z(k,n)$ in the Hebrew lexicographic ordering; the appropriate modification of the algorithm GENERATE_K-ARY from the previous chapter is not discussed here. See [Zaks, 1977b] for more details of this algorithm. Note that the reverse $z^R$ of a sequence $z$ in $Z(k,n)$ has the property that $z^R_1 > z^R_2 > \cdots > z^R_n > 0$ and $z^R_{n-i+1} \le ki-k+1$ for each $i$, and that instead of generating the sequences in $Z(k,n)$ Hebrew-lexicographically one can generate these reverses of the sequences in $Z(k,n)$ in the English lexicographic ordering.

For example, the 12 sequences in $Z(3,3)$, corresponding to the ternary trees with 3 internal nodes, $T(3,3)$, are generated by the algorithm as follows (we underlined the leftmost $z_j$ that is changed):

Position   Sequence
    1        123
    2        124
    3        134
    4        125
    5        135
    6        145
    7        126
    8        136
    9        146
   10        127
   11        137
   12        147

Note that, according to this generating algorithm, in order to find the sequence next to a given one we scan it from left to right until we find the first (leftmost) entry that can be incremented, and after incrementing it we append to its left the first possible subsequence. Compare this with the discussion in section 3.3 (p. 38), and also compare the above example with the one in section 3.6.

4.3 K-ARY TREES IN β-ORDER: RANKING AND UNRANKING

We now find the position $INDEX(x)$ of a sequence $x \in X(k,n)$ in the Hebrew-lexicographic ordering of $X(k,n)$. The modification to $Z(k,n)$ is left to the reader.

It was shown before that in a sequence $x \in X(k,n)$ erasing $10^{k-1}$ patterns, as long as possible, results in the empty sequence.
It is this property of $X(k,n)$ that enables us to compute the ranking function in a way similar to what we did for binary trees in chapter 3. An extension from a different point of view is used later, in sections 4.4 and 4.5, for a more general case. First we observe the following:

Theorem 4.1: There is a one-to-one correspondence between all the sequences in $X(k,n)$ that end with $10^{p+k-1}$ and those in $X(k,n-1)$ that end with $0^p$, for each possible $p$.

Proof: Suppose there are $u$ sequences in $X(k,n)$ that end with $10^{p+k-1}$; call them $A_i = Y_i 1 0^{p+k-1}$, $i = 1,2,\ldots,u$. By omitting the last occurrence of $10^{k-1}$ from each $A_i$ we get the sequences $B_i = Y_i 0^p$, $i = 1,2,\ldots,u$, where each of the $B_i$'s is in $X(k,n-1)$ and ends with $0^p$ (see the discussion preceding the theorem above). This correspondence is clearly one-to-one, which completes the proof. □

Note that the argument in the last proof generalizes the discussion at the end of the proof of theorem 3.3. Now, it is evident that the correspondence mentioned above preserves the lexicographic ordering; namely, $A_i < A_j$ if and only if $B_i < B_j$. This fact will be used in the next theorem, by which we can recursively compute the ranking function:

Theorem 4.2: Let $x = y10^p \in X(k,n)$. Then

\[ INDEX(x) = \begin{cases} 1 & \text{if } x = 1^n 0^{(k-1)n}, \\ a((k-1)n-p-1,\,n,\,k) + INDEX(y0^{p-k+1}) & \text{otherwise,} \end{cases} \]

where $a((k-1)n-p-1,\,n,\,k)$ is the number of sequences in $X(k,n)$ that end with $0^{p+1}$.

Proof: Immediate, by the definition of the Hebrew-lexicographic ordering and theorem 4.1. □

We now evaluate the numbers $a(i,j,k)$ as defined in theorem 4.2.
The lattice paths of $L(k,n)$ were introduced in theorem 3.4; using these correspondences between trees, sequences and paths in $T(k,n)$, $X(k,n)$ and $L(k,n)$, respectively, it follows that $a(i,j,k)$ is equal to the number of lattice paths from $(0,0)$ to $(i,j)$ that do not pass below $x = (k-1)y$, and is therefore given by the recursive definition as follows:

\[ a(i,j,k) = \begin{cases} 1 & i = 0, \\ 0 & i > (k-1)j, \\ a(i,j-1,k) + a(i-1,j,k) & \text{otherwise.} \end{cases} \qquad (*) \]

An example is shown in figure 4.1, where the lattice point $(i,j)$ is labeled with the number $a(i,j,k)$, and we show these numbers for $k=3$ and $j<4$.

[Figure 4.1: $a(i,j,k)$ for $k=3$ and $j<4$]

The solution to this recurrence relation (*) follows:

Theorem 4.3: The solution to the recurrence relation (*) for the $a(i,j,k)$'s, where $i,j \ge 0$ and $k \ge 2$, is given by

\[ a(i,j,k) = \binom{i+j-1}{i} - \sum_{t=1}^{\lfloor\frac{i-1}{k-1}\rfloor}\binom{i+j-1-kt}{j-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t}, \]

where $\sum_{t=1}^{s}$, for $s < 1$, is taken to be 0.

Proof: We prove by induction on $i$ and $j$. For $i = 0$ and any $j$, we have $a(0,j,k) = 1$. We assume the formula holds for $i-1$, and prove it for $i$, as follows: Let $i = (k-1)x + y$, $1 \le y \le k-1$. For $j \le x$, we get

\[ i = (k-1)x + y \ge (k-1)j + y > (k-1)j \]

and in this case the formula must give us the boundary condition 0, and it really does: as $i = (k-1)x + y$ and $1 \le y \le k-1$, we have $\lfloor\frac{i-1}{k-1}\rfloor = x$. So we get

\[ \sum_{t=1}^{x}\binom{i+j-1-kt}{j-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t} = \sum_{t=1}^{x}\binom{1+kt}{t}\binom{[i-(k-1)j-1]+k(j-t)}{j-t}\,\frac{1}{1+kt} = \binom{i+j}{j} - \binom{i+j-1}{j}. \]

The last step is done using the identity

\[ \sum_{k \ge 0}\binom{r-tk}{k}\binom{s-t(n-k)}{n-k}\,\frac{r}{r-tk} = \binom{r+s-tn}{n} \]

for integer $n$, $r \ne tk$, and $0 \le k \le n$ (see [Knuth, 1968, p. 58]); setting $k \leftarrow t$, $r \leftarrow 1$, $t \leftarrow -k$, $s \leftarrow i-(k-1)j-1$ and $n \leftarrow j$ we get the first term $\binom{i+j}{j}$ minus the term $\binom{i+j-1}{j}$ (corresponding to $t=0$). So the formula gives

\[ a(i,j,k) = \binom{i+j-1}{i} + \binom{i+j-1}{j} - \binom{i+j}{j} = 0, \]

as desired.
Assuming it holds for $j-1$, we continue as follows: if $j > x$, we have $i = (k-1)x + y < (k-1)j + y$, or $i \le (k-1)j$, and we show that $a(i,j,k)$ satisfies $a(i,j,k) = a(i,j-1,k) + a(i-1,j,k)$. By the induction hypothesis we have

\[ a(i,j-1,k) = \binom{i+j-2}{i} - \sum_{t=1}^{\lfloor\frac{i-1}{k-1}\rfloor}\binom{i+j-2-kt}{j-1-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t} \]

and

\[ a(i-1,j,k) = \binom{i+j-2}{i-1} - \sum_{t=1}^{\lfloor\frac{i-2}{k-1}\rfloor}\binom{i+j-2-kt}{j-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t}. \]

Therefore,

\[ a(i,j-1,k) + a(i-1,j,k) = \binom{i+j-1}{i} - \sum_{t=1}^{\lfloor\frac{i-1}{k-1}\rfloor}(\ldots) - \sum_{t=1}^{\lfloor\frac{i-2}{k-1}\rfloor}(\ldots). \]

It remains to show that

\[ \sum_{t=1}^{\lfloor\frac{i-1}{k-1}\rfloor}\binom{i+j-2-kt}{j-1-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t} + \sum_{t=1}^{\lfloor\frac{i-2}{k-1}\rfloor}\binom{i+j-2-kt}{j-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t} = \sum_{t=1}^{\lfloor\frac{i-1}{k-1}\rfloor}\binom{i+j-1-kt}{j-t}\,\frac{1}{(k-1)t+1}\binom{kt}{t}. \qquad (**) \]

If $y > 1$ then $\lfloor\frac{i-1}{k-1}\rfloor = \lfloor\frac{i-2}{k-1}\rfloor = x$, and all the summations range over $t = 1,\ldots,x$; hence (**) is correct. If $y = 1$, we have $\lfloor\frac{i-1}{k-1}\rfloor = x$ and $\lfloor\frac{i-2}{k-1}\rfloor = x-1$, but then the term corresponding to $t = x$ in the second summation on the left side of (**) is $\binom{j-x-1}{j-x}\frac{1}{(k-1)x+1}\binom{kx}{x}$; but $j > x$, so this is 0. The proof is thus completed. □

As for the unranking algorithm, it can be found in a 'reverse' interpretation of theorem 4.2 (in a way similar to the one by which the unranking algorithm of section 3.5 was built from the ranking algorithm of section 3.4). More details can be found in [Zaks, 1977b].

As an example we take the 8-th sequence $z = 136$ from the example in p. 60 (the 12 sequences in $Z(3,3)$). This sequence corresponds to the 0,1 sequence $x = 101001000$ and to the lattice path shown in figure 4.1 (p. 62). When applying the ranking algorithm to the sequence $x$, we get

INDEX(x) = INDEX(101001000)
         = a(2,3,3) + INDEX(101000)
         = a(2,3,3) + a(0,2,3) + INDEX(100)
         = a(2,3,3) + a(0,2,3) + 1
         = 6 + 1 + 1 = 8.

Note that the numbers used in computing the ranking function are exactly to the left of the corresponding lattice path (see figure 4.1), plus an additional 1; compare this with the similar discussion in the note at the end of section 3.4 (p. 43).

As for the unranking algorithm, we are looking for the 8-th sequence in $X(3,3)$.
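Both directions of this example can be checked numerically. The sketch below ranks with theorem 4.2, using the recurrence (*) for $a(i,j,k)$, and, purely for comparison, finds the 8-th sequence by listing $X(3,3)$ in Hebrew-lexicographic order. The enumeration is a brute-force stand-in of our own, not the unranking algorithm of [Zaks, 1977b]; the names `a`, `index` and `hebrew_sorted` are hypothetical.

```python
from functools import lru_cache
from itertools import combinations

@lru_cache(maxsize=None)
def a(i, j, k):
    """Recurrence (*): lattice paths from (0,0) to (i,j) that do not
    pass below the line x = (k-1)y."""
    if i == 0:
        return 1
    if i > (k - 1) * j:
        return 0
    return a(i, j - 1, k) + a(i - 1, j, k)

def index(x, k):
    """Hebrew-lexicographic position of the 0/1 string x in X(k,n),
    following theorem 4.2 with x = y 1 0^p."""
    n = x.count('1')
    if x == '1' * n + '0' * ((k - 1) * n):
        return 1
    y, _, zeros = x.rpartition('1')     # split at the last 1
    p = len(zeros)
    return a((k - 1) * n - p - 1, n, k) + index(y + '0' * (p - k + 1), k)

def hebrew_sorted(k, n):
    """All of X(k,n), built from Z(k,n) and sorted by comparing the
    strings from right to left (brute force, for checking only)."""
    out = []
    for z in combinations(range(1, k * n + 1), n):
        if all(z[i] <= k * i + 1 for i in range(n)):   # z_i <= ki-k+1, 1-based
            out.append(''.join('1' if q in z else '0'
                               for q in range(1, k * n + 1)))
    return sorted(out, key=lambda s: s[::-1])

if __name__ == "__main__":
    print(a(2, 3, 3), a(0, 2, 3))       # the two labels used in the example
    print(index('101001000', 3))        # rank of x
    print(hebrew_sorted(3, 3)[8 - 1])   # the 8-th sequence of X(3,3)
```

The enumeration grows quickly with $kn$; it is meant only to confirm that the recursive rank agrees with the ordering on small instances.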
In [Zaks, 1977b] this example is discussed based on the unranking algorithm studied there. Here we present a geometric approach to this algorithm. We start at the point $(6,3)$ (see figure 4.1) and go to the left until we see the first label which is smaller than $s=8$; in this case it is $a(2,3,3) = 6$, which labels the point $(2,3)$. Hence we make a turn at the point $(3,3)$ and go down one step, and set $s \leftarrow s - a(2,3,3) = 2$. This procedure is now repeated until we reach the origin $(0,0)$: we go to the left until we meet the first label which is smaller than $s=2$; it is $a(0,2,3) = 1$, so we make a turn at $(1,2)$, go down one step and set $s \leftarrow s - a(0,2,3) = 1$; the path is now completed through $(0,1)$ to $(0,0)$.

4.4 GENERAL CLASSES OF TREES: GENERATION

We now generalize the discussion in the previous chapter for binary trees to the classes $T(K,N)$ of trees that have $n_i$ nodes of degree $k_i$, $i = 1,2,\ldots,t$ for some $t$, and $n_0+1$ leaves, where $n_0 = \sum_{i=1}^{t}(k_i-1)\,n_i$. We denote $K = (k_0, k_1, \ldots, k_t)$ and $N = (n_0, n_1, \ldots, n_t)$ and assume throughout this discussion that $k_t > k_{t-1} > \cdots > k_0 = 0$. For $t=1$ we get the classes $T(k,n)$ of regular k-ary trees with $n$ internal nodes. The trees are generated according to the $\alpha$-order (section 2.4).

Define $A(K,N)$ to be the set of integer sequences $a = \{a_i\}$, $i = 1,\ldots,\sum k_i n_i$, that have $n_i$ occurrences of the integer $k_i$ and $n_0$ 0's (the $k_i$'s and the $n_i$'s are as defined above) and have the extended dominating property. A sequence $a$ is said to have the extended dominating property if in each prefix the number of 0's is not larger than

\[ \sum_{i=1}^{t}(k_i-1)\cdot(\text{the number of } k_i\text{'s in the prefix}). \]

The following theorem was proved in [Chorneyko and Mohanty, 1975] (using the reverses of our sequences):

Theorem 4.4: There is a one-to-one correspondence between the trees in $T(K,N)$ and the sequences in $A(K,N)$.

Proof: To see this, associate with a tree $T$ in $T(K,N)$ the sequence $a(T)$ of degree numbers (see p. 12).
Since a(T) is built in a preorder traversal of the tree T, it follows that the father's degree appears in a(T) before his sons'; also, a tree with x. internal nodes of degree y. has E(y.-1)»x. + 1 i leaves. From these facts we deduce that a(T) has the extended dominating property. Based on these arguments it can also be shown that this correspondence is one-to-one. O Note that the sequences in A(K,N) are also in A (see p. 13) for n = zk.«n. ; in other words, for each sequence a in A(K,N) we have I a. >_ i for each 1 . 67 It was proved in the Ordering Lemma (p. 23) that T < T' if and only a if a(T) _0. To get the next sequence we replace a. by a , the smallest positive element in a, and append to its right the first possible subsequence; this subsequence must contain m' occurrences of k for s > 0, where m i _ m if s >. 0, s f j,p < m. + 1 if s = j ni - 1 if s = p, L P The first such subsequence is clearly , k-,-1 mi k 9 -l mi k.-l m! o d (k 1 o ' ) ] (k 2 o l ) 2 ••• (k t t ) t where d = mi - E m'*(k -1); U i s s S>J This is discussed in somewhat more detail in [Zaks and Richards, 1977]. It is clear that the first sequence in A(K,N) is k,-l n, k 9 -l n k.-l n. (^0 ] ) ' (k 2 2 ) 2 ... (k t * ) t and that the last one is n t n t-l n l Z i ( k r 1 ) ,n i kk kf) K t K t-1 K l U The following algorithm starts with the first sequence in A(K,N), and proceeds from a sequence to its successor until it reaches the last one 70 Algorithm GENERATE_T(K,N ) (generating A(K,N) lexicographically, given K and N) 1 . (first sequence) k,-l n, k 9 -l n 9 k.-l n. a ■ a 1 a 2 "- a i:k.n. *" (k l° ) (k 2° } {k t° ) • a~ «- (a~ is the sentinel) 2. (scanning the current sequence a= a. ) j +■ zk.n i ; P *■ k t +l ; for i -*- to t dp_ m. +■ ; while a. , > a . do begin J-l - J — — a — if p > a. > then p «- a m. «- m. +1 ; J J J ' J + j-l ; end if j = then stop (last sequence) (j now points at the rightmost i such that a. < a. + ,) (next sequence) (we have m. of the a.'s equal to k. 
to the right of a. l it J a is the smallest positive number to the right of a. a j" a p ; *, * m. ♦ 1 ; nL «- m„ - 1 ; P P a p+l a p+2*" a zk i n i 4. goto 2 . m n - z m„(k -1) k,-l m, k -l m k.-l m. ° «» s s (^0 ' ) '(k 2 2 ) 2 ...(k t l ) ) As an example, the 21 sequences in A(K,N), for K=(0,2,3) and N=(4,2,l), corresponding to the ordered trees with one internal node of degree 71 3 and two internal nodes of degree 2 (one of which has been shown in figure 4.2), are generated by this algorithm as follows: Position Sequence 1 2020300 2 2023000 3 2030020 4 2030200 5 2032000 6 2200300 7 2203000 8 2230000 9 2300020 10 2300200 11 2302000 12 1320000 13 3002020 14 3002200 15 3020020 16 3020200 17 3022000 18 3200020 19 3200200 20 3202000 21 3220000 (As in the previous examples, we underlined the rightmost a- discussed in the text, and we omitted the sentinel a Q =0.) 72 4.5 GENERAL CLASSES OF TREES: RANKING AND UNRANKING In this section we compute the function INDEX(L) that, given a path L £ L(K,N), will compute its corresponding position in the lexicographic ordering of L(K,N); also, given an integer w, we construct the path L e L(K,N) such that INDEX (I) = w. As discussed earlier, the paths L = L-. L, ... L inL(K,N) are those lattice paths from the point (n«, n, , ... , n.) to the origin which do not go below the hyperplane x n = e (k.-l)x-. i=1 l i We make use of the multinomial coefficients d \ J if any d. is < d r d 2 , ... , d £ / ^ d V. ... . d ^, otherwise i where d = 2.- d.. The multinomial coefficient has a familiar interpretation i=1 ] as the number of lattice paths from the point (d, , d«» ... , d ) to the origin. This interpretation gives a combinatorial proof of the following lemma, which is also easily proved directly from the above definition: Lemma 4.1 If d =2 d. and all d. are integers, then f "' d W ( d - 1 \ V d T d 2 A f^ \ d T d 2' ••• ' d i-T d i ' '" d ' +1 ' •" ' d V Let C(n Q ,n,,n2, ••• ,n.) 
denote the number of lattice paths from the point (n Q , n, , ... 5 n.) to the origin which do not go below the hyperplane x^ = 2 (k.-l)x-. i=l The following theorem defines these entries recursively, and solves the recurrence relation: 73 Theorem 4. 5: The solution to the recurrence relation C(n Qf n 1 ,n 2 ,. .. ,n t ) = / ^ Z C(n n ,n, ,... ,n.-l j=0 u ' J n.<0 for i = 1 ,2,.. . , or t n Q = n^ = ... = n t = n n = ? (k.-l)n. - 1 i=l ] ,n ) otherwise is given by C(n (J ,n 1 ,n 2 ,... > n t ) = [n^ ,. . . ,n t ) - 2 (k.-l )(n Q +l ,n ] ,. . . ,n.-l ,. . . ,n t where n = n~ + n, + ... + n. . Proof: We show that C(n Q ,n.|, ... , n t ) , as given by (*), satisfies the re- currence relation and the boundary conditions. When n. < for i = 1, 2, . or t (*) gives the value by definition. The case n Q = n, = ... = n, = is taken care of in the same way. When n n = £ (k.-l)n. - 1 and no u i=i i i n. is < 0, (*) can be rewritten as C( VV n 2 n t> - ( V l)!nJ!...n t ! ^V 1 " .* 'V""^ from which it is clear that C(n ,n, , . . . ,n ) is for this case. If n n , n,, ... , n, are none of the above, we prove the recurrence by induction on n. For n = 1, (*) is correct. We assume that it holds for any m < n, and take n = n^ + n, + ... + n,. By the recursive definition of C(n~,n,,. (*) ,nj we have C(n ,n r ...,n t ) = 2 C^.n, ... . ,n -l,...,n t ) . : For each of the terms on the right, we use (*), by the induction hypothesis, and get . . n-1 t / C(n Q ,n 1 ,.. . ,n t ) = I \n Q ,n 1 , . . . ,n .-1 , . . . ,n t •J ^ \ t t , n-l " js0 M 1" \n +l,n 1 ,...,n r l 1 ...,n J -l,...,n t J 74 t . n-1 = I j=0V n ,n r ...,n j -l,...,n t - z (k-i) If n_1 \ J which, by the previous lemma, gives C( V n l "t» = (n .n,, n ..,n t ) " J, 'V 1 ' ( n Q+1 .„, .. . .?„,-! ,. . . ,„ t ) as desired. /~\ This recurrence relation has been solved for the case t = 1 in previous works: For k, =2 it was solved in [Whitworth, 1878], and a solution for an arbitrary k, is found in [Yaglom and Yaglom, 1964]. 
A solution for the general problem for points on the hyperplane $x_0 = \sum_{i=1}^{t} (k_i-1)\,x_i$ is given in [Chorneyko and Mohanty, 1975] using generating functions. See also [Richards, 1978] for a note related to these countings.

Given a path L ∈ L(K,N), we show now how to compute its position INDEX(L) in the lexicographic ordering of L(K,N); because of the immediate relation between L(K,N) and A(K,N) it is clear that this will also give us a way to deal with the sequences in A(K,N).

Theorem 4.6: Let $a = a_1 a_2 \ldots a_n \in A(K,N)$ and let the corresponding lattice path be $L = L_0 L_1 \ldots L_n \in L(K,N)$, where $L_i = (y_{i0}, y_{i1}, \ldots, y_{it})$. Then

$$\mathrm{INDEX}(L) = 1 + \sum_{i=0}^{n-1} \; \sum_{j=0}^{b_{i+1}-1} C(y_{i0}, y_{i1}, \ldots, y_{ij}-1, \ldots, y_{it})$$

where the C(.,.,...,.)'s are given by theorem 4.5, and the $b_i$'s are defined as follows: if $a_i = k_j$ then $b_i = j$, for i=1,2,...,n. (Note that since it is only the relative values of the $k_i$'s that is important during the generation, ranking or unranking, for the listing of the sequences in A(K,N) one can prefer working with the numbers $b_i$ rather than the $a_i$'s; more details can be found in [Zaks and Richards, 1977].)

Proof: By definition we know that all the sequences that begin with either of $k_0, k_1, \ldots$, or $k_{b_1-1}$ will come before this sequence a, and the number of these sequences is indicated by the inner summation $\sum_{j=0}^{b_1-1}$. This follows from the definitions of $b_i$, $y_{ij}$ and theorem 4.4. Next, we know that all the sequences that start with $k_{b_1}k_0, k_{b_1}k_1, \ldots$, or $k_{b_1}k_{b_2-1}$ will come before the sequence a, and the number of these sequences is indicated by the summation $\sum_{j=0}^{b_2-1}$. The rest follows by induction, following this line of argument. The constant 1 is added, as in the previous ranking functions, so that the indexing will begin with 1 rather than 0.

It is clear from the proof how the ranking function in this general case extends the one for binary trees (see note in p. 43).
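The closed form of theorem 4.5 and the ranking formula of theorem 4.6 can be sketched in Python as follows. This is only an illustrative sketch, not the thesis's own program: the function names (`multinomial`, `paths_closed_form`, `rank`, `all_sequences`) are hypothetical, and the brute-force enumeration is included purely as a cross-check on the example K=(0,2,3), N=(4,2,1) used in this chapter.

```python
from itertools import permutations
from math import factorial


def multinomial(parts):
    """Multinomial coefficient (sum(parts); parts); taken as 0 if any part is negative."""
    if any(d < 0 for d in parts):
        return 0
    result = factorial(sum(parts))
    for d in parts:
        result //= factorial(d)
    return result


def paths_closed_form(K, y):
    """C(y_0,...,y_t) via the closed form of theorem 4.5.

    Intended for points with y_0 >= sum((k_i - 1) * y_i) - 1 (assumption of this sketch).
    """
    total = multinomial(y)
    for i in range(1, len(K)):
        shifted = list(y)
        shifted[0] += 1
        shifted[i] -= 1
        total -= (K[i] - 1) * multinomial(shifted)
    return total


def rank(K, N, a):
    """INDEX of the sequence a in A(K,N), following theorem 4.6."""
    y = list(N)
    index = 1
    for symbol in a:
        b = K.index(symbol)          # a_i = k_b
        for j in range(b):           # all directions preceding b
            y[j] -= 1
            index += paths_closed_form(K, y)
            y[j] += 1
        y[b] -= 1                    # make the step itself
    return index


def has_extended_dominating_property(seq):
    """In every prefix: number of 0's <= sum of (k - 1) over the non-zero entries."""
    zeros = credit = 0
    for s in seq:
        if s == 0:
            zeros += 1
        else:
            credit += s - 1
        if zeros > credit:
            return False
    return True


def all_sequences(K, N):
    """Brute-force listing of A(K,N) in lexicographic order (small cases only)."""
    symbols = [k for k, n in zip(K, N) for _ in range(n)]
    return sorted(p for p in set(permutations(symbols))
                  if has_extended_dominating_property(p))


K, N = (0, 2, 3), (4, 2, 1)
sequences = all_sequences(K, N)
print(len(sequences), rank(K, N, (3, 2, 0, 2, 0, 0, 0)))
```

The brute-force listing is exponential and serves only to confirm, on small inputs, that the position computed by `rank` agrees with the lexicographic position.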
To illustrate this procedure we refer to the tree and path in figures 4.2 and 4.3, respectively. In figure 4.3 each lattice point above the plane $x_0 = x_1 + 2x_2$ has been labeled with $C(x_0,x_1,x_2)$, the number of ways to go from this point $(x_0,x_1,x_2)$ to the origin (0,0,0) without passing below this plane. We go along the path and, after each unit step that we make in a certain direction, we add all the labels in the preceding directions ($x_0$ precedes $x_1$, and $x_1$ precedes $x_2$). We thus get for the path L in our example

INDEX(L) = 1 + 12 + 5 + 0 + 2 + 0 + 0 + 0 = 20.

As for the unranking algorithm, we follow theorem 4.6 in a reverse interpretation. We are given a number i, and look for a path L such that INDEX(L) = i. We first show the unranking for the previous example. Suppose we want to find the 20-th sequence in L(K,N), where K=(0,2,3) and N=(4,2,1). Starting at the point (4,2,1) we sum up the labels in directions 0,1,... (see figure 4.3), in that order, as long as we do not exceed 20-1=19. Here we take C(4,1,1)=12, which corresponds to making the first move from (4,2,1) to (4,2,0), or $a_1=3$ (the first move has been done in the third direction). Starting now at (4,2,0), we sum up the labels in directions 0,1,... as long as we do not exceed 19-12=7; here we take C(3,2,0)=5, which corresponds to making the second move from (4,2,0) to (4,1,0), and so on.

The unranking algorithm is formally described as follows; the proof follows from the above discussion.

Algorithm UNRANK_T(K,N)
(finding the w-th sequence in A(K,N), given w, K and N as in the discussion above)

u ← w - 1; (y_0, y_1, ..., y_t) ← (n_0, n_1, ..., n_t);
for i ← 1 to n do
begin
  (find the largest j such that the sum of the entries in the first j directions does not exceed u)
  j ← 0; sum ← 0; S ← C(y_0 - 1, y_1, ..., y_t);
  while sum + S ≤ u do
  begin
    sum ← sum + S; j ← j + 1;
    S ← C(y_0, ..., y_j - 1, ..., y_t);
  end;
  a_i ← k_j; y_j ← y_j - 1; u ← u - sum;
end.

CHAPTER 5

A GRAPH LABELING PROBLEM

5.1 INTRODUCTION

We are given n tasks $t_1, t_2, \ldots, t_n$ to be executed on two identical processors that share a common memory. Task $t_i$ requires $c_i$ of the memory for its execution, for each i ($0 \le c_i \le 1$). Two distinct tasks $t_i$ and $t_j$ can be executed simultaneously on the two processors if and only if their total memory requirement does not exceed the available amount of memory, namely if and only if $c_i + c_j \le 1$. This situation gives rise to an undirected graph in which a vertex $n_i$ is associated with the task $t_i$, and there is an edge connecting two distinct vertices $n_i$ and $n_j$ if and only if the corresponding tasks $t_i$ and $t_j$ can be executed simultaneously. We denote by $GR^1$ the class of graphs that can be obtained in this way. In other words, from a graph-theoretic point of view, we are interested in the class $GR^1$ of undirected graphs for which there exists a labeling of the vertices with numbers in the interval [0,1] such that two distinct vertices are connected by an edge if and only if the sum of their labels does not exceed 1.

When extending the problem to the case where there are s resources shared by the two processors, we get extensions of this class of graphs. In this case the task $t_i$ requires $c_{ij}$ of the j-th resource during its execution, for i=1,2,...,n and j=1,2,...,s. Tasks $t_i$ and $t_j$ can be executed simultaneously if and only if their total requirement of each resource does not exceed the available amount of that resource, namely iff $c_{ik} + c_{jk} \le 1$ for k=1,2,...,s. This gives rise to the classes $GR^s$ of graphs for which there exists a labeling of the vertices with s-dimensional vectors over the interval [0,1] such that two vertices are connected by an edge if and only if the sum of their labels does not exceed (1,1,...,1).
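As a small illustration of this definition, the following Python sketch builds the graph determined by a vector labeling: two distinct tasks are joined exactly when their requirements sum to at most 1 in every resource. The function name `graph_from_labels` and the sample requirement values are assumptions made for the illustration; exact fractions are used to avoid rounding at the boundary case where a sum equals 1.

```python
from fractions import Fraction as F
from itertools import combinations


def graph_from_labels(labels):
    """Edge set of the graph defined by a labeling.

    labels: dict mapping each vertex to a tuple of s numbers in [0, 1].
    Two distinct vertices are adjacent iff their label sum is <= 1 in every component.
    """
    edges = set()
    for v, w in combinations(sorted(labels), 2):
        if all(a + b <= 1 for a, b in zip(labels[v], labels[w])):
            edges.add(frozenset((v, w)))
    return edges


# One shared resource (s = 1); the memory requirements are hypothetical values.
memory = {"t1": (F(1, 3),), "t2": (F(1, 2),), "t3": (F(0),), "t4": (F(1),)}
print(sorted(sorted(e) for e in graph_from_labels(memory)))

# Two shared resources (s = 2); again hypothetical values.
two_resources = {
    "a": (F(3, 4), F(1, 2)),
    "b": (F(1, 4), F(1, 2)),
    "c": (F(1, 2), F(3, 4)),
    "d": (F(1, 2), F(1, 4)),
}
print(sorted(sorted(e) for e in graph_from_labels(two_resources)))
```

In the second example the resulting graph is the path a, b, d, c, which illustrates that two resources can produce graphs a single resource cannot.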
It turns out that a graph is in $GR^s$ if and only if it is the intersection of s graphs from $GR^1$, and that $GR^i$ is properly included in $GR^{i+1}$ for $i \ge 1$.

It is also possible to extend our scheduling problem to the case when we have m > 2 processors. In this case a set of (not more than) m tasks can be executed simultaneously under the same conditions (namely, not using more than the available amount of any resource), which means that we have to deal with hypergraphs (rather than graphs), in which the vertices correspond, as before, to the tasks, and the edges are those sets of at most m vertices that correspond to sets of tasks that can be simultaneously executed.

In all of these scheduling problems, it can be expected that some knowledge of the structure of the graphs will be helpful if one wants to find schedules of minimal total execution time or in case one wishes to analyze heuristic algorithms for these scheduling problems (see [Johnson, 1974], [Coffman, 1976] and [Ecker, 1977]).

Another motivation to study these classes of graphs is given in [Chvátal and Hammer, 1977]. Given a system of linear inequalities

$$\sum_{j=1}^{n} a_{ij} x_j \le 1, \qquad i = 1,2,\ldots,m, \qquad (*)$$

where each $a_{ij}$ is either 0 or 1, they want to find out whether there exists a single inequality

$$\sum_{j=1}^{n} c_j x_j \le b \qquad (**)$$

that has the same set of zero-one solutions as (*). (The c's and b are integers.) Let $A = (a_{ij})$ be the matrix containing the $a_{ij}$'s. It is clear that the zero-one solutions of (*) are completely determined by those pairs of columns of A that have a positive dot product. Chvátal and Hammer build the graph G(A) in which there is a vertex corresponding to each column of A, and two distinct vertices are connected by an edge if and only if the corresponding two columns have a positive dot product. The zero-one solutions of (*) correspond to independent sets of vertices (a set of vertices is independent if no two of the vertices in it are connected by an edge).
Therefore, in order to be able to 'simulate' (*) by (**), there must be a labeling of the vertices of the graph G(A) with numbers $c_1, c_2, \ldots, c_n$ such that the zero-one solutions of (**) will be exactly the characteristic vectors of independent sets of vertices in G(A). This class of graphs, known as threshold graphs, coincides with the class $GR^1$ defined above, and has been studied in the literature also in connection with synchronization problems ([Golumbic, 1976], [Henderson and Zalcstein, 1977]). From Chvátal and Hammer's point of view a graph G is a threshold graph if and only if, when regarding the subsets of the n vertices as points in the n-dimensional unit cube, there exists a hyperplane $\sum c_i x_i = b$ separating the independent sets of vertices from the other sets of vertices. From another point of view, a graph G is a threshold graph if and only if there exists an integer labeling of its vertices and a threshold b such that two vertices are connected by an edge if and only if the sum of their labels exceeds the threshold b ([Golumbic, 1976]).

Extending their problem, Chvátal and Hammer want to find s inequalities

$$\sum_{j=1}^{n} c_{ij} x_j \le b_i, \qquad i = 1,2,\ldots,s, \qquad (***)$$

such that (*) and (***) will have the same set of zero-one solutions. For this it is necessary and sufficient that the graph G(A), defined above, will be the union of s threshold graphs. It follows that these graphs are the complements of the graphs in our class $GR^s$.

In this chapter we first discuss the class $GR^1$, reviewing results from [Chvátal and Hammer, 1977], [Henderson and Zalcstein, 1977] and [Golumbic, 1976], and extending them, and following this discussion we study the classes $GR^s$. In section 5.2 we introduce notations and basic notions; the class $GR^1$ is treated in section 5.3, and the classes $GR^s$, for general s, are studied in section 5.4.

5.2 PRELIMINARIES

In this section we deal with undirected graphs without self-loops or multiple edges.
Such a graph is described by a pair (V,E), where V is the set of vertices and E, the set of edges , is a set of unordered pais of vertices. Let G=(V,E) be a graph; two vertices v,w e V are connected (by an edge), or adjacent , if (v,w) e E. G is connected if for any two vertices v and w there is a sequence of vertices (v,, v 2 , ... , v.) such that v-|=v, v k =w, and (v. , v- + ,) e E for i = l ,2, . . . ,k-l . The neighborhood of v e V is N(v) = (w|(v,w) e E}; the cardinality DEG(v) = |N(v)| is called 81 the degree of v (for a set A, |A| denotes the cardinality of A). We will also use a modification of the notion of the neighborhood: N'(v)=N(v)u{ v }. V. denotes the set of all the vertices with degree i. For instance, V Q contains all the isolated vertices. DEG(Qt) denotes the maximal degree among the vertices of G. The complementary graph of a graph G=(V,E) is a graph G C =(V C ,E C ) where V C =V and E c ={ (v,w)| v,w e V, v + w, (v,w) ^ E }. Let V'eV; the subgraph induced by V is defined as G' = (V',E'), where E' contains these edges with both endpoints in V. Instead of 'subgraph induced by ... ' we will use the notion 'subgraph' throughout the paper. Graphs G.=(V. ,E. ), i=l,2, are called isomorphic if there exists a bijection :y. ■*■ V« such that V v,w e V, (v,w) e E 1 ** ( 1/3, v 2 ■> 1/2, v 3 - 0, v 4 ->■ 1 is a 1 -labeling for the graph G, in figure 5.1 . For the graph G^, the function R 2 : W] -> (3/4,1/2), w 2 -> (1/4,1/2), w 3 ■> (1/2,3/4), w 4 ■> (1/2,1/4) is a 2-labeling; one can easily verify that no 1 -label ing exists for G„ . As will turn out later, for a fixed s e N, not every graph has an s-labeling. 
Therefore, it is of interest to characterize those graphs that can be s-labeled; we denote this class of graphs by $GR^s$.

[Figure 5.1: A graph $G_1$ in $GR^1$, and a graph $G_2$ in $GR^2$ but not in $GR^1$]

Clearly, for a given s ∈ N and a graph G, if R is an s-labeling for G, then R is not the only possible s-labeling for G; there are infinitely many such labelings. On the other hand, given a set V of vertices and a function $R: V \to [0,1]^s$, there is a uniquely determined graph G=(V,E) for which R is an s-labeling; its set of edges is precisely

$$E = \{ (v,w) \mid v \ne w,\ R(v)+R(w) \le (1,1,\ldots,1) \}.$$

5.3 1-LABELED GRAPHS

In this section a complete characterization of $GR^1$ is given. When studying an integer programming problem (see section 5.1), [Chvátal and Hammer, 1977] defined a class of graphs, called threshold graphs, which are identical to $GR^1$. The structure of these graphs is also investigated in [Henderson and Zalcstein, 1977], [Golumbic, 1976 and 1978]. These results are summarized in the sequel (theorem 5.1).

We say that a graph has the subgraph property if it does not contain an induced subgraph isomorphic to $2K_2$, $P_4$ or $C_4$ (see figure 5.2).

[Figure 5.2: The forbidden graphs]

A graph G=(V,E) has the inclusion property if, for all distinct v,w ∈ V, the applicable one of the following two conditions holds:
(i) If (v,w) ∉ E, then N(v) ⊆ N(w) or N(w) ⊆ N(v).
(ii) If (v,w) ∈ E, then N'(v) ⊆ N'(w) or N'(w) ⊆ N'(v).

Theorem 5.1: The following are equivalent:
(1) G ∈ $GR^1$.
(2) G has the subgraph property.
(3) G is a threshold graph ([Chvátal and Hammer, 1977]).
(4) G has the inclusion property.
(5) G and $G^c$ are both trivially perfect ([Golumbic, 1978]).

Proof: (1) ↔ (3) is proved in [Golumbic, 1976], (2) ↔ (3) ↔ (4) in [Chvátal and Hammer, 1977], and (2) ↔ (5) in [Golumbic, 1978].

We now describe an algorithm that 1-labels a given graph G = (V,E) in $GR^1$; if G ∉ $GR^1$ it results in a 1-labeling of a graph G' = (V,E') in $GR^1$ where E' ⊇ E.
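A minimal Python sketch of this labeling procedure (algorithm 1-LABELING of this section) is given here for orientation. It is an illustration under stated assumptions, not the author's code: the names `one_labeling` and `is_one_labeling` are hypothetical, vertices of equal degree are processed in sorted order (the algorithm allows any order), and the eight-vertex adjacency list is an assumed reconstruction chosen to agree with the degree sets and labels quoted in the worked example of this section, since the figure itself is not reproduced here.

```python
from fractions import Fraction


def one_labeling(adj):
    """Label the vertices of a graph following the steps of algorithm 1-LABELING.

    adj: dict mapping each vertex to the set of its neighbours.
    Returns a dict vertex -> Fraction in [0, 1].
    """
    n = len(adj)
    degree = {v: len(adj[v]) for v in adj}
    unlabeled = set(adj)                        # plays the role of V'
    R = {}
    for i in range(max(degree.values()) + 1):
        for v in sorted(adj):
            if degree[v] != i or v not in unlabeled:
                continue                        # v is not (or no longer) in V_i
            R[v] = 1 - Fraction(i, 2 * n)       # step 3
            unlabeled.discard(v)
            for w in sorted(adj[v]):
                if w in unlabeled:
                    R[w] = Fraction(i, 2 * n)   # step 5
                    unlabeled.discard(w)
    return R


def is_one_labeling(adj, R):
    """True iff (v,w) is an edge exactly when R(v) + R(w) <= 1, for all v != w."""
    return all((w in adj[v]) == (R[v] + R[w] <= 1)
               for v in adj for w in adj if v != w)


# Assumed reconstruction of the example graph (8 vertices, v7 isolated).
edge_list = [("v1", "v3"), ("v1", "v5"), ("v1", "v6"),
             ("v2", "v3"), ("v2", "v5"), ("v2", "v6"),
             ("v8", "v3"), ("v8", "v5"), ("v8", "v6"),
             ("v3", "v5"), ("v3", "v6"), ("v4", "v5"),
             ("v4", "v6"), ("v5", "v6")]
example = {f"v{i}": set() for i in range(1, 9)}
for a, b in edge_list:
    example[a].add(b)
    example[b].add(a)

labels = one_labeling(example)
print(labels["v4"], labels["v3"], is_one_labeling(example, labels))
```

For a graph outside $GR^1$, such as a path on four vertices, the same function still returns labels whose induced graph contains every original edge, but `is_one_labeling` then reports False because extra edges appear.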
Algorithm 1-LABELING
(we are given a graph G=(V,E), |V|=n, with vertex sets $V_0, V_1, \ldots, V_{DEG(G)}$ as discussed above (section 5.2), and define a labeling function R: V → [0,1])

1. V' ← V;
   for i ← 0 to DEG(G) do
   begin
     while V_i ≠ ∅ do
     begin
2.     choose v ∈ V_i;
3.     R(v) ← 1 - i/2n;
       while N(v) ∩ V' ≠ ∅ do
       begin
4.       choose w ∈ N(v) ∩ V';
5.       R(w) ← i/2n;
6.       if w ∈ V_j then V_j ← V_j - {w};
7.       V' ← V' - {w};
       end;
8.     V_i ← V_i - {v};
9.     V' ← V' - {v};
     end;
   end

An example that demonstrates how the algorithm works follows the proof.

Proof of correctness: Let G=(V,E) ∉ $GR^1$, (u,u') ∈ E, and suppose the algorithm labels u first. If u is labeled in step 3 by 1-j/2n then u' is labeled in step 5 by j/2n. If u is labeled in step 5 by j/2n then u' gets a label ≤ 1-j/2n. In both cases R(u)+R(u') ≤ 1, hence we result in a 1-labeling for a graph G'=(V,E') where E' ⊇ E.

Let now G ∈ $GR^1$. Following the previous discussion it is sufficient to show that (v,w) ∉ E implies R(v)+R(w) > 1. We consider three cases:
(i) R(v)=1-i/2n, R(w)=1-j/2n. Then R(v)+R(w) > 1.
(ii) R(v)=1-i/2n, R(w)=j/2n. If j > i then R(v)+R(w) > 1. If j ≤ i then there exists u ∈ $V_j$ such that w ∈ N(u) and R(u)=1-j/2n, DEG(u)=j (due to the algorithm). From R(v)=1-i/2n follows DEG(v)=i. Therefore DEG(u) ≤ DEG(v). Because of the inclusion property we obtain N(u) ⊆ N(v). Hence w ∈ N(v), which is in contradiction to (v,w) ∉ E.
(iii) R(v)=i/2n, R(w)=j/2n. There exists a vertex v' ∈ N(v) with R(v')=1-i/2n and DEG(v')=i, and similarly a w' ∈ N(w) with R(w')=1-j/2n and DEG(w')=j. Suppose j ≥ i; then we obtain DEG(w) ≥ j ≥ i = DEG(v'), from which we get N(w) ⊇ N(v'). But as v ∈ N(v') we have a contradiction.

The following example demonstrates how the algorithm 1-LABELING works. The sets of vertices of the same degree in the graph of figure 5.3 are $V_0=\{v_7\}$, $V_1=\emptyset$, $V_2=\{v_4\}$, $V_3=\{v_1,v_2,v_8\}$, $V_4=\emptyset$, $V_5=\{v_3\}$, $V_6=\{v_5,v_6\}$. For i=0, $R(v_7)=1$ according to steps 2 and 3 of the algorithm.
If i=2, $R(v_4) = 1-2/2n = 14/16$ (steps 2 and 3), and $R(v_5)=R(v_6)=2/16$ (steps 4 and 5). Similarly, for i=3 we obtain $R(v_1)=13/16$, $R(v_3)=3/16$, and $R(v_2)=R(v_8)=13/16$.

[Figure 5.3: An example to algorithm 1-LABELING. Labels shown in the figure: $v_1$ (13/16), $v_2$ (13/16), $v_8$ (13/16), $v_3$ (3/16), $v_4$ (14/16), $v_5$ (2/16), $v_6$ (2/16)]

From the way the algorithm is designed, the following is an immediate consequence:

Corollary: For any graph G=(V,E) ∈ $GR^1$ there exists a 1-labeling R: V → [0,1] for which R(v) ≠ 0 for each v ∈ V.

We next study the structure of $GR^1$. First we state a lemma which is proved in [Henderson and Zalcstein, 1977].

Lemma 5.1: Let G=(V,E) ∈ $GR^1$. Then
(1) If G contains no isolated vertices then G is connected.
(2) If G is connected then there exists a vertex with degree |V|-1.

The following theorem describes the structure of $GR^1$:

Theorem 5.2:
(1) The number of non-isomorphic graphs in $GR^1$ with n vertices is $2^{n-1}$.
(2) Furthermore:
$2^{n-2}$ of them are connected,
$2^{n-3}$ of them have one connected component with n-1 vertices and 1 isolated vertex,
$2^{n-4}$ of them have one connected component with n-2 vertices and 2 isolated vertices,
...
$2^{n-k}$ have one connected component with n-k+2 vertices and k-2 isolated vertices,
...
1 has one component with 2 vertices and n-2 isolated vertices, and
1 has n isolated vertices.

Proof: As can be easily seen, for the graphs in figure 5.4 the theorem holds.

[Figure 5.4: Theorem 5.2 (the graphs for n=1, n=2 and n=3)]

Assume it holds for n. According to the previous lemma we can get all graphs in $GR^1$ having n+1 vertices by taking the following two disjoint sets of graphs:
(1) all the graphs in $GR^1$ with n vertices and an additional isolated vertex to each, and
(2) all the graphs in $GR^1$ with n vertices and an additional vertex, connected to all the n vertices, to each.

The construction from n=3 to n=4 leads to the graphs of figure 5.5.

[Figure 5.5: Theorem 5.2 (continued)]

Our construction is valid following the discussion in the previous sections.
Then it is obvious that _ i (a) the set of graphs obtained in (1) contains 2 " non-isomorphic graphs. The same is true for the set of graphs (2). As the sets obtained in (1) and (2) are disjoint, there are 2*2 n ~ =2 n classes of non-isomorphic graphs in GR, each having n+1 vertices. (b) From (2) we know that GR contains 2 n " =2^ n+ ' non-isomorphic graphs with n+1 vertices which are connected. Suppose G has n vertices, n-2-k and k of them are isolated, 00, there is a one-to-one correspondence between the graphs in GR with n vertices and the binary sequences of length n-1. 90 x=a,a 2 a 3 a.a 5 a 6 a 7 =1011001 V V l •^ • V 6 # • v, • v. 6# v 5 V 4 Step (a) a,=l ■* connect v, to v~ a 3 =1 v 5 v 4 Step (b) connect v 3 to v . , i<2 v 7 # a 4 =l connect v- to v. , i<3 Constructing the graph G(x) Figure 5.6 91 A vertex v. is connected to all the vertices v., j>i, if and only if a.=l, and this contribution to deg{v.) is clearly z a. . Also, v.. is J j>i J connected either to all or to none of the v.'s for ji J The adjacency matrices corresponding to graphs in GR have interesting properties which we may also use to determine the number of non-isomorphic graphs in GR with n vertices each. Let G=(V,E) e GR 1 , V=t Vl ,...,v n > , and R: V ^ [0,1] be a 1-labeling for G such that R(v-])< R(v 2 ) <... < R(v ). Define A Q = { v | v e V, R(v) < 1/2 } = { v r v 2 , ..., v k > , A ] ={v| Ve V, R(v) > 1/2 } = tv k+1 ,V k+2 ,...,Y n > . If R( v. + -j )+R( v. ) <_ 1 we could change R(v. + ,) to 1/2 and transfer v. +1 to A Q . It is then clear that R(v. +1 ) +R (v k+2 ^ > !• So we assume without loss of generality that in A Q and A, as defined we have r ( v i c+ -i) +r ( v i < ) > 1- As RCv^^ R(v i+1 ) we have deg(Vj ) > 2?^(v i+1 ) and N(v.)2N(v. +1 ) for i=l ,2,. . . ,n-l (this follows from definitions; see also the remark preceding lemma 5.2). We also have v,w E A Q ^ (v,w) e E, v,w e A-, -*■ (v,w) f E. 
Following this discussion, we conclude that the adjacency matrix of G must have the following structure: 92 / 'k+1 k+2 'n V Vl Vz an arbitrary lattice path connecting A and B V. all the entries are 1 all the entries are n - k 0/ \ / >^ Sk-l J The number of these matrices for given n and k is equal to the number of paths from A to B, i.e. rtr") ■ cd From the structure of the matrices it follows that G, ,G 2 e GR result in the same matrix iff they are isomorphic. Therefore, the number of non-isomorphic graphs in GR with n vertices each is i-l . I (- 1 ) ■ 2 "' 93 One may ask whether any graph G=(V,E) in GR can be labeled by a two-valued function R: V -> {0,1} . However, this is not true, as can be easily seen. Furthermore, we show that no finite set of labels is sufficient for labeling all graphs in GR . We do this by using the following property of 1 -label ing: Lemma 5.2 : Let R be a 1-labeling for G=(V,E) e GR 1 . Then R(v) < R(w) «-► DEG(v) > DEG(w) for all vertices v,w e V. Moreover, if R(v)=R(w) then DEG{v)=DEG{vi) . Proof : Follows immediately from the definition of 1-labeling. ,-^ Theorem 5.4 : No finite number of distinct labels suffices for labeling all the graphs in GR . Proof : For G e GR we know that R(v)=R(w) -*■ DEG{v)=DEG(w) (lemma 5.2). So, given t e N we construct a graph G. in GR that has t distinct degrees of vertices, and therefore it must contain t distinct labels. The grap+i G. has t+1 vertices v, , v ? , ... , v. + , ; the label on v. is i/(t+2). It follows that the degrees of the various vertices are given by ' t+l-i i £ t+2-i i > deg{v.) = ' t+1 2 t+1 For each t e N, G cannot be 1-labeled with less than t distinct labels, which proves the theorem. O 94 5.4 S-LABELED GRAPHS A detailed characterization of graphs in GR , for general s, similar to that for GR as discussed in the preceding section, is much more complicated. Unfortunately, we do not have such a characterization for s>l . 
However, we present some interesting facts about these classes. First note that we clearly have $GR^s \subseteq GR^{s+1}$ for every s, since an s-labeling of a graph can easily be extended to an (s+1)-labeling by adding a new all-zero component.

Theorem 5.5: The hierarchy $GR^1 \subsetneq GR^2 \subsetneq \cdots$ is infinite (namely, all the inclusions are proper).

Proof: We show that for every s>1 there exists a graph $G_s$ which is in $GR^s$ but not in $GR^{s-1}$ ($G_s \in GR^s - GR^{s-1}$). Let $K_n$ denote the complete graph with n vertices, and let V' and V" be two disjoint sets of s vertices each. We define $G_s$ to be the graph K' ∪ K", where K' and K" are the complete graphs with vertex sets V' and V", respectively (see figure 5.7).

[Figure 5.7: The graph $G_s$]

(a) $G_s \in GR^s$. The function $R: V \to [0,1]^s$ with the components $R_1,\ldots,R_s$ defined in the following table is an s-labeling for $G_s$ (ν ∈ (0,1/4)):

         v_1    v_2    ...   v_s    w_1     w_2     ...   w_s
  R_1    1-ν    ν      ...   ν      1-2ν    2ν      ...   2ν
  R_2    ν      1-ν    ...   ν      2ν      1-2ν    ...   2ν
  ...
  R_s    ν      ν      ...   1-ν    2ν      2ν      ...   1-2ν

(b) $G_s \notin GR^{s-1}$. The proof is based on a decomposition of the vertex set of k-labeled graphs. We first demonstrate this decomposition. Let G=(V,E) ∈ $GR^k$, and let R be a k-labeling for G. We decompose the vertex set V into $2^k$ disjoint subsets $U_0,\ldots,U_{2^k-1}$: if $a_k \ldots a_1$ is the binary representation of $i \in \{0,\ldots,2^k-1\}$, i.e.

$$i = \sum_{j=0}^{k-1} 2^j a_{j+1},$$

define

$$U_i = \{ v \in V \mid R_j(v) \le 1/2 \leftrightarrow a_j = 0,\ j=1,\ldots,k \}.$$

The following properties of vertices in $U_i$ can be immediately derived (for the case k=2 the decomposition of V is shown in figure 5.8):
(1) For all v ∈ $U_0$: R(v) ≤ (1/2,...,1/2); hence (v,w) ∈ E for all v,w ∈ $U_0$.
(2) For all $i \in \{1,\ldots,2^k-1\}$ and all v,w ∈ $U_i$: R(v)+R(w) ≰ (1,...,1); hence (v,w) ∉ E.
(3) Let $i,j \in \{0,\ldots,2^k-1\}$, v ∈ $U_i$, w ∈ $U_j$, and (v,w) ∈ E. Then, if $a_k \ldots a_1$ and $b_k \ldots b_1$ are the binary representations of i and j, respectively, we have $a_y=0$ or $b_y=0$ for all y ∈ {1,...,k}.
Consequences of (1), (2) and (3) are: If G contains a complete subgraph K, then the vertices of K are distributed among the sets $U_0,\ldots,U_{2^k-1}$ such that each of the sets $U_i$ with i>0 contains at most one vertex of K. If, however, G contains two disjoint complete subgraphs K' and K", then the vertices of one subgraph, say K', must be members of $U_0$ only, whereas the vertices of K" are distributed among $U_1,\ldots,U_{2^k-1}$. According to properties (2) and (3), K" cannot contain more than k vertices.

[Figure 5.8: Decomposition of the vertex set]

This result is now applied to the graph $G_s$. Suppose $G_s$ is in $GR^{s-1}$. As $G_s$ contains two disjoint complete subgraphs with s vertices each, the vertices of one subgraph are distributed among the sets $U_1,\ldots,U_{2^{s-1}-1}$. However, as this subgraph cannot contain more than s-1 vertices, we get a contradiction. Therefore $G_s \notin GR^{s-1}$, which completes the proof.

The following lemma shows that the graphs in $GR^1$ with a fixed set of vertices may be viewed as some kind of basis for all graphs with the same set of vertices:

Lemma 5.3: A graph G=(V,E) ∈ $GR^s$ if and only if there exist s graphs $G_i=(V,E_i) \in GR^1$ (i=1,2,...,s) such that $G = \bigcap_i G_i$ (where $\bigcap_i G_i = (\bigcap_i V_i, \bigcap_i E_i) = (V, \bigcap_i E_i)$).

Proof: Let $R=(R_1,\ldots,R_s): V \to [0,1]^s$ be an s-labeling for G. If (v,w) ∈ E then $R_i(v)+R_i(w) \le 1$ for all i ∈ {1,...,s}. Define $G_i=(V,E_i)$ where $E_i = \{(v,w) \mid R_i(v)+R_i(w) \le 1\}$. Then

(v,w) ∈ E ↔ R(v)+R(w) ≤ (1,...,1) ↔ for all i ∈ {1,...,s}: $R_i(v)+R_i(w) \le 1$ ↔ for all i ∈ {1,...,s}: (v,w) ∈ $E_i$.

As an example, for the graph in figure 5.9 the function $R: V \to [0,1]^3$ defined by

  vertex    R = (R_1, R_2, R_3)
  v_1   →   (3/4, 1/4, 1/2)
  v_2   →   (1/4, 3/4, 1/4)
  v_3   →   (1/2, 1/4, 3/4)
  v_4   →   (1/2, 1/2, 1/4)
  v_5   →   (1/4, 1/2, 1/2)

is a 3-labeling. Each of the components of R, namely $R_1$, $R_2$ and $R_3$, defines a 1-labeling for a graph $G_i=(V,E_i) \in GR^1$, i=1,2,3, and $G_1 \cap G_2 \cap G_3 = G$.
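The decomposition in lemma 5.3 can be checked mechanically on this example. The following Python sketch (function names `edges_of` and `component_edge_sets` are hypothetical, introduced only for this illustration) splits the 3-labeling into its three components, builds the graph of each component, and intersects the edge sets.

```python
from fractions import Fraction as F
from itertools import combinations


def edges_of(labels):
    """Edge set defined by a vector labeling: adjacent iff the sum is <= 1 in every component."""
    return {frozenset((v, w))
            for v, w in combinations(sorted(labels), 2)
            if all(a + b <= 1 for a, b in zip(labels[v], labels[w]))}


def component_edge_sets(labels):
    """One edge set per component R_i of the labeling (each is a 1-labeled graph)."""
    s = len(next(iter(labels.values())))
    return [edges_of({v: (r[i],) for v, r in labels.items()}) for i in range(s)]


# The 3-labeling of the example above.
R = {
    "v1": (F(3, 4), F(1, 4), F(1, 2)),
    "v2": (F(1, 4), F(3, 4), F(1, 4)),
    "v3": (F(1, 2), F(1, 4), F(3, 4)),
    "v4": (F(1, 2), F(1, 2), F(1, 4)),
    "v5": (F(1, 4), F(1, 2), F(1, 2)),
}

components = component_edge_sets(R)
intersection = set.intersection(*components)
print(sorted(sorted(e) for e in intersection))
```

The intersection consists of the five edges of the cycle v1, v2, v3, v4, v5, in agreement with the statement that the labeled graph is the pentagon.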
The graphs $G_1$, $G_2$ and $G_3$ are shown in figure 5.10.

[Figure 5.10: An example for the intersection property]

It is clear from the definition of $GR^1$ that every complete graph is in $GR^1$. However, from theorem 5.5 we know that a graph that consists of two disjoint complete graphs of the same size s>1 is contained in $GR^s$ but not in $GR^{s-1}$. The following theorem generalizes this idea:

Theorem 5.6: Let s ∈ N, G=(V,E) ∈ $GR^s$, and let G'=(V',E') be a complete graph with s vertices, where V' ∩ V = ∅. Then G ∪ G' ∈ $GR^s$.

Proof: For G one can always find an s-labeling $R: V \to [0,1]^s$ in which, for every v ∈ V, R(v) has only non-zero components: according to lemma 5.3, G can be represented as the intersection of graphs $G_i=(V_i,E_i) \in GR^1$, where $V_i=V$ for i=1,...,s. For each of the graphs $G_i$ there is a 1-labeling $R_i: V \to [0,1]$ such that $R_i(v) \ne 0$ for all v ∈ V (corollary, p. 87). Hence the combined function $R=(R_1,\ldots,R_s): V \to [0,1]^s$ is an s-labeling for G which has the desired property.

Let G'=(V',E') be a complete graph with $V'=\{w_1,\ldots,w_s\}$, V ∩ V' = ∅. The function $R': V' \to [0,1]^s$, defined by $R': w_j \to (0,\ldots,0,1,0,\ldots,0)$ (with the 1 in the j-th position), j=1,...,s, defines an s-labeling for G'. Then, for G" = G ∪ G' = (V ∪ V', E ∪ E'), the function $R'': V \cup V' \to [0,1]^s$, defined by

R"(v) = R(v) if v ∈ V, and R"(v) = R'(v) if v ∈ V',

is an s-labeling. In fact, as R(v) has non-zero components only, we have R"(v)+R"(w) ≰ (1,...,1) for every v ∈ V and w ∈ V'.

For instance, as an immediate result, the graph in figure 5.11 is a member of $GR^2$.

[Figure 5.11: A graph which is in $GR^2$ (by theorem 5.6)]

It is easily shown that for any graph G ∈ $GR^1$ also its complement $G^c$ is in $GR^1$. It follows that the class of s-threshold graphs, defined in [Chvátal and Hammer, 1977] as unions of s graphs in $GR^1$ (see also section 5.1), coincides with $\{ G \mid G^c \in GR^s \}$. Hence the class of s-threshold graphs differs from $GR^s$ for s>1.
For example, the graph in figure 5.12 is in $GR^2$, whereas the complementary graph is of the type of graphs $G_s$ (see theorem 5.5), and therefore is not in $GR^2$.

Figure 5.12: A graph in $GR^2$, whose complement is not

In general, the problem of deciding whether a given graph is in $GR^k$, for given k, seems to be much more difficult than the case where k=1. It follows from [Chvátal and Hammer, 1977] that this decision problem is an NP-complete problem (see [Cook, 1971], [Karp, 1972]).

Lemma 5.4: Let $G=(V,E) \in GR^s$, $w \notin V$, $V_0 \subseteq V$, and connect w to all the vertices in $V_0$. Then the graph thus obtained is in $GR^{s+1}$.

Proof: Let $R=(R_1, R_2, \ldots, R_s)$ be an s-labeling for G. An (s+1)-labeling R' is defined as follows:

$R'(v) = (R_1(v), R_2(v), \ldots, R_s(v), 0)$ for $v \in V_0$,
$R'(v) = (R_1(v), R_2(v), \ldots, R_s(v), 1/2)$ for $v \in V - V_0$,
$R'(v) = (0, 0, \ldots, 0, 1)$ for $v = w$. □

A recursive application of lemma 5.4 yields the following:

Corollary: If a graph G has n vertices then $G \in GR^n$.

For example, applying this procedure to the pentagon in figure 5.9, we get the following labeling of its vertices:

vertex    $R = (R_1, R_2, R_3, R_4, R_5)$
$v_1$     (1/2, 0, 1/2, 1/2, 0)
$v_2$     (0, 1, 0, 1/2, 1/2)
$v_3$     (0, 0, 1, 0, 1/2)
$v_4$     (0, 0, 0, 1, 0)
$v_5$     (0, 0, 0, 0, 1)

Lemma 5.5: Every cycle with at least five vertices is in $GR^3 - GR^2$.

Proof: A path $P_{n-1}$, of length n-1, is easily shown to be in $GR^2$; one possible 2-labeling is shown in figure 5.13. Using lemma 5.4 it follows that the cycle $C_n$ with n vertices is therefore in $GR^3$.

Figure 5.13: A 2-labeling for $P_{n-1}$

One can show that $C_5 \notin GR^2$ (in [Ecker and Zaks, 1977] a theorem is given that, in certain instances, might help in deciding that a certain graph cannot be s-labeled, given s). We now show that $C_n$ for $n > 5$ is not in $GR^2$. Suppose $C_n \in GR^2$; then by lemma 5.3 there exist graphs $G', G'' \in GR^1$ such that $C_n = G' \cap G''$.
Since G' and G'' are both connected, they must contain vertices v' and v'', respectively, of degrees n-1 each (by lemma 5.1). If these two vertices are not adjacent on $C_n$ then the edge connecting them is in both G' and G'', hence it is in $G' \cap G''$, a contradiction. If these two vertices are adjacent on $C_n$ then we get the configuration shown in figure 5.14.

Figure 5.14: $C_n$ cannot be 2-labeled

We now prove that the edge (u',u'') must be in both G' and G''. In G' we have edges $e_1$ and $e_3$ that form a forbidden subgraph. Hence, according to theorem 5.1, G' must contain the edge (u',u''). Similarly, the edges $e_2$ and $e_4$ form a forbidden subgraph in G'', therefore (u',u'') must also be in G''. Therefore (u',u'') must be in $G' \cap G''$, a contradiction. In both cases we have a contradiction, and this completes the proof. □

In contrast to the subgraph property of $GR^1$ (theorem 5.1), a similar subgraph property for $GR^s$, based on a finite number of forbidden graphs, does not exist: following the lemma just proved, for the class $GR^2$ we have infinitely many pairwise non-isomorphic forbidden graphs, namely all the cycles of length at least five (note that any proper induced subgraph of $C_n$ is in $GR^2$).

We conclude with further examples. By direct inspection it can be verified that
1. all graphs with $\le 3$ vertices are in $GR^1$,
2. all graphs with $\le 4$ vertices are in $GR^2$,
3. all graphs with $\le 6$ vertices are in $GR^3$,
4. the only graphs with four vertices that are not in $GR^1$ are those in figure 5.2, and
5. the only graphs with $\le 6$ vertices that are not in $GR^2$ are those in figure 5.15.
More enumeration results are found in the following table:

Number of    Total number    Graphs       Graphs       Graphs       Graphs in
vertices     of graphs       in $GR^1$    in $GR^2$    in $GR^3$    $GR^s$, $s \ge 4$
1            1               1            1            1            1
2            2               2            2            2            2
3            4               4            4            4            4
4            11              8            11           11           11
5            34              16           33           34           34
6            156             32           133          156          156

Figure 5.15: All graphs with $\le 6$ vertices not in $GR^2$

CHAPTER 6

EDGE LABELINGS FOR TREES

6.1 INTRODUCTION

Given n+1 cities that are directly connected by a tree-like communication network, and n communication lines of different values of a certain property, we want to assign these communication lines ('weights') to the direct connections ('edges') of the given network, optimizing certain objective functions. In case this property represents capacity or reliability, we would like to maximize our objective functions, and in case it represents loss or vulnerability we would like to minimize them.

Our objective functions are of various types. For example, the function SUMDIS measures the average communication cost of the network (see [Hu, 1974] and [Lenstra, Rinnooy Kan and Johnson, 1976]). The diameter measures the maximal communication cost in the network, and the radius measures the maximal communication cost from a directory optimally located in the network.

Let c(e) and v(e) denote the capacity and vulnerability of a communication line assigned to the edge e, respectively. Define the capacity and vulnerability between the vertices i and j as $c(i,j) = \min \{c(e)\}$ and $v(i,j) = \max \{v(e)\}$, where e is on the path connecting i and j. Then the functions SUMMIN and SUMMAX measure the average capacity and vulnerability of the network.

In section 6.2 we present several NP-complete problems, concerning optimizing the radius and the diameter of a tree and of a binary weighted (0,1 weights) general graph. Polynomial-time algorithms for special cases of minimizing the radius of a tree are shown in section 6.3.
In section 6.4 we present polynomial-time algorithms for optimizing average measurement functions (e.g., SUMMAX). Open problems are found in section 6.5. 6.2 PRELIMINARIES Let T»(V,E) be a tree with vertices V={1 ,2,. . . ,n+l } and edges E={e-|, e^,..., e }. A tree is considered to be unrooted, unless otherwise stated A vertex of degree one is a leaf , and an edge incident with a leaf is a terminal edge . P denotes a path with n vertices. W={w, , w«,. . . , w } is a set of weights such that < w.„ *w, < w < ••• w M a w m , x/ . A labeling of the edges a — mm 1—2— — n max -*■ 3 of T with weights from W is a bijection f: E - W. In this chapter we study label ings that optimize certain objective functions. Let p(i,j) denote the (unique) path connecting vertices i and j in T. Given a labeling f of T, the distance d^(i,j) between i and j is d f (i,j) = Z f(e) . eep(i.j) The diameter D f (J) of the tree T, labeled with f, is D f (l) = max {d f (i ,j)} . i and m1n(1,j) = min {f(e)> , eep(i.j) eep(i ,j) respectively. I p ( i » j )| is the number of edges in p(i,j). is the average weight on p(i,j). The quantities which we optimize, for a given tree T, over all possible label ings f, are the following: 1. D , 2. R , 3. SUMDIS{J) = e d(i,j) , i0, to determine whether there exists a labeling f with diameter <_ k. PROBLEM 2 : Like PROBLEM 1, for radius < k. This result holds also for rooted trees. PROBLEMS 1 and 2 remain NP-complete when the weights are bounded PROBLEM 3 : Like PROBLEM 1, for radius > k. Maximizing the diameter of a tree is trivial. PROBLEMS 1 , 2 and 3 are NP-complete also for the corresponding vertex labeling problems. PROBLEM 4 : Given a connected graph G with m edges and a set of 0,1 weights W, |W|=m, to determine whether there exists a labeling with radius <_ 1. PROBLEM 5 : Given a connected graph G with n+1 vertices and a set of 0,1 weights W, |W|=n, to determine whether G contains a spanning tree which can be labeled such that its diameter < 2. 
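The quantities introduced so far can be evaluated directly for a small labeled tree. The following is a minimal sketch, not the thesis's own procedure: the helper names `tree_paths` and `measures` are hypothetical, the radius is assumed to be the smallest, over all vertices, of the largest distance from that vertex (the "optimally located directory" of the introduction), and SUMMIN and SUMMAX are assumed to sum min(i,j) and max(i,j) over all vertex pairs.

```python
from itertools import combinations


def tree_paths(vertices, f):
    """Map each ordered vertex pair (i, j) to the list of edges on p(i, j).

    f maps each tree edge (a frozenset of two vertices) to its weight."""
    adj = {v: [] for v in vertices}
    for e in f:
        a, b = tuple(e)
        adj[a].append(b)
        adj[b].append(a)
    paths = {}
    for i in vertices:
        stack = [(i, None, [])]
        while stack:
            v, parent, edges = stack.pop()
            paths[i, v] = edges
            for u in adj[v]:
                if u != parent:
                    stack.append((u, v, edges + [frozenset((v, u))]))
    return paths


def measures(vertices, f):
    """Diameter, radius, SUMDIS, SUMMIN and SUMMAX of the tree labeled by f."""
    paths = tree_paths(vertices, f)
    d = {pair: sum(f[e] for e in p) for pair, p in paths.items()}
    # Largest distance from each vertex (assumed basis for diameter and radius).
    ecc = {i: max(d[i, j] for j in vertices) for i in vertices}
    pairs = list(combinations(vertices, 2))
    return {
        "diameter": max(ecc.values()),
        "radius": min(ecc.values()),
        "SUMDIS": sum(d[p] for p in pairs),
        "SUMMIN": sum(min(f[e] for e in paths[p]) for p in pairs),
        "SUMMAX": sum(max(f[e] for e in paths[p]) for p in pairs),
    }


# Example: the path 1-2-3-4 labeled with the weights 1, 2, 3 in order.
vertices = [1, 2, 3, 4]
f = {frozenset((1, 2)): 1, frozenset((2, 3)): 2, frozenset((3, 4)): 3}
result = measures(vertices, f)
print(result)
```

The sketch recomputes every path explicitly, which is quadratic in the number of vertices; it is meant only to make the definitions concrete, not to be efficient.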
The reductions are from the PARTITION problem and the MAXIMUM TERMINAL SPANNING TREE problem, both known to be NP-complete (see [Karp, 1972] and [Garey and Johnson, 1979]):

PARTITION: Given positive integers $a_i$, $i=1,2,\ldots,n$, to determine whether there exists a set $I \subseteq \{1,2,\ldots,n\}$ such that $\sum_{i \in I} a_i = \sum_{i \notin I} a_i$.

MAXIMUM TERMINAL SPANNING TREE: Given a graph G and an integer $k>0$, to determine whether G has a spanning tree with at least k terminals (= leaves).

Proof for PROBLEM 1: We show that PARTITION is reduced to PROBLEM 1. Given $a_i$, $i=1,2,\ldots,n$ (an instance of PARTITION), we define the following instance of PROBLEM 1:

T is the tree consisting of an edge AB and two paths, BC and BD, of n edges each,
$W = \{a_1, a_2, \ldots, a_n, 0, 0, \ldots, 0, \sum a_i\}$ (with n 0's), and
$k = \frac{3}{2} \sum a_i$.

We prove that there exists a solution to the PARTITION problem iff there exists a labeling of T with diameter $\le k$.

Suppose there exists a solution to the PARTITION problem, namely $\sum_{i \in I} a_i = \sum_{i \notin I} a_i$ for some $I \subseteq \{1,2,\ldots,n\}$. The following labeling has diameter k: label AB with $\sum a_i$, spread the $a_i$'s for $i \in I$ and $i \notin I$ on BC and BD, respectively, and label the rest of the edges with 0's.

Suppose there exists a labeling with diameter $\le k$. One can find such a labeling in which $\sum a_i$ labels the edge AB (if in the given labeling AB is labeled with $a_t$, for some t, and $\sum a_i$ labels another edge e, then by interchanging $\sum a_i$ and $a_t$ the diameter is not increased). This means that the $a_i$'s on BC and on BD must sum up to $\frac{1}{2} \sum a_i$ each, hence we have a solution to the given PARTITION instance. □

Proof for PROBLEM 2: We prove that PARTITION is reduced to PROBLEM 2. Given $a_i$ as above, we define the following instance of PROBLEM 2:

$T = P_{2n+1}$,
$W = \{a_1, a_2, \ldots, a_n, 0, 0, \ldots, 0\}$ (with n 0's), and
$k = \frac{1}{2} \sum a_i$.

The rest follows immediately. □

The same reduction holds in the case when this path is rooted in its center, which proves that PROBLEM 2 is NP-complete also for rooted trees.
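The reduction for PROBLEM 2 can be checked by brute force on small instances. The following is a minimal sketch, not part of the original proof; the function names are hypothetical, and the radius of a labeled path is assumed to be the smallest, over all vertices, of the larger of the two distances to the endpoints.

```python
from itertools import permutations


def path_radius(labels):
    """Radius of a path whose edges carry `labels` in order."""
    total = sum(labels)
    prefix, best = 0, total  # `best` starts at the value for the left endpoint
    for w in labels:
        prefix += w
        best = min(best, max(prefix, total - prefix))
    return best


def has_partition(a):
    """Decide PARTITION by enumerating all subset sums (small inputs only)."""
    sums = {0}
    for x in a:
        sums |= {s + x for s in sums}
    total = sum(a)
    return total % 2 == 0 and total // 2 in sums


def min_radius_bruteforce(a):
    """Smallest radius over all labelings of P_{2n+1} with W = a plus n zeros."""
    weights = list(a) + [0] * len(a)
    return min(path_radius(p) for p in set(permutations(weights)))


for a in ([1, 2, 3], [1, 1, 3], [2, 3, 4]):
    r = min_radius_bruteforce(a)
    # The reduction claims: a partition exists iff some labeling has radius <= k,
    # where k is half the sum of the a_i (compared here as 2*r <= sum).
    print(a, has_partition(a), 2 * r <= sum(a))
```

Both procedures are exponential and serve only to illustrate the equivalence claimed by the reduction.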
As was pointed by [Johnson, 1978], PROBLEM 1 and PROBLEM 2 remain NP-complete when the weights are bounded, and this is shown by a similar reduction from 3-PARTITION (see [Garey and Johnson, 1979]). Proof for PROBLEM 3 : We prove that PARTITION is reduced to PROBLEM 3. Given a. (as above), we define the following instance of PROBLEM 3: T=P n + 2 • W = {a-. , a«. ...» a , x} where x>max {a.. } , and 1 1 k = x + p *sa. . If there exists a solution to PARTITION then the labeling shown in figure 6.1 has a radius R = k. Ill / a.'s for iel *-x a. 's for i^I Proof for PROBLEM 3 Figure 6. 1 If there exists a labeling with radius #>_ x + j Za ,- » then ^ et D De a center and DB a radius (see figure 6.2). Therefore d(D,B) >_ x + j za. , and hence d(A,D) <. ^ za. . A •- — • • • Proof for o y • •— PROBLEM C -♦ 3 B (continue) 1 r igure 6 .2 Also d(D,B) <_d(A,C), otherwise D is not a center. Therefore x + \ za -j 1 d(D,B) <_ d(A,C) <_ y + d(A,D) < y + j Ea i , hence x0 (an instance of MAXIMUM TERMINAL SPANNING TREE), we define the following instance of PROBLEM 4: G is the same given graph and W contains n-k O's and all the rest l's. We prove that G has a spanning tree with at least k leaves iff G has a labeling with radius R<\ . Suppose G has a spanning tree with at least k leaves. This yields a labeling of G with radius R <1 , by labeling the internal edges (of the tree) with O's , and thus eyery internal vertex (of the tree) is a center for the graph G with radius 2?<1 . Suppose there exists a labeling of G with radius #<_ 1 . Let A be a center of the labeled graph, and apply a shortest-path algorithm from A to get a 'shortest-path spanning tree' with radius R<_ 1. The tree is labeled with at least k l's. In this tree there exists at most one edge labeled 1 on each path from A to any leaf, since the radius is < 1. Therefore the number of leaves in the spanning tree is at least as the number of l's in the tree, i.e. > k. 
□

Proof for PROBLEM 5: The reduction is similar to the one of PROBLEM 4. □

Conjecture: The following problem is also NP-complete: Given a connected graph and a set of 0,1 weights, to determine whether there exists a labeling with diameter $D \le 1$.

We have the following observations about this problem. Suppose the graph G is labeled with the given weights such that its diameter $D \le 1$. It is clear that $D=0$ iff G contains a spanning tree with all edges labeled 0. Thus the interesting case is when $D=1$. In this case we may assume that the edges labeled with 0 generate a forest in G, since if they generate any cycle we can replace one 0-edge with a 1-edge without increasing the diameter (putting this 0-edge in another arbitrary place not closing a cycle of 0's). Let $T_i = (V_i,E_i)$, $i=1,2,\ldots,j$, be the trees in this forest. A graph $G' = (V',E')$ is constructed from G by contracting (see [Harary, 1969]) all the vertices of each tree $T_i$, $i=1,2,\ldots,j$, into one vertex $u_i$. It can be shown that G is labeled with a diameter $D=1$ if and only if G' is a complete graph. This observation led us to this conjecture.

6.4 RADIUS: POLYNOMIAL RESULTS

In the previous section we proved that minimizing the diameter or the radius in a tree are NP-complete problems. Now we present polynomial-time algorithms for special cases of these problems.

PROBLEM 6: Given a tree T and a set of 0,1 weights W, to find a labeling with minimal radius.

The following algorithm solves this problem in O(n) running-time:

Algorithm MIN_RADIUS_I
1. while (number of leaves in T) $\le$ (number of 1's in W) do
   begin
      label all the terminal edges with 1 ;
      delete the terminal edges from T ;
      delete the used 1's from W ;
   end ;
2. Put all the remaining 1's on terminal edges; label all other edges with 0's.

Proof: Follows immediately. □

Algorithm MIN_RADIUS_I is applicable also for rooted trees.

PROBLEM 7: Like PROBLEM 6, for a set of a,b weights, a [...], where R is the radius of T'.
Then the path from r to x contains an edge labeled with b in T and labeled with a in T' . Let (y,z) be the first such edge from the root r. On the other hand there exists an edge (u,v) labeled with b in T' and labeled with a in T. By the property of T' all the edges in the subtree V are labeled with b. Let w be a vertex in T' with maximal distance d'(r,w) . d'(r,w) r + (a-b) . Hence the algorithm should have labeled the edge (u,v) with b before labeling the edge (y,z), a contradiction. Therefore the labeling of T is optimal. /-\ PROBLEM 8: Like PROBLEM 7, for a maximal radius. 117 In a rooted tree this is done by finding a longest path from the root, labeling its edges with as many b's as possible, and labeling the rest of the edges arbitrarily. In an unrooted tree this is done by finding a longest path, labeling it with ■ fx/2l b's on one end and |x/2j b's on the other (where x is the number of b's), and labeling the rest of the edges arbitrarily. Both algorithms are of 0(n) running-time. PROBLEM 9 : Given a tree T and a set of 0,1 weights W, to find a labeling with minimal diameter. This is done in 0(n) running-time algorithm, identical to that of PROBLEM 6. 6.5 AVERAGE MEASUREMENTS: POLYNOMIAL RESULTS In this section we present polynomial-time algorithms for optimization problems concerning the functions SUMDIS , SUMAVR, SUMMIN and SUMMAX. PROBLEM 10 : Given T and W, to find a labeling that maximizes or minimizes SUMDIS(T) = z d(i,j) . i< N < n! . The optimal labeling is unique up to permuting edges with equal connections. Assuming that all weights are distinct, the number N of optimal labelings satisfies 2 The bounds are achieved for the graphs P ., and K, . The connections can be calculated in 0(n) time by regarding T as rooted and traversing it in postorder. Therefore, and because of the sorting step involved, the algorithm for optimal labeling (for either maximizing or minimizing SUMDIS{1) ) has an O(nlogn) running-time. 
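The sorting-based algorithm for PROBLEM 10 can be sketched as follows. This is a hedged illustration, not the thesis's own code: it assumes that the connection of an edge is the number of vertex pairs whose connecting path uses that edge, so that SUMDIS is the sum over edges of weight times connection, and that an optimal labeling pairs the sorted weights with the sorted connections. The names `connections` and `optimal_labeling` are hypothetical.

```python
def connections(n_vertices, edges, root=0):
    """For each edge, count the vertex pairs whose path uses it (assumed
    meaning of 'connection'). Vertices are 0 .. n_vertices-1."""
    adj = {v: [] for v in range(n_vertices)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    parent, order, stack = {root: None}, [], [root]
    while stack:  # iterative DFS; every vertex appears after its parent
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if u != parent[v]:
                parent[u] = v
                stack.append(u)
    size = {v: 1 for v in adj}
    conn = {}
    for v in reversed(order):  # children are handled before their parents
        if parent[v] is not None:
            size[parent[v]] += size[v]
            conn[frozenset((v, parent[v]))] = size[v] * (n_vertices - size[v])
    return conn


def optimal_labeling(n_vertices, edges, weights, maximize=True):
    """Pair sorted weights with sorted connections; return (labeling, SUMDIS)."""
    conn = connections(n_vertices, edges)
    by_conn = sorted(conn, key=conn.get, reverse=maximize)
    labeling = dict(zip(by_conn, sorted(weights, reverse=True)))
    return labeling, sum(conn[e] * w for e, w in labeling.items())


# Example tree on 5 vertices with the weights 1, 2, 3, 4.
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
weights = [1, 2, 3, 4]
max_lab, max_val = optimal_labeling(5, edges, weights, maximize=True)
min_lab, min_val = optimal_labeling(5, edges, weights, maximize=False)
print(max_val, min_val)
```

The subtree sizes are accumulated in one pass over the vertices, and the sort dominates the running time, in line with the O(n log n) bound stated above.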
PROBLEM 11 : Given T and W, to find a labeling that maximizes or minimizes SUMAVR(J) = E i /I'M i • i_ and Ex. = n-k. If the labeling f is optimal, then l x i " x i I 1 1 f° r every i and j. Proof : For a labeling function f as described in (4), Let N^(a) (N^(b)) denote the number of paths p(i,j) in which the largest label is a (b). It is clear that by minimizing N,.(a) over all possible labelings f we will also maximize SUMMAX(T) over all possible labelings f, since a x^l . (6) 123 We define a labeling f as follows: x n -l x,+l x 9 x t -| x. f : a ° ba ' ba ^b ••• ba z 'ba z (7) By (5) we have v-ra- r;i ra • <•• Using (5), (8) and (6) it can be shown that N f ,(a) < N f (a) , hence SUMMAX* . (T) > SUMMAX^ (T) , a contradiction. q Using this theorem it is clear how one should 'spread' the a's among the b's in order to get a labeling that maximizes the function SUMMAX. A path, k weights Here T=p n +i » and W contains m. copies of w, for i=l,2,...,k, where m, + nu + ••• + m. = n and w, > w« > ••• > w.> 0. The following dynamic programming approach, proposed by [Megiddo, 1978] , solves the 2k+l problem in 0(n ). Let T be a path of length zs. labeled with s. copies of w. for i=l,2,...,k, where the leftmost label is w a and p. is the 1 a i position of the first occurrence of w. (from the left). Denote by 4> ( s-j, s 2 , ... , s k , w a> p.j, p 2 , ... , p k ) the maximal value for SUMMAX(T) . Note that p =1. a If there exists b such that p. =2 then (s 1 ,...,s k ,w a ,p 1 ,...p k ) = max {<|>(sl ,. . . ,s a _ 1 ,s a >l ,s a+1 s k' w b' p l " * * ' p k^ + p a ♦ (a) } 124 where p\ = p.j-1 for i>a and i|;(a) is the contribution of the leftmost vertex to SUMMAX[T) , easily calculated using the p/s. If there exists no such b then w a labels the second edge too, and we compute by a ^(s^ ,s 2> ... ,s |< ,w a ,p 1 .... ,p k ) = is computed in 0(n ) points, where each calcu- 2k+l lation is at most 0(n), therefore the algorithm runs in time 0(n ). 
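The theorem for a path with two weights can be checked exhaustively on small paths. The following is a minimal sketch, not part of the original text; it assumes that SUMMAX sums the largest label over the paths between all vertex pairs and that the path carries k copies of b and n-k copies of a with a less than b. The function names are hypothetical.

```python
from itertools import combinations


def summax_path(labels):
    """SUMMAX of a path: sum, over vertex pairs, of the largest edge label
    between them. Vertices i and j (i before j) are joined by labels[i:j]."""
    n = len(labels)
    return sum(max(labels[i:j]) for i in range(n + 1) for j in range(i + 1, n + 1))


def a_blocks(labels, a):
    """Lengths x_0, ..., x_k of the runs of a's separated by the b's."""
    runs, run = [], 0
    for w in labels:
        if w == a:
            run += 1
        else:
            runs.append(run)
            run = 0
    runs.append(run)
    return runs


def best_arrangements(n, k, a, b):
    """All labelings of an n-edge path with k b's that maximize SUMMAX."""
    best, arg_best = None, []
    for pos in combinations(range(n), k):
        labels = [b if i in pos else a for i in range(n)]
        val = summax_path(labels)
        if best is None or val > best:
            best, arg_best = val, [labels]
        elif val == best:
            arg_best.append(labels)
    return best, arg_best


best, arg_best = best_arrangements(5, 2, a=1, b=2)
print(best, arg_best)
for labels in arg_best:
    x = a_blocks(labels, 1)
    print(x, max(x) - min(x))
```

In every maximizing arrangement found this way the block lengths differ by at most one, which is the 'spreading' property the theorem describes.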
6.6 OPEN PROBLEMS

We mention three open problems which arose while working on these labeling problems (the first two of these open problems are mentioned in the text):

1. Given a graph and a set of zero-one weights, to determine whether there exists a labeling with diameter $\le 1$.
2. Given a path $P_n$ and a set of weights W, to find a labeling that maximizes SUMMAX(T).
3. Given a tree T and a set W of a,b weights, to find a labeling that maximizes SUMMAX(T) (this is an interesting special case of the open problem of maximizing SUMMAX(T) for general T and W).

LIST OF REFERENCES

Aho, A. V., J. E. Hopcroft and J. D. Ullman [1974], The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, Mass.
Bergman, G. M. [1978], Terms and cyclic permutations, Algebra Universalis 8, 129-130.
Carlitz, L. [1969a], A note on the enumeration of line chromatic trees, J. of Combinatorial Theory 6, 99-101.
Carlitz, L. [1969b], Solution of certain recurrences, SIAM J. Applied Math., vol. 17, no. 2, 251-259.
Chorneyko, I. Z. and S. G. Mohanty [1975], On the enumeration of certain sets of planted plane trees, J. of Combinatorial Theory (B) 18, 209-221.
Chvátal, V. and P. L. Hammer [1977], Aggregation of inequalities in integer programming, Annals of Discrete Mathematics 1, 145-162.
Coffman, E. G., Jr. (ed.) [1976], Computer and Job-Shop Scheduling Theory, John Wiley and Sons, New York.
Colbourn, C. J. [1977], Graph generation, Res. Report CS-77-37, Dept. of Computer Science, Univ. of Waterloo (November).
Cook, S. A. [1971], The complexity of theorem-proving procedures, Proc. Third Annual ACM Symp. on Theory of Computing, New York.
De Bruijn, N. G. and B. J. M. Morselt [1967], A note on plane trees, J. of Combinatorial Theory 2, 27-34.
Dershowitz, N. and S. Zaks [1979], Enumerations of ordered trees, in preparation.
Dvoretzky, A. and Th. Motzkin [1947], A problem of arrangements, Duke Math. J. 14, 305-313.
Ecker, K. [1977], Organisation von Parallelen Prozessen.
Theorie Deterministischer Schedules, Zurich: Bibliographisches Institut.
Ecker, K. and S. Zaks [1977], On a graph labeling problem, Tech. Report no. 99, GMD, Bonn, W. Germany (December).
Erdős, P. and I. Kaplansky [1946], Sequences of plus and minus, Scripta Math. 12, 73-75.
Etherington, I. M. H. [1938], On non-associative combinations, Proc. of the Royal Society of Edinburgh, 153-162.
Gardner, M. [1976], Mathematical games: Catalan numbers, Scientific American, June, 120-122.
Garey, M. R. and D. S. Johnson [1979], Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Co., San Francisco.
Gnedenko, B. N. [1962], The Theory of Probability, Chelsea Publishing Co., New York.
Golumbic, M. C. [1976], Threshold graphs and synchronizing parallel processes, Colloquia Math. Societatis János Bolyai (18), Combinatorics, Keszthely, Hungary.
Golumbic, M. C. [1978], A note: Trivially perfect graphs, Discrete Math. 24, 105-107.
Gould, H. W. [1977], Bell and Catalan numbers: research bibliography of two special number sequences, West Virginia Univ.
Grossman, H. D. [1946], Fun with lattice points, Scripta Math. 12, 223-225.
Grossman, H. D. [1950], Fun with lattice points, Scripta Math. 16, 120-124.
Hall, M., Jr. [1967], Combinatorial Theory, Blaisdell Co., Waltham, Mass.
Harary, F. [1969], Graph Theory, Addison-Wesley, Reading, Mass.
Henderson, P. B. and Y. Zalcstein [1977], A graph-theoretic characterization of the PV-chunk class of synchronizing primitives, SIAM J. Computing 6, 88-108.
Hu, T. C. [1974], Optimal communication spanning trees, SIAM J. Computing 3, 188-195.
Johnson, D. S. [1974], Approximation algorithms for combinatorial problems, J. Computer System Sc. 9, 256-278.
Johnson, D. S. [1978], private communication.
Karp, R. M. [1972], Reducibility among combinatorial problems, in Complexity of Computer Computations, R. E. Miller and J. W. Thatcher (eds.), Plenum Press, New York, 85-104.
Klarner, D.
A. [1969], A correspondence between two sets of trees, Indag. Math. 31, 292-296.
Klarner, D. A. [1970], Correspondences between plane trees and binary sequences, J. of Combinatorial Theory 9, 401-411.
Knott, G. D. [1977], A numbering system for binary trees, Communications of the ACM, vol. 20, no. 2, 113-115.
Knuth, D. E. [1968], The Art of Computer Programming, vol. 1: Fundamental Algorithms, Addison-Wesley, Reading, Mass.
Knuth, D. E. [1973], The Art of Computer Programming, vol. 3: Sorting and Searching, Addison-Wesley, Reading, Mass.
Lenstra, J. K., A. H. G. Rinnooy Kan and D. S. Johnson [1976], The complexity of the network design problem, Mathematisch Centrum, Amsterdam.
Liu, C. L. [1968], Introduction to Combinatorial Mathematics, McGraw-Hill.
Liu, C. L. [1977], Elements of Discrete Mathematics, McGraw-Hill.
Lyness, R. C. [1941], Al Capone and the death ray, The Math. Gazette 25, 283-287.
Megiddo, N. [1978], private communication.
Mohanty, S. G. and T. V. Narayana [1961], Some properties of compositions and their application to probability and statistics I, Biometrische Zeitschrift 3, 252-258.
Motzkin, Th. [1948], Relations between hypersurface cross ratios, and a combinatorial formula for partitions of a polygon, for permanent preponderance, and for non-associative products, Bull. AMS 54, 352-360.
Narayana, T. V. [1959], A partial order and its application to probability, Sankhya 21, 91-98.
Paterson, M. S. [1978], private communication.
Raney, G. N. [1960], Functional composition patterns and power series reversion, Transactions of the AMS 94, 441-451.
Read, R. C. [1972], The coding of various kinds of unlabeled trees, in Graph Theory and Computing, R. C. Read (ed.), Academic Press, 153-182.
Richards, D. [1978], A note: On a theorem of Chorneyko and Mohanty, to appear in the J. of Combinatorial Theory.
Riordan, J. [1957], The number of labeled, colored and chromatic trees, Acta Math. 97, 211-225.
Riordan, J. [1969], Ballots and trees, J.
of Combinatorial Theory 6, 408-411.
Rotem, D. and Y. Varol [1977], Generation of binary trees from ballot sequences, Res. Report CS-77-29, Dept. of Computer Science, Univ. of Waterloo (October), to appear in the J. of the ACM.
Ruskey, F. and T. C. Hu [1977], Generating binary trees lexicographically, SIAM J. Computing, vol. 6, no. 4, 745-758.
Ruskey, F. [1978], Generating t-ary trees lexicographically, SIAM J. Computing, vol. 7, no. 4, 424-439.
Sands, A. D. [1978], Note: On generalized Catalan numbers, Discrete Math. 21, 219-221.
Silberger, D. M. [1969], Occurrences of the integer (2n-2)!/n!(n-1)!, Comment. Math. Prace Mat. 13, 92-96.
Singmaster, D. [1978], Classroom Notes: An elementary evaluation of the Catalan numbers, The American Math. Monthly, vol. 85, no. 5, 366-368.
Sloane, N. J. A. [1973], A Handbook of Integer Sequences, Academic Press, New York.
Takács, L. [1967], Combinatorial Methods in the Theory of Stochastic Processes, John Wiley and Sons, New York.
Trojanowski, A. E. [1977a], On the ordering, enumeration and ranking of k-ary trees, Tech. Report UIUCDCS-R-77-850, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign (February).
Trojanowski, A. E. [1977b], Ranking and listing algorithms for k-ary trees, SIAM J. Computing, to appear.
Whitworth, W. A. [1878], Arrangements of m things of one sort and m things of another sort under certain conditions of priority, Messenger of Math. 8, 105-114.
Yaglom, A. M. and I. M. Yaglom [1964], Challenging Mathematical Problems with Elementary Solutions, vol. 1: Combinatorial Analysis and Probability Theory, Holden-Day.
Zaks, S. [1977a], Generating binary trees lexicographically, Tech. Report UIUCDCS-R-77-888, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign (August); to appear in Theoretical Computer Science.
Zaks, S. [1977b], Generating k-ary trees lexicographically, Tech. Report UIUCDCS-R-77-901, Dept. of Computer Science, Univ.
of Illinois at Urbana-Champaign (November).
Zaks, S. and N. Dershowitz [1979], The cycle lemma and some applications, in preparation.
Zaks, S. and Y. Perl [1978], On the complexity of edge labelings for trees, Tech. Report UIUCDCS-R-78-948, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign (October).
Zaks, S. and D. Richards [1977], Generating trees and other combinatorial objects lexicographically, Tech. Report UIUCDCS-R-77-903, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign (November); SIAM J. Computing, vol. 8, no. 1, 73-81.

VITA

Shmuel Zaks was born in Haifa, Israel, on August 21, 1949. He received his B.S. (cum laude) and M.S. in Mathematics from the Technion, Israel, in 1971 and 1972, respectively, and his Ph.D. in Computer Science from the University of Illinois in 1979. Since August 1976 he has been a research assistant in the Department of Computer Science at the University of Illinois. During his graduate study at the University of Illinois he visited IBM Thomas J. Watson Research Center, Yorktown Heights, in summer 1978. He is a member of the Association for Computing Machinery.

BIBLIOGRAPHIC DATA SHEET

1. Report No.: UIUCDCS-R-79-975
4. Title and Subtitle: Studies in Graph Algorithms: Generation and Labeling Problems
5. Report Date: June 1979
7. Author(s): Shmuel Zaks
9. Performing Organization Name and Address: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
11. Contract/Grant No.: NSF MCS 77-22830
12. Sponsoring Organization Name and Address: National Science Foundation, Washington, D.C.
13. Type of Report & Period Covered: Ph.D. Thesis
16. Abstracts: It is the purpose of this thesis to study graph-theoretic problems that arise in certain practical situations.
These problems deal with classes of ordered trees, with classes of undirected graphs appearing in scheduling problems, and with algorithms that label free trees. First we study problems concerning ordered trees. We show how to generate trees in certain classes in order, how to determine the position of a given tree and how to find a tree given its position. We also show several enumeration results for the class of ordered trees with n edges. Next we investigate a class of undirected graphs that arise in certain scheduling problems. This class is fully characterized, and several of its extensions are studied. Last we study algorithms that label edges of a given tree with given labels, optimizing certain objective functions. For some of these functions we show polynomial-time algorithms, and for others the problem is shown to be NP-complete.

Key Words: Graph algorithms, Trees, Generation of Combinatorial Objects

19. Security Class (This Report): UNCLASSIFIED
20. Security Class (This Page): UNCLASSIFIED