UIUCDCS-R-77-901
UILU-ENG 77 1758

GENERATING K-ARY TREES LEXICOGRAPHICALLY

by Shmuel Zaks

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

November, 1977

This work was supported in part by the National Science Foundation under grant number NSF MCS-73-03408.

TABLE OF CONTENTS
1. Introduction
2. Definitions and Notations
3. Existing Algorithms and Results; The Generating Algorithm
4. The Ranking Function and the Unranking Procedure
5. Summary

I. INTRODUCTION

We show a 1-1 correspondence between all regular k-ary trees with n internal nodes, all 0,1-sequences $x = \{x_i\}_{i=1}^{kn}$ with n 1's and (k-1)n 0's in which k-1 times the number of 1's in any prefix $\{x_i\}_{i=1}^{\ell}$ ($1 \le \ell \le kn$) is at least the number of 0's in that prefix, and all integer sequences $z = \{z_i\}_{i=1}^{n}$ in which $0 < z_1 < z_2 < \cdots < z_n$ and $z_i \le ki - (k-1)$ for i = 1, 2, ..., n. Working with the reverses of these sequences, we deal with the following three steps:

1. Generating: the sequences are generated one-by-one in lexicographic order.
2. Ranking: we show how to compute the function that, given a sequence, determines its position in that lexicographic order.
3. Unranking: a procedure is discussed that, given a position in that lexicographic order, constructs the sequence which occupies this position.

We also discuss relations to existing related algorithms.

II. DEFINITIONS AND NOTATIONS

2.1 We state here the definitions and notations that we shall follow throughout this paper. We follow [KN] and [LI] for our basic terms. A tree means an ordered tree. The terms root, son, internal node and leaf are used as defined in the literature. A k-ary tree is a tree in which every internal node has at most k sons.
A regular k-ary tree is a tree in which every internal node has exactly k sons. $|T|$ denotes the number of vertices in a tree T, and $T_i$ is the subtree rooted at the i-th son of the root of T. T(k,n) denotes the set of all the regular k-ary trees with n internal nodes. The number of elements in T(k,n) is known to be $\frac{1}{(k-1)n+1}\binom{kn}{n}$ (see [KN: p. 584]).

2.2 In order to generate trees "lexicographically", we first establish a 1-1 correspondence between them and certain integer sequences, and then we generate these sequences lexicographically. Given a tree $T \in T(k,n)$ we label each internal node with 1 and each leaf with 0. We then read these labels in preorder (root-left-right)*, and thus obtain a sequence of n 1's and (k-1)n + 1 0's. The last visited node is a leaf, and we omit the corresponding 0 to simplify matters. Denote the resulting sequence by $f(T) = x = \{x_i\}_{i=1}^{kn}$, and let F(T) = X be the reverse of f(T). Instead of working with the sequences x and X of length kn, one may prefer the sequence $g(T) = z = \{z_i\}_{i=1}^{n}$, in which $z_i$ is the position of the i-th 1 in f(T) = x, and its reverse G(T) = Z. Note that both z and Z are of length n. See figure 1 for an example. The 0,1-sequence x = f(T) is also discussed in [BM], [GA] and [KL].

* The sons are read from left to right.

f(T) = x = 11000001000100100000
F(T) = X = 00000100100010000011
g(T) = z = 1, 2, 8, 12, 15
G(T) = Z = 15, 12, 8, 2, 1

Figure 1: Sequences Corresponding to a k-ary Tree

We note that the reconstruction of a tree T given X = F(T) or x = f(T), and the transformations $x \leftrightarrow z$ and $X \leftrightarrow Z$, are all quite easy, and the details are left to the reader.

A sequence x is called feasible if there is a tree $T \in T(k,n)$ s.t. x = f(T). The same definition holds for X, z and Z. x(k,n), X(k,n), z(k,n) and Z(k,n) denote the classes of the feasible x, X, z and Z sequences, respectively, corresponding to the trees in T(k,n).

Let $a = \{a_i\}_{i=1}^{kn}$ be a sequence consisting of n 1's and (k-1)n 0's.
We say that a has the k-dominating property if in each prefix $\{a_i\}_{i=1}^{\ell}$, $1 \le \ell \le kn$, the number of 1's is at least $\lceil \ell/k \rceil$*; in other words, the number of 0's is at most (k-1) times the number of 1's in any such prefix. This property is closely related to the generalized box-office problem (see [DM], [ZA]). The case k = 2 is discussed in various references (see, for example, [YY: Problem 83], or [GN: p. 37]).

* $\lceil t \rceil$ denotes the smallest integer not smaller than t. $\lfloor t \rfloor$ denotes the largest integer not larger than t.

2.3 In this paper we work with the reverse sequences, for which we obtain the following:

1. Generating: we generate Z(k,n) lexicographically (this corresponds to a lexicographic generation of X(k,n); see Theorem 2).
2. Ranking: we show how to compute the function index(X) that assigns to a sequence $X \in X(k,n)$ its position in the lexicographic ordering of X(k,n).
3. Unranking: given an integer t, we construct the sequence $X \in X(k,n)$ s.t. index(X) = t.

In a previous work [ZA], we proved that generating x(k,n) lexicographically corresponds to generating the trees in T(k,n) according to the following ordering (see also Theorem 2):

Definition 1: Given $T, T' \in T(k,n)$, we say that T < T' if
1. T is empty, or
2. T is not empty, and for some i, $1 \le i \le k$, we have
   a) $T_j = T'_j$ for j = 1, 2, ..., i-1, and
   b) $T_i < T'_i$.

[...]

[...] $Z_1 > Z_2 > \cdots > Z_n > 0$ and $Z_{n-i+1} \le ki - (k-1)$ for i = 1, 2, ..., n.

Following the results proved in [ZA] for the x and z sequences, and using the definitions of the X and Z sequences, we get:

Theorem 2: Let $T, T' \in T(k,n)$, and let x, X, z, Z and x', X', z', Z' be the corresponding sequences associated with them. Then
1. T < T' according to Definition 1 $\iff$ x < x' $\iff$ z > z'.
2. T < T' according to Definition 2 $\iff$ X < X' $\iff$ Z < Z'.

3.3 In [ZA], we proved that a 0,1-sequence $x = \{x_i\}_{i=1}^{kn}$ is feasible iff erasing any $1\,0^{k-1}$ pattern*, as long as possible, results in the empty sequence. A graphical interpretation of this reduction is discussed there. Modifying this to the X sequences yields the following:

* $a^m$ means a string consisting of m consecutive a's.

Theorem 3: A 0,1-sequence $X = \{X_i\}_{i=1}^{kn}$ is feasible iff erasing any $0^{k-1}1$ pattern, as long as possible, results in the empty sequence.

It is Theorem 3 that enables us to compute the ranking function for X(k,n) (see Theorem 5 in the next section), while the result quoted above from [ZA] enabled us to compute it for x(k,n) only for k = 2.

3.4 In [ZA], we generated the feasible z sequences lexicographically. A modification of this algorithm will generate the Z sequences, as characterized by Theorem 1:

Algorithm GENERATE (Generating the sequences $Z \in Z(k,n)$ lexicographically):
1. Begin with $Z = \{Z_i\}_{i=1}^{n} = \{n, n-1, \ldots, 1\}$.
2. Find the largest $\ell$ s.t. $Z_\ell < k(n-\ell+1) - (k-1)$. If $\ell = 1$ then goto 3. If $\ell > 1$ then: if $Z_\ell < Z_{\ell-1} - 1$ then goto 3, else go on looking for such an $\ell$. If no such $\ell$ exists, goto 5.
3. The sequence $Z' = \{Z'_i\}$ next to Z is built as follows:
   $Z'_i \leftarrow Z_i$ for $i < \ell$,
   $Z'_\ell \leftarrow Z_\ell + 1$,
   $Z'_i \leftarrow n - i + 1$ for $i > \ell$.
4. Let Z' be called Z. Goto 2.
5. End.

Example: In figure 2 we list the sequences x, X, z and Z corresponding to the ternary trees with 3 internal nodes (T(3,3)). The table is arranged according to the lexicographic ordering of the x sequences (= anti-lexicographic ordering of the z sequences) on the left, and the lexicographic ordering of the X sequences (= lexicographic ordering of the Z sequences) on the right.

  z        x           index   X           Z
  1,4,7    100100100     1     000000111   3,2,1
  1,4,6    100101000     2     000001011   4,2,1
  1,4,5    100110000     3     000001101   4,3,1
  1,3,7    101000100     4     000010011   5,2,1
  1,3,6    101001000     5     000010101   5,3,1
  1,3,5    101010000     6     000011001   5,4,1
  1,3,4    101100000     7     000100011   6,2,1
  1,2,7    110000100     8     000100101   6,3,1
  1,2,6    110001000     9     000101001   6,4,1
  1,2,5    110010000    10     001000011   7,2,1
  1,2,4    110100000    11     001000101   7,3,1
  1,2,3    111000000    12     001001001   7,4,1

Figure 2: The Sequences Corresponding to T(3,3)

IV. THE RANKING FUNCTION AND THE UNRANKING PROCEDURE

4.1 In this section, we find the position Index(T) of a given tree $T \in T(k,n)$.
For this purpose, we take X = F(T), and find the position index(X) of X in the lexicographic ordering of X(k,n). The modification to Z(k,n) is left to the reader. First we observe the following:

Theorem 4: There is a 1-1 correspondence between all the sequences in X(k,n) which begin with $0^{\ell+k-1}1$, and those in X(k,n-1) which begin with $0^{\ell}$.

Proof: Suppose there are u sequences in X(k,n) beginning with $0^{\ell+k-1}1$. Call them $A_1 = 0^{\ell+k-1}1Y_1$, $A_2 = 0^{\ell+k-1}1Y_2$, ..., $A_u = 0^{\ell+k-1}1Y_u$. By omitting the first occurrence of $0^{k-1}1$ from each of the $A_i$'s we get $B_1 = 0^{\ell}Y_1$, $B_2 = 0^{\ell}Y_2$, ..., $B_u = 0^{\ell}Y_u$, each of which is a sequence in X(k,n-1) by Theorem 3. It is clear that $B_i \ne B_j$ for $i \ne j$, and that all the sequences in X(k,n-1) beginning with $0^{\ell}$ are matched in this way (by Theorem 3); the theorem has thus been proved. □

Note: It is clear that the correspondence mentioned above preserves lexicographic order, i.e. $A_i < A_j$ iff $B_i < B_j$. This fact will be used in the next theorem, which is used recursively to compute the ranking function, as follows:

Theorem 5: Let $X = 0^{\ell}1Y$ belong to X(k,n)*. Then

\[
\mathrm{index}(X)^{**} =
\begin{cases}
1 & \text{if } X = 0^a 1^b \text{ for some } a \text{ and } b, \\
a((k-1)n - (\ell+1),\, n,\, k) + \mathrm{index}(0^{\ell-k+1}Y) & \text{otherwise,}
\end{cases}
\]

where $a((k-1)n - (\ell+1), n, k)$ is the number of sequences in X(k,n) which begin with $0^{\ell+1}$.

Proof: Immediate, by the nature of lexicographic order and Theorem 4. □

* By Theorem 4, we have $\ell \ge k - 1$.
** We use here the same name for the functions index(X) corresponding to a given n and k. index(X) gives the position of X in X(k,n), while index($0^{\ell-k+1}Y$) gives the position of the feasible sequence $0^{\ell-k+1}Y$ in X(k,n-1).

4.2 The following discussion concerns the evaluation of the numbers a(i,j,k) as defined in Theorem 5.
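For small k and n these numbers can be obtained directly from that definition, which is useful as a cross-check. The sketch below (Python; the function names are illustrative and not from the report) tests feasibility with the erasure rule of Theorem 3 and counts the feasible sequences that begin with a given run of 0's. It enumerates every placement of the n 1's, so it is only meant for tiny cases.

```python
from itertools import combinations


def is_feasible_X(X, k):
    """Theorem 3: erase 0^(k-1) 1 patterns for as long as possible."""
    pattern = "0" * (k - 1) + "1"
    while pattern in X:
        X = X.replace(pattern, "", 1)      # erase one occurrence at a time
    return X == ""


def feasible_X(k, n):
    """All sequences of X(k, n), sorted lexicographically (brute force)."""
    length = k * n
    found = []
    for ones in combinations(range(length), n):
        bits = ["0"] * length
        for position in ones:
            bits[position] = "1"
        X = "".join(bits)
        if is_feasible_X(X, k):
            found.append(X)
    return sorted(found)


def count_with_prefix(k, n, zeros):
    """Number of sequences in X(k, n) that begin with `zeros` consecutive 0's."""
    prefix = "0" * zeros
    return sum(X.startswith(prefix) for X in feasible_X(k, n))


if __name__ == "__main__":
    # a((k-1)n - (l+1), n, k) with k = n = 3 and l = 3 counts the prefix 0^4
    print(count_with_prefix(3, 3, 4))
```

With k = n = 3 the count for the prefix 0000 is 6, the value a(2,3,3) used in the example of section 4.3.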
We make use of the following geometric interpretation (see [YY: Problem 83]): a sequence $a = \{a_i\}_{i=1}^{kn}$, with n 1's and (k-1)n 0's, corresponds to a path from B((k-1)n, n) to A(0,0) in the rectangular lattice defined by B and A (see figure 3), where 1 means "one step down" and 0 means "one step to the left." This sequence is feasible iff this path never goes below the line AB (see Theorem 1), which is i = (k-1)j.

[Figure 3: the lattice between A(0,0) and B((k-1)n, n), with each lattice point (i,j) labeled by a(i,j,k), for k = 3 and $j \le 5$]

The following labeling of the lattice points gives the number of ways to go from a point (i,j) to A, without going below the line AB (see the example in figure 3):

\[
a(i,j,k) =
\begin{cases}
1 & i = 0, \\
0 & i > (k-1)j, \\
a(i,j-1,k) + a(i-1,j,k) & \text{otherwise.}
\end{cases}
\]

The solution to this recurrence relation is as follows:

Theorem 6: The solution to the recurrence relation

\[
a(i,j,k) =
\begin{cases}
1 & i = 0, \\
0 & i > (k-1)j, \\
a(i,j-1,k) + a(i-1,j,k) & \text{otherwise,}
\end{cases}
\]

where $i, j \ge 0$, is given by

\[
a(i,j,k) = \binom{i+j-1}{j-1} - \sum_{t=1}^{\lfloor (i-1)/(k-1) \rfloor} \binom{i+j-1-kt}{j-t} \frac{1}{(k-1)t+1} \binom{kt}{t}
\]

(where $\binom{r}{s}$, with s < 0, is taken to be 0).

Proof: We prove by induction on i and j. For i = 0 and any j, we have a(0,j,k) = 1. We assume the formula holds for i-1, and prove it for i, as follows:

Let i = (k-1)x + y, $1 \le y \le k-1$. For $j \le x$, we get $i = (k-1)x + y \ge (k-1)j + y > (k-1)j$, and in this case the formula must give us the boundary condition 0, and it really does: as i = (k-1)x + y and $1 \le y \le k-1$, we have $\lfloor (i-1)/(k-1) \rfloor = x \ge j$, so the summation covers every t with $1 \le t \le j$, and the terms with t > j vanish. We use the identity

\[
\sum_{k} \binom{r-tk}{k} \binom{s-t(n-k)}{n-k} \frac{r}{r-tk} = \binom{r+s-tn}{n}
\]

for integer n and $r \ne tk$, $0 \le k \le n$. Setting $k \leftarrow t$, $r \leftarrow 1$, $t \leftarrow -k$, $s \leftarrow i - (k-1)j - 1$ and $n \leftarrow j$, we get that our summation equals $\binom{i+j}{j}$ minus the term corresponding to t = 0, which is $\binom{i+j-1}{j}$. So the formula gives

\[
a(i,j,k) = \binom{i+j-1}{j-1} + \binom{i+j-1}{j} - \binom{i+j}{j} = 0,
\]

as desired.
Assuming it holds for j - 1, we continue as follows: if j > x, we have $(k-1)j \ge (k-1)(x+1) \ge (k-1)x + y = i$, i.e. $i \le (k-1)j$, and we show that a(i,j,k) satisfies a(i,j,k) = a(i,j-1,k) + a(i-1,j,k). By the induction hypothesis we have

\[
a(i,j-1,k) = \binom{i+j-2}{j-2} - \sum_{t=1}^{\lfloor (i-1)/(k-1) \rfloor} \binom{i+j-2-kt}{j-1-t} \frac{1}{(k-1)t+1} \binom{kt}{t}
\]

and

\[
a(i-1,j,k) = \binom{i+j-2}{j-1} - \sum_{t=1}^{\lfloor (i-2)/(k-1) \rfloor} \binom{i+j-2-kt}{j-t} \frac{1}{(k-1)t+1} \binom{kt}{t}.
\]

Therefore,

\[
a(i,j-1,k) + a(i-1,j,k) = \binom{i+j-1}{j-1} - \Bigl[\, \sum_{t=1}^{\lfloor (i-1)/(k-1) \rfloor} (\ldots) + \sum_{t=1}^{\lfloor (i-2)/(k-1) \rfloor} (\ldots) \Bigr].
\]

It remains to show that

\[
\sum_{t=1}^{\lfloor (i-1)/(k-1) \rfloor} \binom{i+j-2-kt}{j-1-t} \frac{1}{(k-1)t+1} \binom{kt}{t}
+ \sum_{t=1}^{\lfloor (i-2)/(k-1) \rfloor} \binom{i+j-2-kt}{j-t} \frac{1}{(k-1)t+1} \binom{kt}{t}
= \sum_{t=1}^{\lfloor (i-1)/(k-1) \rfloor} \binom{i+j-1-kt}{j-t} \frac{1}{(k-1)t+1} \binom{kt}{t}. \qquad (*)
\]

If y > 1 then $\lfloor (i-1)/(k-1) \rfloor = \lfloor (i-2)/(k-1) \rfloor = x$, and all the summations are $\sum_{t=1}^{x} (\ldots)$, hence (*) is correct term by term, by $\binom{r}{s-1} + \binom{r}{s} = \binom{r+1}{s}$. If y = 1, we have $\lfloor (i-1)/(k-1) \rfloor = x$ and $\lfloor (i-2)/(k-1) \rfloor = x - 1$; but the term that t = x would contribute to the second summation on the left side of (*) is $\binom{j-x-1}{j-x}$ times a constant, and since j > x this is 0, so that summation may be extended to t = x as well. The proof is thus completed. □

4.3 As for the unranking procedure, we find it in a "reverse" interpretation of Theorem 5. We are given an integer t, and look for a tree $T \in T(k,n)$ s.t. Index(T) = t. For this we find a sequence $X \in X(k,n)$ s.t. index(X) = t. The modification for the Z sequences is left to the reader.

Algorithm UNRANK (Finding the sequence $X \in X(k,n)$ s.t. index(X) = t for a given t):
1. $A \leftarrow t$, $j \leftarrow n$.
2. Find* $\ell_j$ s.t. $a(\ell_j, j, k) < A \le a(\ell_j + 1, j, k)$.
3. $A \leftarrow A - a(\ell_j, j, k)$, $j \leftarrow j - 1$. If A > 1 then goto 2.
   Comment: we have now $t = 1 + \sum_{m=j+1}^{n} a(\ell_m, m, k)$.
4. $X \leftarrow 0^{(k-1)j} 1^{j}$, $m \leftarrow j + 1$, $s \leftarrow (k-1)m - \ell_m$.
5. Change X as follows:
   $X_i$ unchanged for $i \le s - k$,
   $X_{s+1} X_{s+2} \cdots X_{km} \leftarrow X_{s-k+1} X_{s-k+2} \cdots X_{k(m-1)}$,
   $X_{s-k+1} X_{s-k+2} \cdots X_{s} \leftarrow 0^{k-1}1$.
6. $m \leftarrow m + 1$. If $m \le n$ then set $s \leftarrow (k-1)m - \ell_m$ and goto 5.
7. End.

* This step is performed by using a table look-up for the pre-computed values of the a(i,j,k)'s, or by computing them on-line. In both cases, we use the formula derived in Theorem 6. Note that $\ell_j$ is unique because a(i,j,k) is nondecreasing in i for $0 \le i \le (k-1)j$.
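The pieces of this section can be sketched together in Python. The block below is an illustration under stated assumptions, not the report's own program: `a` evaluates the recurrence directly, `a_closed` evaluates the closed form of Theorem 6 in the form $\binom{i+j-1}{j-1} - \sum_t \ldots$ (assuming $k \ge 2$ and $j \ge 1$), `rank` follows Theorem 5, and `unrank` inverts `rank` recursively by inserting $0^{k-1}1$ rather than following steps 4 to 6 of UNRANK literally. All function names are hypothetical, and t = 1 is handled as a separate case.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def a(i, j, k):
    """Paths from (i, j) to (0, 0) that never go below the line i = (k-1)j."""
    if i == 0:
        return 1
    if i > (k - 1) * j:
        return 0
    return a(i, j - 1, k) + a(i - 1, j, k)


def a_closed(i, j, k):
    """Closed form of Theorem 6; assumes k >= 2, i >= 0 and j >= 1."""
    def binom(r, s):
        return comb(r, s) if 0 <= s <= r else 0

    total = binom(i + j - 1, j - 1)
    for t in range(1, (i - 1) // (k - 1) + 1):
        trees = comb(k * t, t) // ((k - 1) * t + 1)   # number of trees in T(k, t)
        total -= binom(i + j - 1 - k * t, j - t) * trees
    return total


def rank(X, k):
    """index(X) of Theorem 5 for a feasible 0,1-string X."""
    n = len(X) // k
    l = X.index("1")                       # X = 0^l 1 Y
    if "0" not in X[l:]:                   # X = 0^a 1^b is the first sequence
        return 1
    Y = X[l + 1:]
    return a((k - 1) * n - (l + 1), n, k) + rank("0" * (l - k + 1) + Y, k)


def unrank(t, n, k):
    """The sequence X in X(k, n) with index(X) = t."""
    if not 1 <= t <= a((k - 1) * n, n, k):
        raise ValueError("t is out of range")
    if t == 1:
        return "0" * ((k - 1) * n) + "1" * n
    i = 0
    while a(i + 1, n, k) < t:              # a(i, n, k) < t <= a(i+1, n, k)
        i += 1
    l = (k - 1) * n - (i + 1)              # number of leading 0's of X
    rest = unrank(t - a(i, n, k), n - 1, k)   # this is 0^(l-k+1) Y
    cut = l - k + 1
    return rest[:cut] + "0" * (k - 1) + "1" + rest[cut:]


if __name__ == "__main__":
    print(rank("000100101", 3), unrank(8, 3, 3))
```

Memoizing `a` plays the role of the table look-up mentioned in the footnote; `a_closed` could be substituted when a table is not wanted.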
Example: For the sequence X = 000100101 (the 8th sequence in figure 2), we have, by our ranking function, the following:

index(X) = index(000100101)
         = a(2,3,3) + index(000101)
         = a(2,3,3) + a(0,2,3) + index(001)
         = a(2,3,3) + a(0,2,3) + 1 = 6 + 1 + 1 = 8.

As for the unranking procedure: we look for a tree $T \in T(3,3)$ s.t. Index(T) = 8. For this purpose we will find X s.t. index(X) = 8. Applying our unranking algorithm, we have after its step 3:

8 = a(2,3,3) + a(0,2,3) + 1.

By step 4 we first set X to 001. Step 5 changes X to 000101, by inserting 001 after the first 0, and then (after step 6 returns to it) to 000100101.

V. SUMMARY

5.1 After discussing the feasibility criteria for the integer sequences associated with the k-ary trees with n internal nodes, we introduced the algorithm GENERATE (in 3.4) which generates the sequences $Z \in Z(k,n)$ lexicographically. The recursive ranking function index(X) (computed by Theorem 5) finds the position of X in the lexicographic ordering of X(k,n), and is based on the reduction criterion (Theorem 4) and on the solution to the recurrence relation for a(i,j,k) (Theorem 6). The algorithm UNRANK (in 4.3) finds the sequence $X \in X(k,n)$ s.t. index(X) = t, and, as is always the case, is done by using the ranking procedure backwards.

5.2 There is a simple 1-1 correspondence between all regular k-ary trees with n internal vertices and all k-ary trees with n vertices (see [KN: p. 559] or [TR1: p. 3]), so our algorithms can be applied to these classes as well.

ACKNOWLEDGEMENT

I would like to thank Professor C. L. Liu for making valuable suggestions.

REFERENCES

[BM] N. G. de Bruijn and B. J. M. Morselt, "A Note on Plane Trees," Journal of Combinatorial Theory 2 (1967), 27-34.
[DM] A. Dvoretzky and T. Motzkin, "A Problem of Arrangements," Duke Math. Journal 14 (1947), 305-313.
[GA] M. Gardner, "Mathematical Games: Catalan Numbers," Scientific American, June 1976, 120-122.
[GN] B. N.
Gnedenko, The Theory of Probability, Chelsea Publishing Company, N.Y., 1962.
[KL] D. A. Klarner, "Correspondences Between Plane Trees and Binary Sequences," Journal of Combinatorial Theory 9 (1970), 401-411.
[KNO] Gary D. Knott, "A Numbering System for Binary Trees," Communications of the ACM 20, No. 2, February 1977.
[KN] Donald E. Knuth, The Art of Computer Programming, Vol. 1: Fundamental Algorithms, Addison-Wesley, Reading, MA, 1968.
[LI] C. L. Liu, Elements of Discrete Mathematics, McGraw-Hill, 1977.
[RH] F. Ruskey and T. C. Hu, "Generating Binary Trees Lexicographically," SIAM Journal on Computing, to appear.
[TR1] Anthony E. Trojanowski, "On the Ordering, Enumeration and Ranking of k-ary Trees," Tech. Report UIUCDCS-R-77-850, Department of Computer Science, University of Illinois at Urbana-Champaign, February, 1977.
[TR2] Anthony E. Trojanowski, "Ranking and Listing Algorithms for k-ary Trees," SIAM Journal on Computing, to appear.
[WH] W. A. Whitworth, "Arrangements of m Things of One Sort and n Things of Another Sort Under Certain Conditions of Priority," Messenger of Math. 8 (1878), 105-114.
[YY] A. M. Yaglom and I. M. Yaglom, Challenging Mathematical Problems with Elementary Solutions, Vol. 1: Combinatorial Analysis and Probability Theory, Holden-Day, 1964.
[ZA] S. Zaks, "Generating Binary Trees Lexicographically," Tech. Report UIUCDCS-R-77-888, Department of Computer Science, University of Illinois at Urbana-Champaign, August, 1977.
[ZR] S. Zaks and D. Richards, "Generating Trees and Other Combinatorial Objects Lexicographically," Tech. Report UIUCDCS-R-77-903, Department of Computer Science, University of Illinois at Urbana-Champaign, November, 1977.

BIBLIOGRAPHIC DATA SHEET

Report No.: UIUCDCS-R-77-901
Title: Generating k-ary Trees Lexicographically
Report Date: November 1977
Author: Shmuel Zaks
Performing Organization: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801
Contract/Grant No.: MCS-73-03408
Sponsoring Organization: National Science Foundation, Washington, D.C.
Abstract: We show a one-one correspondence between all the regular k-ary trees with n internal nodes and certain integer sequences. We then generate these sequences lexicographically, and discuss the ranking function and the unranking procedure. Relations to existing algorithms are discussed.
Key Words: k-ary tree, ordered tree, lexicographic order, ranking, unranking.
Security Class (This Report): UNCLASSIFIED
Security Class (This Page): UNCLASSIFIED