The person charging this material is re- 
 sponsible for its return to the library from 
 which it was withdrawn on or before the 
 Latest Date stamped below. 
 
 Theft, mutilation, and underlining of books 
 are reasons for disciplinary action and may 
 result in dismissal from the University. 
 
 UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN 
 
 i 
 
 DEC 16 in 
 
 L161 — O-1096 
 
Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/analysisofalgori401chas 
 

'J6r 
 
 Report No. 401 
 
 *f 
 
 \jt 
 
 ANALYSIS OF ALGORITHMS FOR FINDING 
 ALL SPANNING TREES OF A GRAPH 
 
 by 
 Stephen Martin Chase 
 
 October 19, 1970 
 
 DEPARTMENT OF COMPUTER SCIENCE 
 UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN 
 
 URBANA, ILLINOIS 
 
 The Library of the 
 
 J UN 978 
 
 at Urbana-Champaign 
 

Report No. 1<-01 
 
 ANALYSIS OF ALGORITHMS FOR FINDING 
 ALL SPANNING TREES OF A GRAPH* 
 
 by 
 Stephen Martin Chase 
 
 October 19, 1970 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 6l801 
 
 This work was supported in part by the following grants: US NSF GJ 217, 
 US NSF GJ 8l2, and project BUILD. The last stage of thesis rewriting was 
 supported by IBM. This work was submitted in partial fulfillment of the 
 requirements for the degree of Doctor of Philosophy in Computer Science, 
 October 1970. 
 
ANALYSIS OF ALGORITHMS FOR FINDING 
 ALL SPANNING TREES OF A GRAPH 
 
 Stephen Martin Chase, Ph.D. 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign, 1970 
 
 Relatively little attention has been paid to the problem of measuring the 
 efficiency of graph algorithms. The fact that the amount of work required by 
 most graph algorithms varies greatly and unpredictably with the structure of the 
 graph to which it is applied, makes this problem both practically important and 
 theoretically difficult. 
 
 Two major goals were set at the outset of this investigation: first, to 
 investigate and develop general approaches and specific techniques for analyzing 
 the efficiency of graph algorithms, and second, to test and illustrate some of 
 these approaches and techniques by using them for the analysis and comparison of 
 specific algorithms. 
 
 With respect to the first goal, empirical and analytical methods are dis- 
 cussed. The use of empirical methods is greatly facilitated by a Graph Algorithm 
 Software Package, GASP, which is an extension of PL/1 and has sets and graphs as 
 additional data types. 
 
 With respect to the second goal, the problem of finding all the spanning 
 trees of a graph was chosen. All published algorithms are analyzed and compared. 
 A new algorithm is described, analytically compared to the previous algorithms, 
 and found to be superior. For example, on the complete graph on n nodes, cost 
 (new algorithm) = cost (A) / (/z ), where A is the most efficient previous 
 algorithm. 
 
iii 
 ACKNOWLEDGMENT 
 
 The author wishes to thank Professor Jurg Nievergelt for his 
 extraordinary assistance, advice, and encouragement during the preparation 
 of this thesis . 
 
 The support of the author's graduate education "by the Department 
 of Computer Science, University of Illinois at Urb ana-Champaign, is 
 gratefully acknowledged. The thesis research was supported by the 
 following grants: US NSF GJ 217, US NSF GJ 812, and project BUILD. 
 The last stage of thesis rewriting was supported by IBM. 
 
 The author greatly appreciates the typing by Mrs. Joanne Bennett , 
 The efforts of the many people who aided the preparation of this thesis 
 are also appreciated. 
 
 Finally, the author wishes to dedicate this thesis to his wife, 
 Mary, and parents, Martin and Doris, for their sacrifices which made this 
 effort possible . 
 
TABLE OF CONTENTS 
 
 Page 
 ACKNOWLEDGMENT iii 
 
 1. INTRODUCTION. 
 
 1.1. Goals of This Investigation 1 
 
 1.2. Related Efforts 2 
 
 1.3- Notation 2 
 
 2. EMPIRICAL METHODS OF MEASURING EFFICIENCY OF COMPUTATION. . k 
 
 2.1. Advantages k 
 
 2.2. Disadvantages U 
 
 2.3. Lessening the Disadvantages by Using Better 
 
 Measuring Techniques 5 
 
 3. ANALYTICAL METHODS OF MEASURING EFFICIENCY OF COMPUTATION . 7 
 
 3«1. Advantages 7 
 
 3-2. Disadvantages 7 
 
 3 -3- Types of Analysis 8 
 
 h. THE PROBLEM OF FINDING ALL THE SPANNING TREES IN A GRAPH. . 11 
 
 k.l. The Problem: Its Variations and Applications. ... 11 
 
 k.2. The Algorithms: Their Common Features 12 
 
 5. DESCRIPTION OF THE ALGORITHMS Ik 
 
 5'1> Exhaustion lU 
 
 5-2. Determinants lU 
 
 5'3« Decomposition 15 
 
 5.^. Tree Transformations 15 
 
 5.5. Hamiltonian Paths 16 
 
 5.6. Introduction to Expansion Algorithms 16 
 
 5 -7 • Cancellation of Non-Trees 18 
 
 5.8. Circuit-Free Expansion 18 
 
 5«9« Connected Expansion 20 
 
 5.10. Factoring 20 
 
 5-11. More Factoring 23 
 
 5-12. Pruning 25 
 
 5.13. Variations 28 
 
TABLE OF CONTENTS 
 
 Page 
 
 6. ANALYTICAL MEASUREMENTS OF SELECTED ALGORITHMS 29 
 
 6.1. A Priori Bounds 29 
 
 6.2. Worst Case 30 
 
 6.3- Computation Trees 30 
 
 6.k. Direct Comparisons 32 
 
 6.5. Special Graphs: The Quotient Operator ~$k 
 
 6.6. Complete Graphs 37 
 
 7. CONCLUSIONS . kl 
 
 REFERENCES k-2 
 
 APPENDICES 
 
 VITA 
 
1. INTRODUCTION 
 
 1.1. Goals of This Investigation 
 
 Graph theory and Its applications have received much attention over the 
 past two decades. In particular, many algorithms have been proposed for the 
 solution of several graph problems which arise frequently In certain applications. 
 However, relatively little attention has been paid to the problem of measuring 
 the efficiency of these proposed algorithms. The fact that the amount of work 
 required by most graph algorithms varies greatly and unpredictably with the struc- 
 ture of the graph to which it is applied, makes this problem both practically im- 
 portant and theoretically difficult. 
 
 Two major goals were set at the outset of this investigation: first, to 
 investigate and develop general approaches and specific techniques for analyzing 
 the efficiency of graph algorithms, and second, to test and illustrate some of 
 these approaches and techniques by using them for the analysis and comparison of 
 specific algorithms. 
 
 With respect to the first goal, there are two major categories of methods 
 for measuring the efficiency of graph algorithms: empirical and analytical. 
 Empirical methods are discussed in chapter 2, analytical methods in chapter 3. 
 Empirical methods are greatly facilitated by a Graph Algorithm Software Package, 
 GASP, which is described in detail in appendix 1. 
 
 With respect to the second goal, one graph problem, that of finding all the 
 trees of a given graph, was chosen. It is discussed in chapter 4. There are many 
 published algorithms for this problem, and these are described in chapter 5. By 
 concentrating on these algorithms, useful techniques for efficiency analysis were 
 developed and tested. Such detailed study also led to the development of a new 
 algorithm for finding all the trees in a graph. This new algorithm is analytically 
 compared to the best among the known algorithms in chapter 6 and is found to be 
 superior to them. 
 
1.2, Related Ef forte 
 
 Previous effort! which share some of the goals of this investigation fall 
 mainly Into two categories: first, the analysis of specific algorithms, and 
 second, the development of general purpose graph aoftware. A brief review of 
 some of the most relevant papers follow. 
 
 Authors of spanning tree algorithms sometimes present data on the perfor- 
 mance of their algorithms ([Dawson 68], [Stehman 69]). This approach suffers 
 from the fact that one cannot deduce the efficiency of an algorithm from its 
 performance on isolated examples. A systematic comparison of seven algorithms 
 on 13 graphs was done by Fernandez la his thesis [Fernandez 69a], and is mentioned 
 in an abstract [Fernandez 69b]. 
 
 Notable among the analyses of other graph algorithms are Gotlieb and Cornell's 
 experiments with algorithms for finding a fundamental set of circuits ([Gotlieb 67], 
 see also [Paton 69]), Shirey's analysis of algorithms for testing the planarity of 
 graphs [Shirey 69], and Cornell and Gotlieb's analysis of an algorithm for testing 
 graph isomorphism [Cornell 70] . 
 
 The second category consists of papers which describe languages for graph 
 processing ([Friedman, 69], [Hart, 69], [Read ], [Wolfberg, 69]). These lan- 
 guages and GASP are similar in the sense that they all include graphs and sets as 
 data types, and they all are extensions of an existing base language (e.g., FORTRAN, 
 LISP). GASP is the only one which is an extension of PL/1, the richest widely 
 available programming language. 
 
 1.3. Notation 
 
 Throughout this thesis, "n" will stand for the number of nodes in a given 
 graph; "b M will stand for the number of branches; "t" will stand for the number 
 of spanning trees; M c" will stand for the time cost of an algorithm. 
 
 Upper bounds on c will be expressed as c « 0(f(n,b,t)), which means that 
 there exists a constant A such that c < A • f(n,b,t) for sufficiently large 
 n, b, and t. Similiarly, lower bounds will be expressed as f(n,b,t) ■ 0(c), 
 
which implies that there exists a constant A such that c £ A * f(n,b,t) for 
 sufficiently large n, b, and t. 
 
 The cardinality of a set s will be denoted |s|. The symmetric difference 
 (exclusive or) of sets si and s2 will be denoted si • s2. 
 
 Truth values, YES and NO, will be combined using "&", "or", and "-.". 
 
 A graph G consists of a set of words {v., v« v } and a set of 
 
 branches {e , ..., e, }. The set of branches incident to v will be denoted 
 B.. The degree of a node, "degree (v )", is equal to JB |. 
 
2. EMPIRICAL METHODS OF MEASURING EFFICIENCY OF COMPUTATION 
 
 There are several methods which could be used to measure the efficiency of 
 graph algorithms. These methods tend to fall into two categories, empirical and 
 analytical. Of course, some methods have both empirical and analytical features, 
 but the division into categories is still useful in order to understand general 
 principles. 
 
 Empirical methods consist of implementing the algorithm on a computer, 
 running several tests on it, measuring the cost, and drawing some form of con- 
 clusion from the observed data. This approach has several advantages and dis- 
 advantages. 
 
 2.1. Advantages 
 
 The first advantage is that empirical measures are often easier to obtain 
 than analytic measures. This is especially true if one needs the implemented 
 algorithm to solve problems. Obtaining data is relatively trivial. Interpreting 
 the data may be easy (e.g., if one only wants to compare algorithms qualitatively 
 to find out which algorithm is best), or may be very difficult (e.g., if one wants 
 a quantitative prediction of the cost on graphs which have not been tested). 
 
 A second advantage occurs after a graph algorithm has been implemented: 
 the programmer often sees ways to improve its efficiency (both on a programming 
 level and a graph theoretical level). Insights into measuring the efficiency 
 may occur as well. 
 
 Finally, empirical results produce numbers corresponding to actual run 
 times which may prove to be more useful than analytically derived formulas which 
 often yield only rates of growth. 
 
 2.2. Disadvantages 
 
 One major disadvantage of experimental testing of efficiency of graph 
 algorithms is that the run time of an implemented program depends on many factors 
 
/hich have little or nothing to do with the algorithm proper. These factors 
 Include the particular computer, language, and programmer; the implementation; 
 ind the method of representing graphs. A change in some of these factors could 
 reverse the experimental conclusion of the superiority of one algorithm over 
 mother. Similarly, once the machine on which they were obtained becomes ob- 
 solete, experimental results are likely to lose their value. 
 
 The other major disadvantage of experimental testing is that because of 
 :omputer time costs, only a small number of tests can be run. If the amount of 
 :omputation required by the algorithm is sensitive to the structure of the graph, 
 it becomes very difficult to accurately extend the results of tests on a small 
 number of graphs to the class of all graphs. 
 
 Similarly, if the algorithm requires computation time which increases rapidly 
 rfith increasingly large graphs, experimental measures will be limited to tests on 
 small graphs. For tree-finding programs, 15-node graphs may be too large [Dawson 
 68] . Costs of algorithms applied to small graphs usually will permit only very 
 poor extrapolations to the costs of larger graphs. 
 
 Many authors hide the inefficiency of their algorithms by illustrating them 
 on small graphs where they appear reasonable. When applied to slightly larger 
 graphs, the algorithms require considerably more computation. 
 
 2.3. Lessening the Disadvantages by Using Better Measuring Techniques 
 
 The two disadvantages mentioned in the previous section are due in varying 
 
 degrees to the use of data consisting of computer run times. In order to obtain 
 
 data dependent on properties of the algorithm rather than on the particular 
 
 i 
 
 computer system used, the following technique can be used. 
 
 Divide the given program into logical groups of operations which have the 
 
 i 
 
 property that during any test of the program, all the operations in a section 
 
 I 
 
 [will be executed the same number of times. Insert counters into the program, 
 
 jone to each logical section. Assign weights to each operation, and compute the 
 
 \ total weight of a section as the sum of the weights of all the operations in the 
 
section. Take the section counts from a computer test, multiply them by the 
 corresponding weights, sum over all sections, and you have the total cost of 
 that test graph. 
 
 This technique decreases the dependence on the particular computer system 
 used because one can arbitrarily assign weights to operations in a manner con- 
 sistent with any imaginable computer system. The cost then corresponds to the 
 run time on the imaginary computer, perhaps quite different from the real time 
 cost of the test run. Furthermore, many different costs (based on different 
 imaginary machines) can be computed at the cost of just one real test. To achieve 
 this, simply save the section counts and change the weight system. 
 
 This technique may increase slightly the size of test graphs which can be 
 directly measured. Once the program has been debugged, any code which does not 
 affect the flow of the program can be removed, reducing the real time required 
 for a test without changing the computed cost. 
 
 GASP is very useful when the above technique is applied. GASP allows 
 programmers to express graph and set operations in natural terms, without regard 
 to how these objects are represented. Similarly, the operations on these objects 
 are expressed independently of their implementation. Assigning a reasonable set 
 of weights to GASP operations is easy. 
 
 Because programs written in GASP are independent of the representations, 
 it is possible to run the same program with many different versions of GASP, 
 thereby obtaining experience with different representations. GASP is structured 
 so that small changes can be made in some GASP routines and data structures without 
 requiring changes in the routines which use them. 
 
 
3. ANALYTICAL METHODS OF MEASURING EFFICIENCY OF COMPUTATION 
 
 In contrast to empirical methods, analytical methods involve the mathematical 
 analysis of the computational structure of algorithms. This approach also has its 
 relative advantages and disadvantages. 
 
 3.1. Advantages 
 
 First, analytical results hold for arbitrarily large graphs, where experi- 
 mental results would have to be extrapolated. Thus analytical results give a 
 better indication of the true nature of the algorithm. 
 
 Second, analytical measures are usually performed on the algorithm proper 
 rather than a machine-dependent implementation of the algorithm. Thus the re- 
 sults will not become obsolete when implementations improve. 
 
 3.2. Disadvantages 
 
 The big disadvantage with the analytical approach is that many graph al- 
 gorithms are difficult to measure analytically, especially when the cost of the 
 algorithm varies greatly with the structure of the graph (and not just its size). 
 The goal of analytical methods is to express the cost in terms of a few easily 
 calculated parameters of the input graphs. For some algorithms, this goal is 
 unobtainable, and one must do at least one of the following: 
 
 1. Restrict the estimates and bounds to apply only to some 
 subset of the set of all graphs. 
 
 2. Introduce more complicated parameters. 
 
 3. Accept larger measurement errors. 
 
 Another possible disadvantage of analytical measures is that they are derived 
 for large graphs, so that small terms and details can be ignored. However, if 
 for some reason the algorithm is applied only to small graphs, the ignored 
 information may be more important than the derived formula. 
 
8 
 3-3» Types of Analysis 
 
 There are several techniques which can he used in making analytical 
 
 measures of efficiency. These techniques will he illustrated "by applying 
 
 them to an algorithm, A, of the following structure . 
 A: "Pick an arbitrary node X Q . 
 
 For all nodes X adjacent to X , do S . " 
 S is an operation whose cost is large hut constant, so that the total cost 
 
 of A is determined "by the number of executions of S. 
 
 Some of the techniques will he more significantly used (and therefore 
 
 illustrated) in chapter 6. 
 
 A standard technique for measuring an algorithm's efficiency is worst- 
 case analysis . If applied to algorithm A, the following analysis might take 
 place. "The bound variable X takes its values from the set of nodes of the 
 graph; therefore, n is a bound for the number of times S is executed. 
 
 Hence, c = 0(n) ." 
 
 This method is usually the easiest to apply, but usually the least 
 
 accurate. If an algorithm has c = 0(n ) with k very small (say 2 or 3), 
 
 then worst-case analysis may be accurate for some graphs. However, for 
 
 less efficient algorithms or for typical graphs, the errors can grow 
 
 rapidly and often become intolerable. 
 
 In order to get bounds which are tighter than those from worst-case 
 
 analysis, it is usually necessary to make assumptions. That is, the test 
 
 graphs are assumed to have certain properties. For example, assume all 
 
 nodes have the same degree, d (which could be either a constant or some 
 
 small function of n) . Algorithm A would be analyzed as follows: "X will 
 
 take on d values because that is exactly the number of nodes adjacent to 
 
 X Q . Therefore, c = 0(d)." 
 
 Assumptions should be chosen with care. Too many will make the 
 
 analysis easy, but the conclusions will be of limited use. Too few may 
 weaken the analysis so that only very loose bounds can be obtained. 
 
Particularly useful assumptions are those which specify the test graphs in 
 terms of one or more parameters. With such assumptions, analytic bounds can be 
 derived and expressed in terms of the parameters. 
 
 For many algorithms, a useful one-parameter family of test graphs is the 
 complete graph on n nodes. Complete graph analysis of Algorithm A would be 
 as follows: "X. is adjacent to all of the other n-1 nodes; therefore, 
 c = O(n-l)." 
 
 Other possible examples of parameterized classes of graphs include circuits 
 of n nodes, ladders of r rungs, star graphs of b branches, rectangular grids 
 of r rows and c columns, and others of even more parameters. 
 
 In addition to making the analysis easier, assumptions may be chosen in a 
 way that reflects the intended use of the algorithm. For example, if the applica- 
 tion is in electrical network theory, assumptions such as planarity or bounded 
 degree of nodes may reflect physical limitations of the hardware. 
 
 The main disadvantage of these techniques is that the assumptions restrict 
 the set of graphs for which the conclusions are valid. It is possible that the 
 conclusions will be false for most graphs. This disadvantage is lessened when 
 estimates which are derived on a small class of graphs can be used as bounds on a 
 larger class. For example, if the cost of an algorithm increases whenever a non- 
 parallel branch is added to the test graph, then the cost of that algorithm on 
 
 the complete graph on n nodes will be an upper bound of the cost on any graph 
 
 I 
 on n nodes . 
 
 When the task is to compare two or more algorithms and to determine which 
 I one is best, there are two approaches which can be used. The first approach is 
 
 ! to apply the previously discussed techniques on each algorithm individually, and 
 
 i 
 
 : then compare the derived estimates and bounds. The second approach is to analyze 
 | directly the computational aspects of the differences between the competing 
 ; algorithms. 
 
 To illustrate the second approach, suppose Algorithm B is obtained by 
 
 I 
 
10 
 
 modifying Algorithm A so that X Q is chosen to be a node of minimum degree. 
 Then the comparison analysis may be as follows: "If the computation required in 
 B to find a minimum degree X Q is negligible, then Algorithm B is better than 
 Algorithm A because S is executed fewer times." 
 
 One advantage of direct comparison is that the analysis is often easier, 
 thus fewer (if any) assumptions will be required. With fewer assumptions, the 
 conclusions will be valid for a larger set of graphs (perhaps all graphs). 
 
 Another advantage is that the inefficient parts of the algorithms are pin- 
 pointed. Such knowledge about the parts would be useful if it is possible to 
 recombine the parts into new algorithms, or if analogous parts appear in another 
 pair of algorithms. 
 
 A disadvantage of direct comparison is that numerical bounds for individual 
 algorithms are not automatically produced. A related disadvantage is that this 
 method cannot be used on an algorithm which has nothing in common with other 
 algorithms. 
 
11 
 
 4. THE PROBLEM OF FINDING ALL THE SPANNING TREES IN A GRAPH 
 
 4.1. The Problem: Its Variations and Applications 
 
 In order to make meaningful comparisons among graph algorithms, it is 
 useful to focus on a single graph problem. For this thesis, the chosen problem 
 is that of finding (i.e., listing exactly once) all the spanning trees of a 
 connected undirected graph. A [spanning] tree is a set of [n-1] branches which 
 are connected and contain no circuits. 
 
 There are several variations of the problem, including the following: 
 
 1. count the number of trees in any given graph [see section 5.2]; 
 
 2. find formulas for the number of trees in special graphs ([Bercovici 
 69], [Cayley 89], [Mullin 67], [Myers 65], [O'Neil 66b], [Riordan 
 60]); 
 
 3. find all spanning trees of a directed graph ([Chen 66b, 67], 
 [Paul 67]); 
 
 4. find all spanning trees common to two related graphs ([Ardon 69], 
 [Mayeda 66, 68], [Stehman 69]); 
 
 5. find, for a given (directed or undirected) graph, all k-trees (which 
 span k specified components), or all co-trees (complements of trees), 
 or all sets satisfying certain conditions ([Berger 68], [Chen 65, 66a, 
 69a, 69d], [Dunn 68], [Mayeda 57], [Paul 67]); 
 
 6. find two spanning trees with minimal intersection [Chase ]; 
 
 7. find all rooted ordered trees of the complete graph [Scions 68], 
 
 Only spanning trees on undirected graphs will be considered for the remainder 
 of this thesis, so the following conventions will be used. The term "tree" will 
 mean spanning tree, "graph" will mean undirected graph, and "finding all trees" 
 will mean listing all the trees of a given graph without duplications. Factoring 
 the trees into (unions of) cartesian products is allowed; the applications (see 
 below) can use answers in this form ([Bedrosian 62], [Chen 69a], [Dunn 68]). 
 
12 
 
 The primary application of finding all trees is in the analysis of linear 
 electrical networks ([Hakimi 66], [Stehman 69], [Weinberg 58]). A second appli- 
 cation is in the analysis of multilevel masers [Bedrosian 62]. Other potential 
 applications have been mentioned. 
 
 4.2. The Algorithms: Their Common Features 
 
 At least ten distinct algorithms for finding all trees have been proposed 
 in the vast literature on this subject. In addition to their large number, 
 these algorithms have other properties which make them highly desirable objects 
 for efficiency measurements. 
 
 One property of these algorithms is that the cost, c (as well as the 
 number of answers, t) grows exponentially with the size of the test graph. 
 Exponential algorithms are desirable as objects of efficiency measurements 
 (both empirical and analytical) because the large growth rates magnify the 
 differences between algorithms. Thus the inferiority of a bad algorithm will be 
 apparent even on small graphs. Competing exponential algorithms usually have 
 a variety of growth rates, allowing analytical measurements to determine the 
 most efficient algorithm, because only the growth rates of the costs of algorithms 
 
 are considered in analytical measurements. Examples of competing algorithms which 
 
 3 
 
 cannot be analytically contrasted because they share a common growth rate [n ] 
 
 are the better algorithms for testing the planarity of graphs [Shirey 69]. 
 
 Although different ideas for exponential algorithms can be contrasted by 
 analytical measurements, differences in graph representation and differences in 
 implementation efficiency do not show up. If an algorithm is more efficient in 
 
 a particular representation, it will always pay to convert the input graph into 
 
 2 
 that representation because the cost of conversion is 0(n ) which will be 
 
 small when added to an exponential term. Similarly, implementation improvements 
 
 can do no better than to reduce the cost by a constant factor, which will not 
 
 affect growth rates. 
 
 Another property of these algorithms is that analytical bounds are difficult 
 
13 
 
 to derive (this explains the complete lack of meaningful bounds by the many 
 authors of these algorithms). One of the reasons for this difficulty is that 
 the cost of these algorithms depends greatly on the parameter t, which cannot 
 be expressed in terms of n and b (except for a few special graphs). Fortun- 
 ately, the scarcity of individual bounds in terms of n and b does not rule 
 out comparison analysis. For example, any algorithm whose cost grows faster than 
 t will be inferior to any algorithm whose cost grows slower than t. 
 
 A property of these algorithms which aids direct comparison analysis is that 
 many of them can be arranged in a sequence in which the difference between one 
 algorithm and the next is small. This property aids both the description and the 
 analysis of the algorithms because only the differences need to be described and 
 analyzed. 
 
Ik 
 
 5. DESCRIPTION OF THE ALGORITHMS 
 
 This chapter briefly describes all known classes of algorithms for finding 
 all trees in a graph. Sections 5.1 through 5.4 are independent of each other; 
 section 5.5 describes a variation of the algorithm in section 5.4; section 5.6 
 introduces the remaining algorithms, each of which is described in terms of the 
 differences from the preceding algorithm. 
 
 5.1. Exhaustion 
 
 Exhaustion algorithms simply search through a large set of candidate 
 branch sets, testing each to see if it is a tree of the graph. 
 
 One algorithm ([Hale 61], [MacWilliams 58], [Mason 57], [Mayeda 57], 
 [Weinberg 58]) generates all sets of n-1 branches from the graph, and tests 
 each set to see if it is a tree. 
 
 Another algorithm ([Char 68], [Zobrist 64]) takes a previously computed 
 list of all the trees on the complete graph on n nodes, and tests each tree 
 to see if all of its branches belong to the input graph. 
 
 5.2. Determinants 
 
 The most efficient method to calculate t, the number of trees in a 
 graph, is to evaluate a determinant [Harary 59], Let M be a n-1 by 
 n-1 matrix with entries m.. defined as follows: m. . = degree (v.), and 
 (for i*j ) m. . = - (the number of branches connecting v. to v.). Then, 
 t = det (M) can be calculated by using any standard method of evaluating 
 determinants (e.g., Gaussian Elimination). 
 
 Determinant algorithms ([Trent 54], [Weinberg 58]) to find all trees 
 need to evaluate determinants symbolically, a complicated (and costly) process. 
 Some of the more efficient "determinant" algorithms ([Chang 68]), [Chen 68], 
 [Malik 67], [Nakagawa 58]) turn out to be different presentations of algorithms 
 to be described later (sections 5.4, 5.7, and 5.10). 
 
15 
 
 5.3. Decomposition 
 
 Many authors ([Berger 68], [Chen 69a, 69b], [Hakimi 64], [Jong 66], 
 [Kim 60], [Lee 63], [MacWilliams 58], [Mayeda 59], [Myers 67], [Row 61], 
 [Watanabe 61]) have suggested decomposition as a method to find all trees. 
 The basic idea is to divide the graph into two or more subgraphs, to find the 
 trees on these subgraphs, and then to combine these partial trees into trees 
 of the input graph. Unlike most decomposition algorithms for other graph 
 problems (e.g., planarity tests), the final step of combining the partial 
 answers is not trivial. 
 
 There are other difficulties in constructing decomposition algorithms. 
 Only a few algorithms ([Chen 69a, 69b], [Kim 60], [Mayeda 59], [Myers 67]) 
 avoid duplication and its penalty of checking each tree against the list of 
 trees. Some algorithms ([Chen 69a], [Lee 63], [MacWilliams 58], [Myers 67]) 
 can be applied only to special types of graphs. 
 
 Apparently only one of these algorithms [Chen 69b] is general, avoids 
 duplications, and overcomes some of the difficulties of combining partial trees. 
 Like most of the references to decomposition algorithms, significant details 
 are not specified, so no algorithm will be described (or analyzed) here. If 
 the details could be worked out, a decomposition algorithm might be competitive 
 with the best of the existing algorithms. 
 
 5.4. Tree Transformations 
 Several algorithms ([Chen 69c], [Fujisawa 59], [Hakimi 61, 66], [Kishi 
 
 69], [Malik 67], [Mayeda 65, 66, 68], [Stehman 69], [Watanabe 60], [Wing 63]) 
 are based on "elementary tree transformations". Tree Y.. is transformed by 
 adding any new branch a„ and removing any branch a., which lies in the path 
 connecting the endpoints of a ? . The new tree Y * Y- 9 {a..,a 9 }. 
 For any two trees, Y and Y, there is a sequence of trees 
 
 <Y 0* Y l» Y 2' ' "' Y k-1' Y k = Y> such that for 1 " *"' " k ' Y# is an elementary 
 
16 
 
 tree transformation of Y. .. The "distance" from Y Q to Y, denoted 
 d(Y-,Y), is the minimum number of transformations necessary to change Y. into 
 Y. For all Y and Y Q , d(Y Q ,Y) ^ n-1. 
 
 A tree transformation algorithm begins with an initial tree Y_. First, 
 all possible elementary tree transformations are applied to Y_ to get X 1 , 
 
 the 
 
 set of all trees at distance 1 from Y_. Next, X ? , the set of all trees 
 
 at distance 2 from Y , is found by applying elementary transformations to the 
 trees in X.. . Similarly, X^ is found from X- by elementary transformations. 
 This process continues until X is found where r = max[over Y]d(Y_,Y). 
 
 The details of this algorithm, such as how to avoid duplications, will 
 not be described here (see [Mayeda 65]). 
 
 On some graphs, the choice of Y~ can make a big difference in the cost 
 of the algorithm. The best Y is a "central" tree ([Deo 66], [Malik 68]), 
 for which max [over Y]d(Y ft ,Y) is a minimum. One algorithm for finding a 
 central tree has been suggested [Amoia 69]. 
 
 5.5. Hamiltonian Paths 
 
 The trees of any graph can be arranged in a (Hamiltonian Path) sequence 
 Y , Y 2 , ..., Y such that for 1 < i < t-1, d ( Y ± > Y ±+1 ) = 1 ([Chen 67], 
 [Cummins 66], [Shank 68]). Algorithms to find trees in such an order have 
 been suggested ([Kamae 67], [Kishi 67, 68]). These algorithms will not be 
 described here because they are too complicated. 
 
 5.6. Introduction to Expansion Algorithms 
 
 The remaining algorithms (sections 5.7 through 5.12) expand the "variable 
 
 Cartesian Product" X, * X„ x . . . * X n , where the definition of the set X. 
 
 1 2 n-1 j 
 
 depends on the choices of elements from X.. through X - . The basic flowchar 
 for these algorithms appears in figure 1, and will be explained in detail in 
 this section. Subsequent flowcharts will be described by explaining the 
 changes in the contents of boxes 1 through 4. 
 
 t 
 
( START Y-» 
 
 initialize 
 
 calculate X 
 
 3 
 
 17 
 
 pick a . from X 
 
 process \a^, a ,..., a jl 
 
 j = n >^ 
 
 NO 
 
 f return) 
 
 J«-J-l 
 
 Figure 1: Expansion Algorithms 
 
18 
 
 The two interlocking loops in the basic flowchart are roughly equivalent 
 to n-1 nested loops (fixed nested loops cannot be used because n is a 
 variable). The variable j specifies the nesting level. The "highest" level 
 is 1, the "lowest" level is n. 
 
 Each level j (1 < j £ n-1) , begins (box 2) with the calculation of 
 
 X., a set of branches. X controls the iterations at level j. Namely, an 
 
 iteration begins (box 3) with one branch being picked (and deleted) from X 
 
 and being assigned to the bound variable a.. 
 
 At the lowest level, the set {a., a_, ..., a , } is processed as a tree 
 
 12 n-1 
 
 candidate (box 4). 
 
 When computation at a level j is completed, the algorithm "backtracks" 
 (box 5) to the previous level j-1 where another iteration (a. 1 from X .) 
 leads to a new instance of level j . 
 
 5.7. Cancellation of Non-Trees 
 
 This algorithm ([Bellert 62], [Chen 65], [Maxwell 66], [Piekarski 65]) 
 is actually a method of expanding the symbolic determinant mentioned in section 
 5.2 [Myers 65]. 
 
 The flowchart for this algorithm appears in figure 2. Box 2 reads 
 
 "X. <- B. - {a,, a_, .... a. . }" which means that X, is the set B, (all 
 J 3 1 2* j-1 j j 
 
 branches incident to node v.) excluding any currently assigned a (i = 
 
 1,2 j-1). Box 4 reads "L = L 9 {{a n , a.,..., a ,}}", where L is a list 
 
 1 Z n— 1 
 
 of tree candidates which have been generated at previous instances of the lowest 
 
 level. If {a n , a_, ..., a , } is equal to a set S already in L, then S 
 12 n-1 
 
 is removed from L. {a., a_, ..., a -} is added to L if and only if there 
 
 12 n-1 
 
 is no such match. When the algorithm terminates, L is the list of all trees 
 of the graph. 
 
 5.8. Circuit-Free Expansion 
 
 This algorithm ([Brownell 68], [Char 68], [Hobbs 59], [Mason 57]) differs 
 
(startY* 
 
 J<-1 
 
 19 
 
 !,<-■ 
 
 
 r 
 
 YES 
 
 pick a . from X 
 
 3<-j+i 
 
 NO 
 
 L=l©{{ ai , a 2 , ..., a^]} 
 
 NO 
 
 >' 
 
 > J<-J-l 
 
 RETURN 
 
 5 
 
 Figure 2i Cancellation of Non-Trees 
 
20 
 
 from the previous algorithm by avoiding the generation of non-treee (rather 
 than waiting to cancel them out of the list L) . This algorithm rejects the 
 choice of any branch which forms a circuit with previously chosen branches . 
 At the lowest level, the tree candidate can be output immediately because any 
 set of n-1 branches which does not contain a circuit is a tree. 
 
 The flowchart for this algorithm appears in figure 3. Box 2 now reads 
 "X «- B - Circuit_Makers (a.., a 2 , . .., a J_i)"« Box ^ now reads "Output 
 
 1* 3 2 ' "**• a n-l * 
 
 5.9. Connected Expansion 
 
 This algorithm ([Berger 67], [Cummins 64], [Feussner 02, 04], [Hirayama 
 63], [Minty 65], [O'Neil 66a]) differs from the previous algorithm in the method 
 of avoiding non-trees. Instead of testing for circuits, this algorithm preserves 
 connectedness. The references cited above offer a variety of algorithms; an 
 efficient representative is described below. 
 
 The flowchart for this algorithm appears in figure 4. The new variables 
 
 are Y. (needed to avoid duplications) and p (representing the nodes in the 
 
 current connected subgraph). Box 1 has added the initializing statements 
 
 "Y +■ Branches of Graph" and "p ■*■ v ". Box 2 now reads "X «- Boundary 
 
 {p 1 , p„, ..., p } n Y " which means that X contains all branches (in Y . ) 
 
 which have exactly one endpoint belonging to {p , p P-«^* This guarantees 
 
 that any branch picked from X will preserve connectedness and avoid circuits. 
 
 Box 3 has added the statement "p.., *■ other endpoint (a.)", which means that the 
 
 r j+l - j 
 
 endpoint of branch a, which is not already in {p. , p„ p } is assigned 
 
 to the node variable P. +1 • Also in box 3 are the statements "remove a from 
 
 Y " and "Y.., **■ Y " , which limit the choice of branches at lower levels (see 
 
 3 J+l J 
 
 box 2) in order to avoid duplications. 
 
 5.10. Factoring 
 
 This algorithm ([Ardon 69], [Chang 68], [Chen 68, 69c], [Cummins 64], 
 
START 
 
 >\ 
 
 21 
 
 3<-i 
 
 . <-B - Circuit_Maker« (a,, a 2 » •••• *1-1^ 
 
 pick a , from X 
 
 NO 
 
 Output {a lf a 2 , ..., a n-1 ] 
 
 j<-j-l 
 
 M RETURN 
 
 Figure 5* Circuit-Froe Expansion 
 
G*EV 
 
 22 
 
 3<-i 
 
 Y,<— Branches of Graph 
 
 2 
 
 r->l 
 
 X^^-Boundary {p x> p 2> ...» P JO T 
 
 YES > 
 
 pick a from X 
 
 p <- other_endpoint (ii) 
 
 remove a from Y 
 
 Y <— Y 
 
 ^ NO 
 
 Output £a x , a 2 , ..., a n ^J 
 
 r IT 
 NO 
 
 :a ^( return) 
 
 V 
 
 1 
 
 3<-j-i 
 
 Figure 4 i Connected Expansion 
 
23 
 
 [Hirayama 65], [Holt 68], [Mason 57], [Mcllroy 69], [Nakagawa 58], [Percival 
 
 53]) differs from the previous algorithm in that when node p is added to 
 
 the currently connected subgraph, all branches in X. which are incident to 
 
 p are factored together into a single iteration. As a consequence, at the 
 
 lowest levels, instead of individual trees of n-1 branches, Cartesian products 
 
 of n-1 factors are produced. 
 
 The flowchart for this algorithm appears in figure 5. In box 3, "pick 
 
 A from X " and "p. , *■ other endpoint (A.)" mean that a. is picked from 
 j j J+ 1 J J 
 
 X and p , is the other endpoint of a. (as in the previous algorithm), 
 j j + l J 
 
 However, a. is now extended to A., a factor set of branches: A. = 
 
 J J J 
 
 {a } u X n B ; that is, A. contains all branches in X. which are incident 
 j 3 P j+1 J 2 
 
 to p , . All of A. is removed from X.. Similarly, "remove A. from Y." 
 *3+l J J J J 
 
 deletes the entire subset A. from Y.. 
 
 J J 
 
 In box 4, "output A * A„ * . . . * A " means that a family of trees is 
 output in the form of a Cartesian product of the factor sets A., 1 < j < n-1. 
 This factored form is adequate for the applications (see section 4.2), but if 
 individual trees are desired, they can be obtained by finding all combinations 
 of one branch from each of the n-1 factor sets (this Cartesian product expan- 
 sion could be accomplished by the flowchart in figure 1, with box 2: "X, «- A." 
 
 and box 4: "Output {a n , a , ..., a , }") . 
 r 1' 2' n-1 
 
 5.11. More Factoring 
 
 The idea behind this algorithm is to factor into a single iteration (the 
 
 last one) all those cases in which only one branch from X. appears in a 
 
 tree. To avoid duplication, the other (earlier) iterations from X lead to 
 
 the choice (at level i+1) of an additional branch of X. . 
 
 3 
 
 The flowchart for this algorithm appears in figure 6. The new variables 
 
 are d. (a truth value which controls X.) and Z. (temporary storage for 
 
 X ). Box 2 now reads "if d. then Z. ■*■ X. «- Boundary {p., p„, ..., p J } n Y, 
 j j 3 3 1 2 K j ■ j 
 
START 
 
 J<-1 
 
 P 1^" V 1 
 
 Y.<- Branches of Graph 
 
 ±. 
 
 X ^-Boundary [p^ P 2> •.., Pj}n Y 1 
 
 A 
 
 YES 
 
 > 
 
 pick A from X 
 
 ^ 3 
 p ^_ other_endpoint (A ) 
 
 remove A from Y 
 
 3 j 
 Y <-Y 
 
 3<-3+* 
 
 NO 
 
 Output A^X A 2 *..• x A n _ 1 
 
 4 
 
 j<-j-i 
 
 24 
 
 RETURN 
 
 Figure 5* Factoring 
 

 25 
 
 else X «-X. ,". This means that if d . = YES, then X. is calculated 
 J 0-1 J J 
 
 as in the previous algorithm, and stored in Z.. If d = NO, then X. is as- 
 signed the current value of X. . (since one branch was picked from X , at 
 level j-1, this guarantees another branch from x ._i at level j). 
 
 Box 3 has the additional statements "d,,. <■ (-id,) or (X . = $)" and 
 
 J+l J J 
 
 "if d & d , then A. «- Z.". Thus, if d, = NO then d... •*■ YES. If 
 J J+l J J j J+ 1 
 
 d = YES and X. is not empty, then d. in «- NO. If d. = YES and X. is 
 J j J+l J J 
 
 empty, then d - *■ YES and A. is replaced by Z. (the saved value of the 
 
 full set X. before deletions). 
 
 J 
 
 5.12. Prunin g 
 
 In the algorithms of sections 5.9 through 5.11, branches are deleted from 
 
 the Y.'s. Even though the input graph was connected, deleted branches may 
 
 cause YA. = Y. u (A. x A„ x . . . x A ,) to fail to connect all the nodes 
 J J 1 2 n-1 
 
 (denote this situation by "YA. fails"). Once YA. fails, further computation 
 
 J J 
 
 at levels j through n-1 is wasted because no spanning trees can be found on 
 
 a disconnected graph. Thus it would be useful to know when YA. fails. On 
 
 the other hand, an additional connectedness test would be expensive because it 
 
 would be executed so many times. 
 
 The algorithm of this section differs from the three previous algorithms 
 
 in that needless computation is avoided when YA. fails, but an additional 
 
 J 
 
 test for connectedness is not needed. This is accomplished by using "failure 
 to find trees" as a test for connectedness. 
 
 The flowchart for this algorithm appears in figure 7. The new variable 
 is k. (the count of executed iterations from X ). Box 2 initializes this 
 
 •J J 
 
 count: "k. *■ 0". Box 3 increments the count "k . -«- k. + 1". 
 J J J 
 
 The major change occurs in the NO branch of the "j=l?" decision box 
 where a further test is inserted: "k . = 0?", which means "was X. empty in 
 box 2?". If the answer is NO , then computation proceeds (as in previous 
 
26 
 
 START 
 
 Y<- Branches of Graph 
 
 d 1 <_yas 
 
 if d then Z <- X <— Boundary {p^ p 2 P j) nT 1 
 
 else 
 
 X J^V 
 
 pick A from X 
 J J 
 
 p. .-,<— other_endpoint (A ) 
 remove A from Y 
 
 3 3 
 
 y <— y 
 
 3+1 3 
 
 d. ,<-(-id ) or (X = 0) 
 
 3+1 3 3 
 
 if d 4 d then A <-Z 
 
 3j+l 3 3 
 
 3^-3+1 
 
 NO 
 
 Output A x yA 2 y ... *A n-1 
 
 £ 
 
 3«-3-i 
 
 YES 
 
 >f return) 
 
 Figure 6% More Factoring 
 
START 
 
 >f 
 
 27 
 
 
 3<-l 
 
 Y. <— Branches of Graph 
 d 1 <_YES 
 
 JiL 
 
 2 
 
 if d then Z <— X <— Boundary £p , p , • ••, P.} fl Y 
 
 pick A. from X 
 
 /v 
 
 p ^ — other endpoint (A,) 
 
 remove A from Y 
 J J 
 
 J+l^ j 
 d. .<-(-,d ) or (X = 0) 
 
 if d & d then A <-2 
 j j+1 J 3 
 
 3 3+1 
 
 3^3+1 
 
 -<r 
 
 _H0_ 
 
 Output Aj x Ap x ...x A n-1 
 
 V 
 
 4 
 
 3«-J-l 
 
 XS_^fRHTlRN ) 
 
 YES 
 
 JL 
 
 3<-3-i 
 until 
 
 V 1 
 
 Figure 7* Pruning [the New Algorithm] 
 
28 
 
 flowcharts) to box 5. If the answer is YES, then it is known that YA 
 
 J 
 
 fails, and computation proceeds to box 6 which reads "j +• j-1 until I > 1". 
 This means that control returns to the previous level (j ■*■ j-1) and continues 
 to return to higher levels (pruning unnecessary iterations) until I. > 1. The 
 remaining iterations at level j are then pruned by proceeding to box 5 
 
 (j - J-D. 
 
 The idea behind box 6 is that the first iteration of box 3 does not change 
 
 the value of YA. from that of YA. . Thus the final value of j leaving 
 
 box 6 indicates the highest level at which YA. failed. 
 
 J 
 
 5.13. Variations 
 
 There are many variations of the algorithms of sections 5.7 through 5.12. 
 Some will be mentioned very briefly in this section. 
 
 There is an algorithm, Circuit Check, which is "half way" between Cancella- 
 tion of Non-Trees [5.7] and Circuit-Free Expansion [5.8]. This algorithm is use- 
 ful for analysis and will be described in section 6.4. 
 
 The algorithms of sections 5.7 and 5.8 can be generalized [Maxwell 66] 
 by replacing B. with a somewhat more general cutset. 
 
 Sections 5.7 and 5.8 can be improved by labeling the nodes so that degree 
 (v.) < degree (v , ) . For sections 5.9 through 5.12, a good heuristic is to 
 
 always choose p. to be of minimal degree. 
 
 J 
 
 For graphs with b < 2(n-l), it may pay to find all co-trees (using some 
 form of duality) and convert them to trees. 
 
 Finally there are many special cases which can occur in graphs (either 
 initially or during computation) which can be handled more efficiently than the 
 general case. For example, the existence of separating nodes or separating 
 branches allows a quick decomposition. 
 
29 
 6. ANALYTICAL MEASUREMENTS OF SELECTED ALGORITHMS 
 
 The algorithms described in chapter 5 will now be measured by the tech- 
 niques described in section 3.3. A priori bounds (which do not depend on the 
 structure of an algorithm) are given in section 6.1. An example of worst case 
 analysis appears in section 6.2. In the remaining sections, only the expansion 
 algorithms [5.6 through 5.12] are analyzed, using the "computation tree" defined 
 in section 6.3. Section 6.4 employs direct comparisons of consecutive algorithms, 
 Section 6.5 introduces the "quotient operator" and uses it to measure the New 
 algorithm on "closed ladder" graphs. Finally, section 6.6 applies "complete 
 graph analysis" in order to obtain upper bounds for the factoring algorithms. 
 
 6.1. A Priori Bounds 
 
 Sometimes it is possible to derive bounds for an algorithm without knowing 
 its structure. If an algorithm is difficult to analyze, a priori bounds may be 
 the tightest available bounds. 
 
 Find-all-trees algorithms illustrate this point because the required 
 number of answers, t, grows exponentially. Any algorithm which finds trees 
 one at a time must have t = 0(c) [recall from section 1.3 that "f(n,b,t) = 
 0(c)" means 4A such that c > A • f(n,b,t)]. For the algorithms of sections 
 5.4, 5.5, 5.8, and 5.9, tighter bounds are difficult to obtain. 
 
 Another example of "a priori" bounding occurs in the exhaustion algorithms 
 
 u 
 
 (section 5.1). The first algorithm checks all ( ) combinations of n-1 
 
 n-I 
 
 b! 
 
 branches, so regardless of the details, -r. ■ v, - > — rrv = 0(c). The second 
 
 ° (b-n+1) ! (n-1) ! 
 
 n-2 
 algorithm checks each of the n [Cayley 89] trees of the complete graph, 
 
 so n = 0(c). The storage required by the second algorithm is also larger 
 
 o 
 than n . These lower bounds are sufficient to demonstrate the inefficiency 
 
 of these algorithms. 
 
30 
 
 6.2. Worst Case 
 
 This section will illustrate "worst case analysis" as applied to the first 
 
 exhaustion algorithm [5.1]. Let the branches of the graph be numbered 1 through 
 
 b. Represent each combination of branches by an ordered list of n-1 positions, 
 
 p., each position containing a branch number (p. - i, where l<i^b). In order 
 
 to avoid duplications, require that p < p, for j<k. In order for this 
 
 condition to be satisfied for all j and k, each p. must be restricted to 
 
 J 
 
 values from a set of b-n+2 branch numbers, namely j<p.^b-n+l+j (l<j<n-l). 
 Assume that the algorithm is equivalent to n-1 nested iteration loops, 
 each loop corresponding to a position in the ordered list. Since each loop 
 goes through a maximum of b-n+2 iterations, the code in the innermost loop 
 is executed no more than (b-n+2) times. If that code (a tree test) costs 
 0(n), then c = 0(n« (b-n+2 ) n-1 ) . 
 
 6.3. Computation Trees 
 
 A computation tree, CT(G, A), traces the execution of algorithm A 
 applied to graph G. The expansion algorithms have computation trees of height 
 n, as shown in figure 8. The meaning of CT(G, A) and the definition of its 
 size parameters c.(G, A), are as follows. 
 
 The nodes of CT(G, A) are arranged in n levels, directly corresponding 
 to the n levels of the flowchart of A. At the bottom (i.e., level n) of 
 CT(G, A), there are c,(G, A) nodes, each corresponding to an execution of 
 box 4 ("process {a,, a„, ..., a }") of the flowchart of A. Each of the 
 other c„(G, A) nodes corresponds to an execution of box 2 ("compute X."). 
 There are c (G, A) branches in CT(G, A), each connecting a node on level 
 j to one on level j+1 (corresponding to an execution of box 3: "pick a. 
 from X."). Since the number of nodes equals the one plus number of branches, 
 c 2 (G, A) + c,(G, A) = 1 + c (G, A). The arguments G and A may be modified 
 or dropped when no confusion would result. 
 
51 
 
 highest level = 1 
 
 iterations of X^ 
 
 level 2 
 
 iterations of X^ 
 2 
 
 level 5 
 
 level n-2 
 
 iterations of X 
 
 Figure 8: Computation Tree CT(G,A) 
 
32 
 
 The cost of algorithm A applied to G can be expressed in terms of 
 these size parameters. Using the last equality and ignoring very small costs, 
 c = c (G, A)[cost(box 2) + cost(box 3)] + c (G, A)[cost(box 4) + cost(box 3)]. 
 
 Using this formula, a worst case bound for the Cancellation of Non Trees 
 (CNT) algorithm of section 5.7 can be derived as follows. First, cost (box 4) = 
 cost (compare two sets) • (// of sets in L) . Intuitively and in practice, 
 (// of sets in L) = 0(t). Since cost (compare) = 0(n), cost (box 4) = 0(nt). 
 Since this dominates the costs of the other boxes, only c,(G, CNT) needs to 
 
 be calculated. Now, from the flowchart (figure 2), X. c B.. Therefore, in 
 
 J ~ J 
 
 CT(G, CNT) each node at level j leads to (at most) degree (v.) nodes at 
 
 n-1 J 
 
 level j+1. Thus c,(G, CNT) < II degree (v.). To express c, in terms of 
 
 J=l 
 n and b, first use the "geometric vs. arithmetic mean" inequality [i.e., 
 
 a, • x_ . . . x < — (x 1 + x„ + . . . + x ) 1 , to obtain 
 
 12 n n 1 2 n 
 
 1 n_1 n-1 
 
 c (G,CNT) < ( — - £ degree (v.)) . Assuming degree (v ) > degree (v.) 
 4 n— x A—l J J 
 
 [the most efficient case], c.(G.CNT) < (— ) n_1 . Thus c = 0(nt(— ) n_1 ). 
 
 4 n n 
 
 In the analysis to follow, it will be convenient to use the notation 
 ST(G, A, w) to denote the subtree of CT(G, A) consisting of the node w 
 [w e CT(G, A)] and all the nodes and branches connected to w from below. 
 If w is the root node, ST(G, A, w ) = CT(G, A). Two subtrees, 
 ST(G , A , w ) and ST(G„, A , w„), are isomorphic if there is a one-to-one 
 and onto mapping of the nodes and branches which preserves incidence and level 
 relationships . 
 
 6.4. Direct Comparisons 
 
 The expansion algorithms (5.7 through 5.12) will now be sequentially 
 compared in terms of their computation trees. The expression "c . (A ) < c.(A.)" 
 means that for all graphs G, and for each size parameter c. (i = 2, 4), 
 c.(G, A x ) < c ± (G, A 2 ). 
 
33 
 
 The Circuit-Free Expansion (CFE) algorithm [5.8] introduces two changes 
 from the previous algorithm (CNT) . First, the "non-tree test" changes from 
 "check L for duplicates" to "check for circuits". Second, this test has been 
 moved up from box 4 to box 2. In order to isolate the change in efficiency, let 
 us make these changes one at a time. 
 
 If the second change is made without the first [Piekarski 65], the cost 
 increases [based on limited empirical evidence]; therefore, let us try the 
 first change without the second. Call this the Circuit Check algorithm (CtC) . 
 Since the only change is in box 4 ("check {a.. , a„, ..., a } for circuits"), 
 the computation tree does not change: c.(CtC) = c.(CNT). However, cost (box 4) 
 drops from O(n.t) to 0(n) [the cost of a circuit test]. Clearly, Circuit 
 Check is more efficient than Cancellation of Non-Trees. 
 
 Now add the second change. The cost of each box is 0(n) regardless of 
 the placement of the circuit test (only the constants change). However 
 c . (CFE) < c.(CtC) because the non-trees are discovered sooner, and needless 
 computation is avoided. For nearly all graphs G, c.(G, CFE) < c.(G, CtC). 
 Thus the second change is an improvement also. 
 
 The Connected Expansion (Con) algorithm [5.9] will be considered equally 
 efficient as Circuit-Free Expansion. Both algorithms find trees one at a time, 
 avoiding non-trees and duplications, so c, (CFE) = c.(Con) = t. Empirically, 
 Circuit-Free Expansion appears more efficient [Fernandez 69a]. 
 
 The Factoring (Fac) algorithm [5.10] is clearly an improvement over "one 
 
 12 k 
 tree at a time" algorithms. Each factor A. = {a., a., .... a.} (box 3, 
 
 J J J J 
 
 figure 5), corresponds to a node w. ,_ in CT(G, Fac); i.e., A. corresponds 
 
 to the entire subtree ST(G, Fac, w,,.), Each a. corresponds to a node w.,, 
 
 J +1 J J+l 
 
 in CT(G, Con). Therefore, ST(G, Fac, w .) replaces k subtrees 
 ST(G, Con, w 1 +1 ), i = 1, ..., k. Clearly c.(G, Fac) < c.(G, Con), with 
 equality holding only if G is a tree. Typically, c.(G, Fac)/t (the "cost per 
 
3h 
 
 tree") goes to zero exponentially as n increases [6.6, figure 10]. 
 
 For each X. calculated in box 2 of the Factoring algorithm, the trees 
 
 which contain iust one branch from X. will be calculated k times over, 
 
 J ' 
 
 where k is the number of iterations necessary to empty X.. The More 
 Factoring (MF) algorithm [5.11] combines these k cases into a single iteration, 
 clearly an improvement in efficiency. Thus c.(G, MF) < c.(G, Fac) , with 
 equality holding rarely. Typically, c . (MF) /c . (Fac) goes to zero exponentially 
 as n increases [6.6]. An intuitive indication of the improvement is that the 
 factors are larger; i.e., on a complete graph, Factoring will always find at 
 least one Cartesian product family consisting of a single tree [Chang 68], while 
 (if n>3) every family found by More Factoring will contain at least two trees. 
 
 The Pruning algorithm [5.12] is clearly an improvement. The test for 
 correctedness is obtained at negligible cost, but the potential savings are 
 large. Naming this algorithm (with factoring and pruning, [figure 7]) the 
 New algorithm, c.(G, New) < c.(G, Fac). 
 
 6.5. Special Graphs: The Quotient Operator 
 
 This section (as well as the next) will illustrate the technique of 
 measuring the cost of an algorithm on a parameterized class of graphs, 
 G = G(p). For the algorithms to be measured, c = k c„(G) + k, c,(G) with 
 k. = 0(n); thus, the only quantities which need to be measured are c (G) 
 and c ,(G) . Since G = G(p), c.(p) will replace c.(G). Only algorithms 
 with factoring will be measured directly; the "one at a time" algorithms [5.4, 
 5.8, 5.9] have c /(p) = c (p)» the number of trees as a function of p. 
 
 For the classes of graphs to be considered, c~(p), c,(p), and t(p) 
 are all exponential in p. In order to derive, compare, and plot these func- 
 tions, f(p), the "quotient operator", Q(f, p), will be used: Q(f, 1) = 
 f(l), and for p>l, Q(f, p) = f(p)/f(p-l) [it is not difficult to interpret 
 
35 
 
 P 
 f(l) > 1, thus Q(f, 2) is well defined]. Clearly, f(p) = n Q(f, i) . 
 
 i=l 
 For the functions to be considered, there will always exist a "quotient limit", 
 
 q(f, p) , either a constant or a linear function of p, such that 
 
 lim ^IxJSl = i. For example, q(p!, p) = p, q(k P , p) = k, q(k P , k) = 0. 
 p-*» q(f, p) 
 
 As a first example of a special class of graphs, consider L(r), the 
 closed ladder of r rungs (see figure 9). Since n(r) = 2r and b(r) = 3r-2, 
 this example will show that even on graphs with rank > nullity [i.e., 
 b < 2(n-l)], the New algorithm has cost per tree, c/t, going to zero 
 exponentially. To prove this claim, it suffices to show that q(c.(r, New),r)/ 
 q(t, r) < 1. 
 
 It is not difficult to derive the equation t(r) = 4 t(r-l) - t(r-2), 
 with t(l) = 1, t(2) = 4. In fact, t(r) = t(i+l) • t(r-i) - t(i) • t(r-i-l), 
 for any i, l<i<r-2. From this equation it is not difficult to show that 
 q(t, r) = 2 + /3. 
 
 The New algorithm on L(r) starts at p , one of the four corners 
 (nodes of degree 2). There will be two iterations at the first level of the 
 computation tree CT(r) [see figure 9]. 
 
 The first iteration handles the case in which both branches incident to 
 p , the starting node, are included in a tree. Thus at level 3, the three 
 branches in X [see figure 9] will reach only two nodes. Because of factoring, 
 the remaining computation from X_ is just like starting at a corner of L(r-l). 
 If w is the node in CT(r) which corresponds to the calculation of X„, 
 then ST(r, w ) is isomorphic to CT(r-l). 
 
 The second iteration at level 1 handles the case in which only one of the 
 branches incident to p 1 will be included in a tree. Assuming p„ is chosen 
 to be the corner adjacent to p , then there is only one branch in X , and 
 this leads to a corner of L(r-l) [see figure 9]. Once again, if w' is the 
 
56 
 
 CT(r) 
 
 9- 
 
 L(r) = 
 
 r-1 
 
 -o- 
 
 T-2 
 
 -o- 
 
 6- 
 
 o- o 
 
 r-1 
 -O- 
 
 *\ 
 
 -o 
 
 o-_3 
 
 CT(r-l) 
 
 r-1 1 
 
 L(r-1) = 
 
 O O 
 
 7 V 
 
 Figure 9i L(r) and CT(r, New) 
 
no 
 
 37 
 
 de of CT(r) immediately below this iteration of X , then ST(r, w') is 
 isomorphic to CT(r). 
 
 From the above information, it is easy to derive the relationships 
 c 4 (r) = 2 • c 4 (r-l), c 2 (r) = 2 • c 2 (r-l) + 3, and c 4 (l) = c 2 (l) - 1. 
 Applying the quotient operator and taking limits yields q(c., r) = 2. The 
 claim is proved and c.(r, New) = 0(2 ). 
 
 6.6. Complete Graphs 
 
 This section will measure the computation tree of algorithms with 
 factoring [5.10, 5.11] applied to the complete graph on n nodes (G = S(n)). 
 This test is significant as an "upper bound"; that is, for all graphs G on 
 n nodes, c.(G, A) < c.(S(n), A). Because of factoring, this inequality remains 
 true even if G has parallel branches. For non-factoring algorithms, the state- 
 ment would have to be modified to exclude parallel branches. 
 
 CT(S(n), Fac) has n-1 iterations (branches) at the highest level 
 because in S(n) , the starting node p 1 is adjacent to each of the n-1 other 
 nodes. As these iterations are executed, Y [the set of available branches in 
 S(n)] will be diminished only by branches incident to p.. [see figure 5]. 
 Therefore, Y will always contain a complete graph on the n-1 nodes excluding 
 p 1 . This completeness implies that for each node w at the second level of 
 CT(n, Fac), ST(n, Fac, w) is isomorphic to CT(n-l, Fac). 
 
 From this information, it is easy to derive the following relationships: 
 
 c 4 (n, Fac) = (n-1) c 4 (n-l, Fac); c 2 (n, Fac) = (n-1) c (n-1 , Fac) + 1; 
 
 c 2 (l) = c 2 (2) = c 4 (l) = c 4 (2) = 1. Obviously, c 4 (n, Fac) = (n-1)!, so 
 
 q(c 4 , n) = n-1. Q(c 2 , n) = [ (n-l)c 2 (n-l) + l]/c 2 (n-l) = n-1 + l/c 2 (n-l). 
 
 lim n-1 + l/c 2 (n-l) 
 Since : = 1, q(c_, n) = n-1. Thus the quotient limits 
 
 for c 2 and c 4 are equal. The only effect of the "+1" term in the recursive 
 
 lim c 2^ 
 
 formula for c„ is that r-r- = e. 
 
 2 n-+°° c. (n) w 
 
38 
 
 The complete graph analysis of the New algorithm is slightly more 
 complicated. In box 2 [figure 6], if d. = YES, then X. is calculated as 
 in the previous Factoring algorithm [figure 5], The analysis of this (d. = YES) 
 case is similar to the analysis of the Factoring algorithm; if w. is the 
 corresponding node in CT(n, New), then ST(n, New, w ) is isomorphic to 
 CT(n+l-j, New). 
 
 Since d = YES, X will reach each of the n-1 nodes adjacent to p , 
 the starting node. Let w be the root of CT(n, New) and let w„ (i = 1, 
 ..., n-1) be the nodes at level 2. For w~ , d_ = YES, so ST(n, New, w ) 
 is isomorphic to CT(n-l, New). For l<i<n-2, let w ' be the nodes at 
 level 3 directly connected to w . For w„ (l<i<n-2) , d = NO, but for 
 i,k(i)^ = yES: sQ ST ( nf New ^ i,k(i)^ is isomorphic to cT(n-2, New). 
 
 n-2 
 
 v i 
 
 To compute I k(i) , note that X„ (corresponding to w„) is assigned [box 
 
 i=l 
 
 2, figure 6] the current (depleted) value of X . Thus there will be as many 
 
 iterations of X„ [k(i)] as there are remaining iterations of X.. [n-l-i]. 
 
 n-2 n-2 
 
 Thus [ k(i) = I n-l-i = (n-2) (n-1) /2 . 
 i=l i=l 
 
 From this information, it is easy to derive recursive formulas for 
 
 c.(n, New): c^n) = c 4 (n-l) + iH=22iSlil ^(n-2); c 2 (n) = c 2 (n-l) + 
 
 (n-2)(n-l) c ^ (n _ 2) + n _ 1; c ^ (1) m c ^ (2) = c ^ (1) = c ^ (2) = 1; C2 ( 3) m 3# 
 
 Applying the quotient operator to these recursive equations yields 
 
 Q(c 4 , n) - 1 + 2?Q(c (n n-l ) ' T ° Sh ° W that q(c 4' n) = ( n ~ 1 ) // 2» let 
 
 m = 
 
 lim nt s J?- „ u • - n , , /2 /2 
 
 0(c. , n) • — - . From the previous equation, Q(c. , n) • — =- = — - + 
 n-*-°° 4 n-1 ^ M 4' n-1 n-1 
 
 — — ■ ' — . Taking the limit as n-*°°, m = OH ; i.e., m = 1, so by 
 
 /2 Q(c A , n-1 m 
 
 definition, q(c 4 , n ) = (n-l)//l. The proof that q(c 9 , n) = (n-l)//2 is 
 
39 
 
 similar (an additional term goes to zero). 
 
 The quotient limit for the Circuit Check algorithm [6.4] can be derived 
 
 2b n_1 
 from the fact that for S(n), b = n(n-l)/2. Thus c A (S(n), CtC) » (— ) 
 
 , , n x n-l , n-2 1 n-2 
 
 . , >n-l _/ x (n-1; / , yn-l N lim ^n-iv 
 
 (n-1) . Q(c 4 , n) =— — 2 =(n-lX^2) . Since n _(^) 
 
 (n-2) 
 
 q(c^, n) = (n-l)e. 
 
 n— 7 
 
 The quotient limit for t(n) = n is similarly derived: Q(t, n) = 
 
 n-1 
 
 n " 2 1 n " 1 Hn, (n " 2+ n>(^T> 
 
 n / o ■ 1\ / n N iim n n-i .. , , 
 
 = (n-2+ -) (— -7) ; _„ . , - 1; therefore, 
 
 . nX n-3 n n-1 ' n^-°° (n-2)e 
 
 (n-1) 
 
 q(t,n) = (n-2)e. 
 
 Figure 10 plots c,(n, CtC), t = c,(n, CFE) = c,(n, Con), c.(n, Fac) , 
 and c.(n, New) using the quotient operator Q(f, n) . The quotient limits 
 derived above are the asymptotic limits of the plotted functions. 
 
 From this analysis, it is obvious that the cost per tree, c/t, goes to 
 
 zero exponentially (e ) for the Factoring algorithm, and the cost ratio of 
 
 -n 
 the New algorithm to Factoring also goes to zero exponentially (/2 ). Thus, 
 
 the most efficient way to find trees one at a time is to use the New algorithm 
 
 combined with a simple Cartesian product expansion algorithm [cost (New) < cost 
 
 (simple expansion) < cost (other "one at a time" algorithms)]. 
 
40 
 
 Figure 10: Complete Graph Analysis 
 
41 
 
 7. CONCLUSIONS 
 
 One important contribution of this thesis is the efficiency analysis of 
 all published algorithms for finding all trees of a graph. A very rough summary 
 of this analysis is as follows: 
 
 Algorithm Cost 
 
 2 
 "check for duplications" t 
 
 "one tree at a time" t 
 
 ~n 
 Factoring te 
 
 New t(e/2)~ n 
 
 Note that for the algorithms with factoring, the cost per tree goes to zero 
 
 exponentially as n increases. 
 
 The techniques which were used to measure efficiency include the 
 following: (1) the use of special classes of graphs on which the cost of an 
 algorithm can be accurately measured (e.g., complete graphs); (2) the direct 
 comparison (e.g., using computation trees) of competing algorithms in order to 
 show differences in efficiency without the need to derive individual bounds; 
 (3) the isolation of each idea of an algorithm (e.g., factoring) so that the 
 efficient ideas can be available for the development and analysis of new 
 algorithms; (4) the minimization of implementation details in empirical 
 measurements (e.g., using GASP and counting statements rather than seconds); 
 (5) the use of measures which reflect the nature of the class of algorithms 
 (e.g., the quotient operator which linearizes the exponential nature of "recur- 
 sive" algorithms). 
 
 The New algorithm is an important contribution of this thesis primarily 
 because these techniques show that it is more efficient than any previous 
 algorithm for finding all trees. 
 
U2 
 
 REFERENCES 
 
 These references are broken into three sections, and each section has its 
 own aims and criteria for inclusion. 
 
 The first section aims at being an exhaustive list of papers which discuss 
 various aspects of algorithms for finding all spanning trees of a graph. In 
 addition, this section contains some references which discuss graph theoretical 
 results of potential importance to such algorithms (e.g., the existence of a 
 Hamiltonian circuit in the tree graph of a graph, or bounds on the number of 
 trees in a graph) . Several of these references discuss applications which use 
 all spanning trees, mainly in the analysis of linear electrical networks. 
 
 The second section contains references to reports on general purpose graph- 
 processing languages or software packages (programs which implement a specific 
 graph algorithm are not included here) . 
 
 The third section includes a few references to important papers on other 
 graph algorithms, in particular those concerned with the following problems: 
 
 a) Find a minimum cost spanning tree of a graph. 
 
 b) Find a basis in the vector space of circuits of 
 
 a graph (also known as a set of fundamental circuits) . 
 
 c) Determine isomorphism of graphs. 
 
 d) Determine if a graph is planar. 
 
 e) Find shortest paths in a graph. 
 
1*3 
 
 1. Spanning Trees 
 
 Amoia, V., and Cottafava, G. "On Central Trees," Proceedings of the 12th 
 Midwest Symposium on Circuit Theory , paper XIV. 1, 196*9^ 
 
 Ardon, M., and Malik, N. "A Recursive Algorithm for Generating Trees and 
 Signed Complete Trees, " Proceedings of the 12th Midwest Symposium on 
 Circuit Theory , paper VII. 2, 1969. 
 
 Bedrosian, S. "Application of Linear Graphs to Multilevel Maser Analysis," 
 
 Journal of the Franklin Institute , Vol. 27^, No. k, pp. 278-283, October 
 1962. 
 
 Bellert, S. "Topological Analysis and Synthesis of Linear Systems," Journal 
 of the Franklin Institute , Vol. 27^, No. 6, pp. k25-kk3, December 1962. 
 
 Bercovici, M. "Formulas for the Number of Trees in a Graph, " IEEE Trans - 
 actions on Circuit Theory , Vol. CT-16, pp. 101-102, February 1969. 
 
 Berger, I. "The Enumeration of Trees Without Duplication," IEEE Transactions 
 on Circuit Theory , Vol. CT-1^, pp. 1+17-^-18, December 1967. 
 
 Berger, I., and Nathan, A. "The Algebra of Sets of Trees, K-Trees, and Other 
 Configurations," IEEE Transactions on Circuit Theory , Vol. CT-15, 
 pp. 221-226, September 1968. 
 
 Brownell, R. "Growing the Trees of a Graph," Proceedings of the IEEE , Vol. 56, 
 pp. 1121-1123, June 1968. 
 
 Cayley, A. "A Theorem on Trees," Quarterly Journal of Mathematics , Vol. 23, 
 pp. 376-378, 1889. 
 
 Chang, W., and Chan, S.G. "A Fast Tree-Finding Method," Proceedings of the 
 11th Midwest Symposium on Circuit Theory , pp. i+57-^62^ 1968. 
 
 Char, J. "Generation of Trees, Two-Trees and Storage of Master Forests," 
 
 IEEE Transactions on Circuit Theory , Vol. CT-15, pp. 228-238, September 1968. 
 
 Chen, W. "Generation of Trees and K-Trees," Proceedings of the Third Allerton 
 Conference on Circuit and Systems Theory , pp. 889-899; 1965 • 
 
 Chen, W. "On the Generation of Non-Singular Submatrices and Their Corresponding 
 Subgraphs, " Proceedings of the Fourth Allerton Conference on Circuit and 
 Systems Theory , pp. 207-217, 1966a. 
 
 Chen, W. "On the Realization of Directed Trees and Directed 2-Trees, " IEEE 
 Transactions on Circuit Theory , Vol. CT-13, pp. 230-232, June 1966b. 
 
 Chen, W. "Hamilton Circuits in Directed-Tree Graphs," IEEE Transactions on 
 Circuit Theory , Vol. CT-14, pp. 231-233, June 1967 . 
 
 Chen, W. "Iterative Procedure for Generating Trees and Directed Trees, " 
 Electronic Letters , Vol. K, No. 23, pp. 516-518, November 1968. 
 
 Chen, W. "Computer Generation of Trees and Co-Trees in a Cascade of Multi- 
 terminal Networks," IEEE Transactions on Circuit Theory , Vol. CT-16, 
 pp. 518-526, November 1969a. 
 
 Chen, W. "Generation of Trees and Co-Trees of a Graph by Decomposition, " 
 Proceedings of the IEE ( London ), Vol. Il6, No. 10, pp. l639-l61+3, 
 October 1969b. 
 
 Chen, W. "On the Generation of Trees Without Duplications," Proceedings of 
 the IEEE, Vol. 57, pp. 1292-1293, July 1969c. 
 
kk 
 
 Chen, W., and Mark, S. "On the Algebraic Relationship of Trees, Co-Trees, 
 
 Circuits, and Cutsets of a Graph, " IEEE Transactions on Circuit Theory , 
 Vol. CT-16, pp. 176-I8I+, May 1969d. 
 
 Cummins, R., and Thomason, L. "An Efficient Tree-Listing Program," Unpublished, 
 196^. 
 
 Cummins, R. "Hamilton Circuits in Tree Graphs, " IEEE Transactions on Circuit 
 Theory , Vol. CT-13, pp. 82-90, March 1966. 
 
 Dawson, D. "Computational Aspects of the Topological Approach to Active 
 
 Linear Network Analysis, " Proceedings of Hawaii International Conference 
 on System Sciences , pp . 113-115, 1968. 
 
 Dunn, W. Jr., and Chan, S.P. "Topological Formulation of Network Functions 
 Without Generation of K- Trees, " Proceedings of the Sixth Allerton 
 Conference on Circuit and Systems Theory, pp. 822-831, 1968. 
 
 Fernandez, E. "Analisis de Redes Electricas con Computador Digital mediante 
 
 Formulas Topologicas, " Thesis, Departamento de Electricidad, Universidad 
 de Chile, 1969a. 
 
 Fernandez, E. "An Evaluation of Tree Generation Methods, " Proceedings of 
 the 12th Midwest Symposium on Circuit Theory , paper VIlT5~ 1969b • 
 
 Feussner, W. "Uber Stromverzweigung in Netzformigen Leitern, " Annalen der 
 Physik , Vol. 9, pp. I30I4-I329, 1902 . 
 
 Feussner, W. "Zur Berechnung der Stromstarke io Netzformigen Leitern, " Annalen 
 der Physik , Vol. 15, pp. 385-39^, 190^+. 
 
 Fujisawa, T. "On a Problem of Network Topology," IRE Transactions on Circuit 
 Theory , Vol. CT-6, pp. 261-266, September 1959~ 
 
 Hakimi, S. "On Trees of a Graph and Their Generation," Journal of the 
 Franklin Institute , Vol. 272, No. 5, PP- 3^7-359, November 196I. 
 
 Hakimi, S., and Green, G. "Generation and Realization of Trees and K- Trees, " 
 IEEE Transactions on Circuit Theory , Vol. CT-11, pp. 2^7-255, June 196^. 
 
 Hakimi, S., and Deo, N. "A Topological Approach to the Analysis of Linear 
 
 Circuits, " Proceedings of the Fourth Allerton Conference on Circuit and 
 Systems Theory , pp. 197-206, 1966 . 
 
 Hale, H. "A Logic for ] dentifying Trees of a Graph," ALEE Transactions on 
 Power Apparatus and Systems , Vol. 80, pp. 195-197, June 196I . 
 
 Harary, F. "Graph Theory and Electrical Networks," IRE Transactions on 
 Circuit Theory , Vol. CT-6, pp. 95-109, May 1959- 
 
 Kirayama, H., Watanabe, H., and Harada, K. "Digital Determination of Trees 
 
 In Network Topology, " Journal of the Institute of Electrical Communications 
 Engineers of Japan , Vol. ^6, No. 1, pp. 23-30, January 1963 • 
 
 Hirayama, H., and Ohtsuki, T. "Topological Network Analysis by Digital Computer," 
 Journal, of the Institute of Electrical Communication Engineers of Japan , 
 Vol. hti, No. 3, pp. h2h-k'52, March 1965. 
 
 Hobbs, E., and MacWilliams, F. "Topological Network Analysis as a Computer 
 Program, " IRE Transactions on Circuit Theory , Vol. CT-6, pp. 135-136, 
 March 1959- 
 
 Holt, A., and Fiedler, J. "Efficient Tree-Generation Method Suitable for 
 
 Computer Programming," Electronic Letters , Vol. k, No. 10, pp. 183-1814-, 
 May 1968. 
 
h5 
 
 Jong, M., Lau, H., and Zobrist, G. "Tree Generation," Electronic Letters , 
 Vol. 2, No. 8, pp. 318-319, August 1966. 
 
 Kamae, T. "The Existence of a Hamilton Circuit in a Tree Graph, " IEEE 
 
 Transactions on Circuit Theory , Vol. CT-14, pp. 279-283, September 1967. 
 
 Kim, W., Freiman, C, Younger, D., and Mayeda, W. "On Iterative Factorization 
 in Network Analysis by Digital Computers, " Eastern Joint Computer Confer - 
 ence , pp. 2*4-1-253, December i960. 
 
 Kishi, G., and Kajitani, Y. "On Maximally Distinct Trees," Proceedings of the 
 Fifth Annual Allerton Conference on Circuit and Systems Theory , pp. 635- 
 6U3, 1967. 
 
 Kishi, G., and Kajitani, Y. "On Hamilton Circuits in Tree Graphs," IEEE 
 Transactions on Circuit Theory , Vol. CT-15, pp. 1 +2-50, March 1968. 
 
 Kishi, G.,'and Kajitani, Y. "Maximally Distant Trees and Principal Partition 
 of a Linear Graph, " IEEE Transactions on Circuit Theory , Vol. CT-l6, 
 pp. 323-330, August 1969. 
 
 Lee, S. "On Topological Formulae," Proceedings of the First Annual Allerton 
 Conference on Circuit and Systems" Theory" pp. ^•27- I +55, 19&3 • 
 
 MacWilliams, J. "Topological Network Analysis as a Computer Program," IRE 
 Transactions on Circuit Theory , Vol. CT-5, pp. 228-229, September I95H. 
 
 Malik, N., and Lee, Y. "Finding Trees and Signed Tree-Pairs by the Compound 
 Method, " Proceedings of the 10th Midwest Symposium on Circuit Theory , 
 paper VI- 5, 1967- 
 
 Mason, S. "Topological Analysis of Linear Non-Reciprocal Networks," Pro - 
 ceedings of the IRE , Vol. K^, pp. 829-838, June 1957- 
 
 Maxwell, L., and Cline, J. "Topological Network Analysis by Algebraic Methods," 
 Proceedings of the IEE ( London ), Vol. 113, No. 8, pp. 13^-13^7, August 
 1966. 
 
 Mayeda, W. "Digital Determination of Topological Quantities and Network 
 
 Functions," Interim Technical Report No. 6 , Contract No. DA-11-022-0RD- 
 1983, University of Illinois, Urbana, Illinois, January 1957* 
 
 Mayeda, W. "Reducing Computational Time in the Analysis of Networks by 
 Digital Computers," IRE Transactions on Circuit Theory , Vol. CT-6, 
 pp. 136-137, March 1959- 
 
 Mayeda, W., and Seshu, S. "Generation of Trees Without Duplications," IEEE 
 Transactions on Circuit Theory , Vol. CT-12, pp. 18I-I85, June 1965. - 
 
 Mayeda, W. "Generation of Trees and Complete Trees," CSL Report R-28U , 
 University of Illinois, Urbana, Illinois, April 1966. 
 
 Mayeda, './., Hakimi, S., Chen, W. and Deo, N. "Generation of Complete Trees," 
 IEEE Transactions on Circuit Theory , Vol. CT-15, pp. 101-105, June 1968. 
 
 Mcllroy, M. "Generator of Spanning Trees," Communications of the ACM , Vol. 12, 
 No. 9, p. 511, September 1969. 
 
 Minty, G. "A Simple Algorithm for Listing All the Trees of a Graph," IEEE 
 Transactions on Circuit Theory , Vol. CT-12, p. 120, March 19&5 • 
 
 Mullin, R., and Stanton, R. "A Combinatorial Property of Spanning Forests in 
 Connected Graphs," Journal of Combinatorial Theory , Vol. 3, pp- 236-2^3* 
 1967. 
 
k6 
 
 layers, B., and Auth, L. Jr. "The Number and Listing of All Trees in an 
 
 Arbitrary Graph," Journal of Combinatorial Theory , Vol. 3, pp. 236-2^3, 
 1967. 
 
 Myers, B., and Auth, L. Jr. "The Number and Listing of All Trees in an 
 Arbitrary Graph, " Proceedings of the Third Allerton Conference on 
 Circuit and Systems Theory , pp. 906-912, 1965. 
 
 Myers, B. "Efficient Generation of Tree -Admittance Products in a Cascade of 
 2-Port Networks," Proceedings of the IEE ( London ), Vol. 114, No. 11, 
 pp. l64l-l646, November 1967. 
 
 Nakagawa, N. "On Evaluation of the Graph Trees and the Driving Point 
 
 Admittance," IRE Transactions on Circuit Theory , Vol. CT-5, pp. 122-127, 
 June 1958 . 
 
 O'Neil, P., and Slepian, P. "An Application of Feussner's Method to Tree 
 
 Counting," IEEE Transactions on Circuit Theory , Vol. CT-13, pp. 336-339, 
 September 1966a. 
 
 O'Neil , P., and Slepian, P. "The Number of Trees in a Network," IEEE 
 
 Transactions on Circuit Theory , Vol. CT-13 , PP • 271-281, September 1966b. 
 
 ul, A. Jr. "Generation of Directed Trees and 2-Trees Without Duplications," 
 IEEE Transactions on Circuit Theory , Vol. CT-14, pp. 35^-356, September 
 
 L-clval, W. "The Solution of Passive Electrical Networks by Means of 
 
 Mathematical Trees," Proceedings of the IEE (London), pt. 3, Vol. 100, 
 pp. 110-150, May 1955. 
 
 Piekarski, M. "Listing of All Possible Trees of a Linear Graph," IEEE 
 
 Transactions on Circuit Theory , Vol. CT-12, pp. 124-125, March 19&5 • 
 
 Riordan, J. "The Enumeration of Trees by Height and Diameter," IBM Journal 
 of Research and Development , Vol. h, No. 5, pp. 473-478, November i960. 
 
 , P. "On a Tree Expansion Theorem, " IRE Transactions on Circuit Theory , 
 Vol. CT-8, pp. U 96-5OO, December 19637! 
 
 Scoins, H. "Placing Trees in Lexicographic Order," Machine Intelligence 3 ? 
 pp. 1+5-60, 1968. 
 
 Shank, H. "A Note on Hamilton Circuits in Tree Graphs," IEEE Transactions on 
 ■Vir.-uit Theory , Vol. CT-15, p. 86, March 1968 . 
 
 hman, C, Maenpaa, J., and Stahl, W. "Complete Tree Generation - Some 
 
 ractical Experience," :EEE Transactions on Circuit Theory , Vol. CT-16, 
 pp. 548,550, November 196^! 
 
 .11. "A Note on Enumeration and Listing of All Possible Trees in a 
 
 ;ted Linear Graph," Proceedings of the National Academy of Science , 
 . 40, pp. 1004-1007, October 1954. 
 
 Watanabe, H. "Computational Method for Network Topology," IRE Transactions on 
 Circuit Theory , Vol. CT-7, pp. 296-302, September i960. 
 
 Watanabe, . [ethod of Tree Expansion in Network Topology," IEEE Trans - 
 actions on Circuit Theory , Vol. CT-8, pp. 4-10, March 1961. 
 
 W dnb :r . L. "Kirchoff's Third and Fourth Laws," IRE Transactions on Circuit 
 Theory , Vol. CT-5, pp. 8-30, March 1958. 
 
 Wine, 0. "Enumeration of Trees," IEEE Transactions on Circuit Theory , Vol. 
 CT-10, pp. 127-128, March I963T 
 
hi 
 
 Zobrist, G., and Lago, G. "Digital Computer Analysis of Passive Networks 
 Using Topological Formula, " Proceedings of the Second Annual Allerton 
 Conference on Circuit and Systems Theory , pp. 513-595, 196*+ • 
 
 2. General Purpose Graph Software 
 
 Christensen, C. "An Example of the Manipulation of Directed Graphs in the 
 AMBIT/g Programming Language, " Interactive Systems for Experimental 
 Applied Mathematics , pp. ^23-^35, 196b. 
 
 Friedman, D.P., Dickson, D.C., Fraser, J. J., and Pratt, T.W. "GRASPE 1.5- 
 A Graph Processor and its Application, " University of Houston Report 
 RS I-69 , Houston, Texas, August 1969. 
 
 Hart, R. "HINT: A Graph Processing Language," Research Report , Computer 
 
 Institute for Social Science Research, Michigan State University, East 
 Lansing, Michigan, February 19&9 • 
 
 Read, R.C., King, C, Cadogan, C.C., and Morris, P. "The Application of 
 
 Digital Computer Techniques to the Study of Graph Theoretical and Related 
 Combinatorial Problems, " Scientific Report , Computing Centre, University 
 of West Indies, Mona, Kingston 1, Jamaica. 
 
 Wolfberg, M.S. "An Interactive Graph Theory System," Moore School Report No. 
 69-25 ? University of Pennsylvania, Philadelphia, Pennsylvania, June 1969. 
 
 3. Selected References to Other Graph Algorithms 
 
 Chase, S. "How to Win Shannon Switching Games: A Case Study in Automatic 
 Graph Processing, " Communications of the ACM , (to appear) . 
 
 Corneil, D., and Gotlieb, C. "An Efficient Algorithm for Graph Isomorphism," 
 Journal of the ACM , Vol. 17, No. 1, pp. 51-6^+, January 1970. 
 
 Dijkstra, E. "A Note on Two Problems in Connection with Graphs," Numerische 
 Mathematik , Vol. 1, No. 5, pp. 269-271, October 1959- 
 
 Gotlieb, C, and Corneil, D. "Algorithms for Finding a Fundamental Set of 
 
 Cycles for an Undirected Linear Graph," Communications of the ACM , Vol. 
 10, No. 12, pp. 780-783, December 1967. 
 
 Paton, K. "An Algorithm for Finding a Fundamental Set of Cycles of a Graph, " 
 Communications of the ACM , Vol. 12, No. 9, pp. 51^-518, September 1969. 
 
 Shirey, R. "Implementation and Analysis of Efficient Graph Planarity Testing 
 
 Algorithms," Ph.D. Thesis , University of Wisconsin, Madison, Wisconsin, 1969. 
 
 r, S.H. "GIT - A Heuristic Program for Testing Pairs of Directed Line 
 Graphs for Isomorphism," Communications of the ACM , Vol. 1, No. 1, pp. 
 26-3^, January 196^. 
 
 Witzgall, C. "On Labelling Algorithms for Determining Shortest Pahts in 
 Networks," NBS Report 98I+O, 1968. 
 
 
US 
 
 APPENDICES 
 
 1. GASP MANUAL 
 
 1.1. Purposes of GASP 
 
 The main purpose the the Graph Algorithm Software Package is to allow 
 programmers of graph algorithms to code programs in a natural and machine 
 independent way. Because operations are expressed in a language of graph and 
 set terms, the programs will be easy to follow and estimates of the amount of 
 computation will be easier to compute. Comparisons among different algorithms 
 for the same problem will be much easier using GASP becuase it is easy to 
 generate programs from their description in conventional graph-theoretic terms. 
 Moreover, some tallies on the amount of computation are provided by GASP. 
 
 The logical structure of GASP makes it possible to change representations 
 with relatively little change in programs. A few low-level routines would have 
 to be rewritten, but the higher level programs would not. 
 
 1.2. Basic Concepts and Terminology 
 
 1.2.1. Data Types 
 
 An integer has the usual definition. A character string is a fixed-length 
 string of characters. A truth value is a variable which can take on one of the 
 two values: yes or no. A name references an object (see below). An object is 
 a conglomeration consisting of one integer, one character string, three names, 
 and (most important) one set . A set can exist only as part of an object. There 
 are two types of objects: restricted and unrestricted . Restricted objects can- 
 not belong to sets. 
 
 1.2.2. Definitions and Assignments 
 GASP objects are available to the programmer only through one level of 
 
^9 
 
 indirect addressing. The programmer deals with names which refer to objects 
 which have values. This relationship (shown in figure 11) is very important 
 for the understanding of GASP. 
 
 name 
 
 definition 
 
 assignment 
 
 values 
 (for set, etc.) 
 
 Figure 11 
 
 In order to distinguish one level from the other, we will use two sets 
 of words. A name is defined if it references an existing object, otherwise it 
 is undefined. A name may change its definition; that is, it can be made to 
 refer to a different object. The number of names referring to an object may 
 vary from zero to any reasonable positive integer. 
 
 When the contents of a set are changed, we say the set is assigned a 
 new value . Also, the object involved is assigned a new value . 
 
 The main advantage of this indirect addressing scheme is that not all 
 objects need to be accessed by permanent names. E.g., all objects in a set 
 S may be made accessible by means of the statement "FOR (ALL, X, S)", even if 
 no object in S has previously been given a name. In this case, we regard 
 the bound variable X as a name whose definition ranges over all the objects 
 in the set S. 
 
 
 1.2.3. Graphs 
 
 A graph is represented as an object whose set contains the branches and 
 nodes belonging to the graph. Each node and each branch is an unrestricted 
 object. In this first implementation, the set associated with a branch is the 
 
50 
 
 set of two incident nodes; the set associated with a node is the set of all 
 incident branches and adjacent nodes. 
 
 1.2. A. System Objects 
 
 GASP reserves a few restricted objects for special use. NULLSET is a 
 read-only object whose set is always empty. AC is an object whose set holds 
 intermediate values of set operations. USED is the object whose set contains 
 all currently active unrestricted objects. NODES and BRANCHES are the objects 
 whose sets contain all nodes and branches (respectively) belonging to the 
 union of all graphs. 
 
 1.2.5. GASP Statement Forms 
 
 GASP is an extension of PL/1 (through the use of the PL/1 Preprocessor). 
 Thus any PL/1 statement could be considered a GASP statement. The 'pure' GASP 
 statements fall into three categories: PL/1 statements which declare or assign 
 values to GASP data types; GASP Procedure calls which constitute a complete 
 statement starting with CALL (unless the procedure name begins with $) and 
 ending with a semi-colon; Type - functions where type is one of the following: 
 name, integer, truth value. A type-function call can be inserted almost any- 
 where a 'type' variable is allowed. 
 
 1.3. The GASP Statements for Sets 
 
 1.3.1. Notation 
 
 In the instruction set that follows, the actual code that must appear 
 (as spelled) is capitalized while arbitrary names used as arguments are not. 
 GASP statements will frequently be set off by quotation marks which are ob- 
 viously not part of the code. 
 
 
51 
 
 1.3.2. Declarations 
 
 GASP variables are declared just as regular PL/1 variables are declared. 
 Conversion from terms in section 2 to actual program words is shown below. 
 Formal Description Computer Code 
 
 name (and unrestricted object) $NAME 
 
 integer $INTEGER 
 
 character string $CHAR 
 
 truth value $BIT 
 
 restricted object $MAXSET 
 
 NOTE: Declaring a name (or unrestricted object) does not define it. However, 
 'DCL x $MAXSET;' will create a restricted object and that object will be the 
 definition of x. $MAXSET is the only declaration which cannot be factored; 
 'DCL (SI, S2) $MAXSET;' would result in both names SI and S2 referring to 
 the same object. 
 
 1.3.3. Definitions 
 
 A name, x, can be defined in two other ways besides 'DCL x $MAXSET; ' 
 [previous paragraph] . 
 
 ' $ALLOC (x) ;' creates (storage for) a new unrestricted object which 
 will become the definition of the name x (previously declared $NAME) . Also 
 the character string part of this object will be assigned the value 'x 1 . 
 
 'x = name-expression ; ' will define (or redefine) the name x to refer 
 to the object named by name-expression (name-expression can be either a pre- 
 viously defined name or an arbitrary expression which computes a name value) . 
 
 1.3.4. Freeing of Storage 
 
 The storage taken up by an object can be freed as follows: 
 '$KILL (x) ;' will free the unrestricted object named x and the name 
 x will become undefined. 
 
52 
 
 'CALL POP (x) ;' will free the restricted object named x and leave 
 x undefined. 
 
 1.3.5. Operations on Sets 
 
 1.3.5.1. Notation 
 
 Nearly all arguments will be defined names. In the following examples, 
 those names beginning with 's' are to be considered as sets (of any object) 
 and those beginning with 'e' should be considered as elements of sets ('e 1 
 names must refer to unrestricted objects). As is the case with most GASP 
 operations, no names are changed by the instructions in this section. 
 
 1.3.5.2. Truth Value Functions 
 
 '$IS-IN(e, s)' answers "does e belong to s?". 
 '$EQUALS(sl, s2)' answers "does set si = set s2?". 
 '$EMPTY_(s)' is equivalent to '$EQUALS(s, NULLS ET) ' . 
 
 1.3.5.3. Procedure Calls 
 
 '$STORES (sname, s_expression) ; ' will assign to the set named sname 
 the value of the set named s_expression (which remains unchanged). 
 
 '$CLEAR(s); f is equivalent to '$STORES (s, NULLSET) ;'. 
 
 '$CHANGES (s, elem, op)', where op = ADD or DELETE, will add (delete) 
 elem to (from) the set s. If this does not change the truth value of the 
 expression "elem e s", then it is a harmless waste of time. 
 
 '$CSES (s, e) ;' (Clear and Store Element in Set) assigns to s the 
 value {e}. 
 
 1.3.5.4. Integer Functions 
 
 'CARD(s)' returns the integer number of elements belonging to s. 
 
 1.3.5.5. Name Functions (Choice) 
 
 The functions in this section pick elements out of sets with varying 
 
53 
 
 side effects. 
 
 'ELEM_OF(s) ' will return the name of an object belonging to the set s. 
 This statement should not be used unless it is known that s is not empty. 
 The set s is unchanged by this instruction. 
 
 *CAN_PIC (e, s)' is a truth value function which will answer "CAN one 
 PICk an element from s?". If set s is not empty, e will name an object 
 belonging to s, which will then be deleted from s. If s is empty, e 
 will be undefined. 
 
 'ITH_EL (s, i)' will return (withoug deleting) the i-th element of the 
 set s, where i is an integer. Since this depends on the arbitrary (but 
 fixed) ordering of elements in the implementation of the set, it has little use, 
 
 'RANDEL(s)' will return a randomly chosen element from the set s, with- 
 out deleting it. 
 
 1.3.5.6. Name Functions (Intermediate Results) 
 
 The name functions in this section all perform some operation on the 
 input sets and store the result in the AC. The name returned is always AC. 
 The input sets remain unchanged (unless one of them is AC). 
 
 'UNION (si, s2)' takes the union of sets si and s2. 
 
 'INTER (si, s2) ' takes the intersection of sets si and s2. 
 
 'COMPL (s)' takes the complement of the set s with respect to the 
 universal set of unrestricted objects (useless by itself). 
 
 'DIFF (si, s2)' contains those objects belonging to si but not to s2 . 
 
 'SYMDIF (si, s2) ' contains those objects belonging to exactly one of 
 the sets si and s2 (exclusive or) . 
 
 1.3.6. Saving Object Values 
 
 'CALL PUSH (x) ; ' saves the value of the object named x. 
 
^ 
 
 'CALL POP (x) ;' restores the saved value of the object named x. For 
 example, consider the following code: 
 
 CALL PUSH (x); 
 2 
 
 CALL POP (x) ; 
 4 
 
 The values (of all the parts) of the object named x will be the same at 
 points 1, 2, and 4 regardless of the values at point 3. The definition of 
 x remains unchanged throughout. 
 
 As the words 'push' and 'pop' imply, any number of copies of an object 
 may be saved in this way, and restored in the usual 'last in - first out' 
 order. Implementation restrictions will limit the number of saved objects 
 at any point during execution. 
 
 1.3.7. I/O 
 
 GASP does not aid the user in the input of sets. 
 
 'CALL PELEMSK (s) ; ' (Print ELEMent and SKip to next line) will print 
 the character string of the object s and the character string of all objects 
 belonging to the set of s. 
 EXAMPLE: 
 
 DCL (SI, S2, El, E2, E3, E4) $NAME; 
 
 $ALLOC (SI) ; $ALLOC (El); $ALLOC (E2); 
 
 E3 = El; E4 = E2; S2 = SI; 
 
 $CSES (S2, E3); $CHANGES (S2, E4 , ADD); 
 
55 
 
 CALL PELEMSK (S2) ; END; 
 would generate the output line 
 
 SI =■ (El, E2) 
 and skip to the next line. 
 
 *$PUT (var-name);' is like 'PUT DATA (var-name) ; ' but with no restrictions 
 on var-name. 
 
 'CALL ABDUMP;' dumps the entire data base (of objects). 
 
 Regular PL/1 I/O is also available. 
 
 1.3.8. Expanding Operations 
 
 'EXPAND2 (subr, set)' is a name function which returns AC. Subr may be 
 any name function (e.g., UNION, INTER, SYMDIF) which takes two names as arguments 
 and performs a binary (usually associative and communtative) operation on their 
 sets, returning the name of the set which holds the result. For example, if 
 s = {el, e2, e3}, then EXPAND2 (UNION, s) is equivalent to UNION(el, UNION 
 (e2, e3)). If s is empty, then EXPAND2 (subr, s) is empty, and if 
 s = {el} then EXPAND2 (subr, s) = (the value of the set) el. 
 
 'CALL EXPAND1 (subr, set);' is a procedure call which can be used with 
 any procedure subr which takes one name as input. EXPAND1 will call this 
 routine with set as the argument, and then will call it with each object 
 belonging to set as the argument. Useful choices for subr include PELEMSK, 
 PUSH, and POP. 
 
 1.3.9. Loop Control 
 '$FOR (q, x, s);' 
 code 
 
 '$END; ' 
 allows code to be executed iteratively with each iteration having a different 
 definition of the name x chosen from the set s. The quantifier, q, may 
 
56 
 
 be any number, including ANY (equivalent to 1) and ALL (equivalent to cardinalit 
 of set s). Code will be executed minimum (q, ALL) times. 's' may be any name 
 or name function. Once the $FOR statement is executed, changing s will not 
 affect or be affected by the iterations. The $FOR - $END pair is a PL/1 block, 
 and the bound variable x is automatically declared within this block (it 
 need not be declared before). 
 
 The normal exit from a $FOR - $END section is to the next statement 
 after $END. Any other jump outside must be expressed as 'GO_TO label ;'. 
 
 '$4ALLPAIRS (bvl, bv2, s) ' 
 
 code 
 
 '$4APEND; * 
 is similar to the $FOR statement except that code is executed for all possible 
 unordered choices of bound variables bvl and bv2 subject to bvl, bv2 e s 
 and bvl * bv2. Abnormal exits must be through the statement 'G0_2 label;'. 
 
 1.4. The GASP Statements for Graphs 
 
 Nodes will be denoted by n, nl, n2; branches by b, graphs by g. 
 
 1.4.1. Truth Value Functions 
 
 'INCIDENT (n, b)' answers "is n incident to b?". 
 'ADJACENT (n, n2) ' answers "is n adjacent to n2?". 
 
 1.4.2. Simple Information Extraction 
 
 To get the set of adjacent nodes or incident branches of a given n or 
 the set of incident nodes of a given b, use the following name functions (all 
 return AC) : 
 
 'SET_OF_INCIDENT (NODES, n) ' , 
 
 'SET_OF_INCIDENT (BRANCHES, n) ' , or 
 
 'SET OF INCIDENT (NODES, b) ' . 
 
57 
 
 To get the set of nodes in g, use the name function (returns AC) 
 
 '$NOF(g)'. Similarly, the branches of g are obtained by '$BOF(g)'. 
 
 'CALL GET_BAN (b, nl, n2, g) ;' defines nl and n2 to be the endpoints 
 of b in g. 
 
 1.4.3. Advanced Graph Operations 
 
 'NBOUND (nodeset, g) ' is a name function which returns (AC) the set of 
 all nodes of g which do not belong to nodeset but are adjacent to at least 
 one node belonging to nodeset. 'BBOUND (nodeset, g)' is a name function which 
 returns (AC) the set of all branches of g which have exactly one endpoint 
 belonging to nodeset. 
 
 'CALL INTBANS (s, g) ;' is a procedure call which returns with s re- 
 assigned the value of the subset of branches of g which have both endpoints 
 belonging to s (a set of nodes at input time). 
 
 'D1ST (nl, n2, g) ' is an integer function which returns the distance 
 from nl to n2 in g. 
 
 'CALL COLAPS (b, g) ; ' is a procedure call which changes g by merging 
 the endpoints of b into a single node and removing any branches connecting 
 those endpoints (such as b). 
 
 'CALL DELBAN (b, g) ;' is a procedure call which deletes all trace of 
 b from g. 
 
 'CALL DELNOD (n, g) ; ' is a procedure call which deletes n and all of 
 its incident branches from g. 
 
 1.4.4. Graph I/O 
 
 'CALL READGR (g) ;' is the procedure call to input g. The input format 
 is a sequence of paths of node numbers (from 1 to the number of nodes) . A new 
 path is begun by a minus sign in front of the starting node [only Euler graphs 
 can be given by a single path]. The entire sequence is terminated by a zero. 
 READGR also will output g (see below). 
 
58 
 
 EXAMPLE: Given the output sequence 
 
 1, 2, 4, -2, 3, 4, 1, 0, 
 READGR would create the graph shown in figure 12. 
 
 Figure 12 
 
 'CALL DEF_BAN (b, nl, n2, g) ;' is a procedure call which will create a 
 branch b connecting nl and n2 in g. 
 
 ' $PUTGRAPH (g) ;' is the procedure call which outputs g as a set of 
 nodes and branches. It is equivalent to 'CALL EXPAND1 (PELEMSK, g) ;'. 
 
 1.5. Measuring GASP Programs 
 
 A count of the number of executions of each block of a GASP program is 
 accomplished with the following statements [even though they are complete 
 statements, they need not be followed by a semicolon]. 
 
 '$DCLSTAT(k) ' declares k integers to be used for counting. 
 
 '$STAT' is placed in each logical block to be counted. 
 
 '$CLEARSTAT' initializes the counts to zero. 
 
 '$0UTSTAT' prints out the k integers, in the order that the '$STAT"'s 
 appeared (compilation-wise, not execution-wise). 
 
 1.6. Implementation Details 
 
 1.6.1. Data Structure 
 
 A Universal SET (USET) contains all GASP objects. USET is a PL/1 struc- 
 

 59 
 
 ture subdivided into $TSIZE objects (level name is ELEMENT) of which $SIZE are 
 unrestricted. The current systems has $SIZE»64 and $TSIZE»127, but these can 
 be changed easily. The set part (SSET) of an object is a bit string of length 
 $SIZE (this is the only reason for restricted objects: an increase in the number 
 of restricted objects increases the memory requirement only linearly; an increase 
 in unrestricted objects increases memory requirements quadratically) . The other 
 parts of an object are CHARP (CHARacter string Part), INTP (INTeger Part), REFP 
 (REFerence Part), RP_2 and RP_3 (Reference Part 2 and 3). PL/1 declarations for 
 the various data types [1.2.1.] are as follows: $CHAR - CHAR (8), $INTEGER = 
 BIN FIXED (15), $NAME = BIN FIXED, and $BIT = BIT (1). 
 
 1.6.2. How GASP Works 
 
 GASP procedures which require only a line or so of code are translated 
 by the PL/1 preprocessor. The identifiers which are translated by the GASP 
 macros usually start with a '$'. 
 
 Longer GASP procedures are incorporated into the programs as separate 
 PL/1 procedures. The user has a choice of two methods which include these 
 procedures in his program. The more efficient way is to include them as pre- 
 compiled external procedures. The more flexible way is to have their source 
 code inserted into the main program: this allows the user to set the limits 
 $SIZE and $TSIZE to fit his needs. 
 
 1.6.3. Cost Parameters 
 
 Since the PL/1 preprocessor and PL/1 compiler are used, compilation time 
 is usually large, run time is usually reasonable. For example, a typical program 
 took 20 seconds to compile, 6 seconds to execute. 
 
 The core requirement for the basic GASP programs and data is around 120k 
 bytes, a typical program might require a total of 150k bytes. 
 
6o 
 
 GASP macro definitions require 206 lines of PL/1 code, the source code of 
 GASP procedures is around 240 lines. 
 
 1.6.4. Implementation Defects 
 
 When coding a binary set operation (e.g., 'UNION (SI, S2)'), one must 
 make sure that at least one of the arguments is not AC. 
 
 'G0_T0' and 'G0_2' are precompiled into more than one PL/1 statement and 
 therefore should not appear immediately after a 'THEN' . 
 
 
61 
 2. THE NEW ALGORITHM PROGRAMMED IN GASP 
 
 NEW: PROC(GtN); DCL G $NAME, N SINTEGER; 
 
 DCL ( SET_PJ t XJ_NODES, AJ, XJ, P(N) ) $NAME, 
 
 ( IS_DISCON, D(N) ) $BIT, J SINTEGER; 
 
 %DCL ( KJ, ZJ, SAVED, $CARD ) CHAR; /* USE PARTS OF OBJECTS */ 
 
 % KJ = •INTP(XJ)' ; % ZJ = »REFP(AJ)« ; % $CARD = «INTP' ; 
 
 % SAVED = 'RP_2' ; /* RP_2 POINTS TO THE SAVED VALUE OF OBJECTS */ 
 
 IF N < 3 THEN RETURN ; $ALLOC ( TEM P ) ; 
 
 $ALLOC(XJ ); $ALLOC(SET_PJ); $AL LOC ( X J_NODES ) 5$ALLOC( A J ) ;$ALLOC(YJ ) ; 
 BOXl: J = 1 ; 
 
 P(l) = NODES#(l); $CSES ( SET_PJ, P(l) ) ; 
 
 SSTORES ( XJ, SET_OF_INCIDENT ( BRANCHES, P(l))) *, 
 
 SSTORES ( YJ, $BOF ( G ) ) ; 
 
 D( 1 ) = YES ; 
 
 IS_DISCON = NO ; 
 BOX2 : $STAT /* COUNT C2(G, NEW) */ 
 
 IF D(J) THEN DO; $STAT 
 
 SALLOC(ZJ); $STORES( ZJ, XJ ) ; 
 
 $STORES(XJ_NODES, DIFF( EXPAND2 ( UN I ON , X J ) , SET_PJ ) ) ; 
 /* NODE BOUNDARY ( P 1 , P2 , . . . , P J ) */ 
 
 SCARD ( XJ_NODES ) = CARD ( XJ_NODES ) J END; 
 
 KJ = ; 
 BOX3 : IF -<CAN_PIC( P( J + l) ,XJ_NODES) THEN SIGNAL ERROR; 
 
 $CARD ( XJ_NODES ) = SCARD ( XJ_NODES ) - 1 ; 
 
 $STORES ( AJ, INTER ( XJ, P(J+1))) ; 
 
 SSTORES ( XJ, DIFF ( XJ , AJ ) ) ; 
 
 
62 
 
 SSTORES ( YJ, DIFF ( YJ, AJ ) ) ; 
 
 CALL PUSH (XJ) ; /* X(J) <- X(J-l) */ 
 
 $STORES ( XJ, UNION ( XJ, I NTER ( Y J , P ( J+l ) ) ) ) ; 
 /* I.E.,X(J+1)<- BOUND (PI, P2, . . . , P(J+1)) */ 
 XJ_EMPTY: IF $EMPTY_( XJ ) THEN DO; $STAT 
 
 CALL POP ( XJ ) ; GO TO DISCON ; END ; 
 
 SCHANGES ( SET_PJ,P( J+l ),ADD ) ; 
 
 CALL PUSH ( YJ ) ; /* Y(J+l) <- Y(J) */ 
 
 CALL PUSH ( XJ_NODES ) ; 
 
 IF -> D( J ) THEN DO ; 
 
 SSTORES ( TEMP, DIFF ( AJ, ZJ ) ) ; 
 
 IF -n $EMPTY_( TEMP ) THEN DO; $STAT 
 
 /* MOW PAY THE PRICE FOR INCORRECT XJ */ 
 
 $STORES ( AJ, INTER ( AJ, ZJ ) ) ; 
 
 SSTORES ( SAVED ( YJ ), UNION(YJ, TEMP) ) ; 
 
 SSTORES ( SAVED ( XJ ), UN I ON ( SA VED ( XJ ) , TEMP) ) ; END; 
 
 D ( J + 1 ) = YES ; 
 
 GO TO BUMP ; END; 
 /* FLSE IF D(J) THEN */ 
 
 IF SCARD ( XJ_NODES ) > THEN D ( J + 1 ) = NO *, 
 
 ELSE DO; $STAT D(J+1) = YES J 
 
 SSTORES ( AJ, ZJ ) ; SKILL(ZJ) J END; 
 BUMP: KJ = KJ + l ; 
 
 CALL PUSH(AJ) ; 
 
 J = J + 1 ; 
 J_EO_N : IF J < N-l THEN GO TO BOX2 ; 
 B0X4 : SSTAT /* CMG, NEW) */ 
 
63 
 
 IF D(J) THEN $STORES ( AJ, YJ) ; 
 
 ELSE SSTORES ( AJ, INTER(ZJ, YJ) ) ; 
 
 /* 'OUTPUT Al X A2 X • . • X A(N-1)» COMES HERE */ 
 BOX5 : J = J - 1 ; 
 
 CALL POP(AJ); CALL POP(YJ); CALL POP(XJ); CALL POP ( X J_NODES ) ; 
 
 SCHANGES ( SET_PJ, P(J+1), DELETE ) ; 
 
 IF IS_DISCON THEN GO TO DISCON ; 
 XJ_EMPTY_: IF $CARD (XJ_NODES) > THEN GO TO BOX3 ; 
 J_EO_l : IF J = 1 THEN GO TO RETURN_ ; 
 
 GO TO BOX5; 
 niSCOM : $STAT 
 
 IF O(J) THEN DO; $KILL(ZJ); $STAT END; 
 
 IS_DISCON = ( KJ = 1 ) ; 
 
 GO TO J_EO_l ; 
 RETURN_ : $KILL(YJ); $KILL(AJ); $KILL(XJ); $K I LL ( X J_NODES ) ; 
 
 $KILL(TEMP); $KILL(SET_PJ) ; 
 
 RETURN; END NEW; 
 
6k 
 
 VITA 
 
 The author, Stephen Martin Chase, was born in Urbana, 
 Illinois, on September 21, 19^3* He received his Bachelor of Science 
 degree in Mathematics in June 1965, and his Master of Science degree 
 in Mathematics in June 1967 from the University of Illinois. From 
 June 1965 to June 19J0, he was a research assistant in the Department 
 of Computer Science of the University of Illinois at Urb ana- Champaign . 
 In June 1970, he joined the research staff of the Thomas J. Watson 
 Research Center in Yorktown Heights, New York. 
 

■■-■■ 
 

 
tf0* 
 
 \&tt