UIUCDCS-R-77-905                                   UILU-ENG 77 1759

Average Analysis of Simple Path Algorithms

Yehoshua Perl*
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois

November 1977

*On leave from the Department of Mathematics and Computer Science, Bar-Ilan University, Ramat-Gan, Israel.
**This work was supported in part by the National Science Foundation under Grant No. NSF MCS-73-03408.

Abstract

Given a graph of n vertices and e edges, the average complexity of several known simple path algorithms is analyzed. The average and the standard deviation of the number of edges scanned to find the target vertex in both breadth first search and depth first search are shown to be of order n. Both the average and the variance of Prim's minimum spanning tree algorithm are shown to require O(n lg n ln(e/n)) time. The same result holds for Dijkstra's shortest path algorithm. Kruskal's minimum spanning tree algorithm, which competes with Prim's algorithm, requires O(n ln n lg e) on the average. The connection to related results is discussed.

1. Introduction

Given an undirected graph G(V,E) of n vertices and e edges, we discuss the average complexity of several known path algorithms: breadth first search (BFS), depth first search (DFS), Prim's and Kruskal's minimum spanning tree (MST) algorithms, and Dijkstra's shortest path algorithm. The analysis for directed graphs is essentially the same.

Recently there have been attempts to design new algorithms which are very efficient on the average even though they are no better in the worst case; for example, the transitive closure algorithm of Bloniarz, Fischer and Meyer [1] and the algorithm of Spira [15] for finding shortest paths between all pairs of vertices in a graph. Such works are motivated by the fact that in practice the average complexity of an algorithm is often more relevant than its worst case complexity. We analyze the average complexity of several known algorithms because, while designing new algorithms which are efficient on the average, we should know the average complexity of the known algorithms in order to compare them.

Average analysis of simple algorithms might also lead to the analysis of more complicated algorithms. Moreover, simple algorithms are sometimes applied as procedures in compound algorithms, and their analysis may help to analyze the complexity of the compound algorithms. For example, Dinic's maximum flow algorithm [5] applies both BFS and DFS.

We analyze the average complexity of the algorithms for a random graph, assuming equal probability over all graphs of n vertices and e edges, and assuming that the lengths of the edges are independently chosen from a non-negative distribution. We also assume that for every vertex a list of its edges is given. The connection of our results to the related works of Bloniarz, Fischer and Meyer [1], Spira [15] and Johnson [8] is discussed.
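The random graph model just described is easy to make concrete. The following Python sketch is our illustration, not part of the report; the function name random_graph and the choice of an exponential length distribution are assumptions. It draws a graph uniformly from all graphs with n vertices and e edges, attaches i.i.d. non-negative lengths, and returns the per-vertex edge lists the analysis assumes.

```python
import itertools
import random

def random_graph(n, e, length_dist=lambda: random.expovariate(1.0)):
    """Sample uniformly from all simple graphs with n vertices and e edges,
    with i.i.d. non-negative edge lengths. Illustrative sketch only; the
    exponential distribution is an arbitrary choice satisfying the model's
    assumption of a non-negative length distribution."""
    all_pairs = list(itertools.combinations(range(n), 2))
    chosen = random.sample(all_pairs, e)   # uniform over all e-edge subsets
    adj = {v: [] for v in range(n)}        # a list of its edges per vertex
    for u, v in chosen:
        l = length_dist()                  # independent non-negative length
        adj[u].append((v, l))
        adj[v].append((u, l))
    return adj
```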
Let d(v) denote the degree of a vertex v and l(u,v) the length of an edge (u,v). By ln x and lg x we denote the natural and binary logarithms, respectively.

2. Breadth First Search and Depth First Search

The two common procedures for scanning a graph are breadth first search (BFS) and depth first search (DFS). Both of them require, in the worst case, scanning all the edges of the graph. We want to find the average number of edges that must be scanned in order to find a path from a source vertex s to a target vertex t. Actually, we need the average number of edges scanned until we reach the vertex t for the first time, since from then on one can backtrack along the edges of the path, using pointers prepared while scanning.

Let us find the average behavior of BFS and DFS for a random graph. Following Erdos and Renyi [6] we assume equal probability over all graphs of n vertices and e edges, implying that the probability of every pair of vertices being connected by an edge equals $e/\binom{n}{2}$. The probability that an edge emanating from a vertex v ≠ t enters the target vertex t is p = 1/(n-1) (note that the list of the edges of the second vertex scanned, for example, contains an edge back to the source vertex).

Both BFS and DFS begin scanning at the source vertex s and continue to scan I edges emanating from vertices which were already visited, until the vertex t is reached. They differ in the order of scanning the edges: BFS scans the edges from the vertices in first visited first scanned order, while in DFS the order is last visited first scanned. (These seem to be extensions of the FIFO and LIFO orders, where every element is processed several times.)

Actually, while scanning the i-th edge of a vertex v, the probability of reaching t is 1/(n-i) and not 1/(n-1), since i vertices different from t are not reachable through this edge. But since the degrees of the vertices and the order of their scanning are not known, we shall use 1/(n-1) as a lower bound for the probability, obtaining an upper bound for the average number of edges scanned until reaching t, for both BFS and DFS or any other order of scanning. (Actually the probabilities for the first edges scanned in BFS are higher than in DFS, since in BFS more edges of the same vertices are scanned first, but again it is not clear how to calculate this difference.)

$$E(I) \le \sum_{i=1}^{e} i\,p(1-p)^{i-1} \le p \sum_{i=1}^{\infty} i\,(1-p)^{i-1} = \frac{1}{p} = n-1 .$$

The variance is

$$\mathrm{Var}(I) \le E(I^2) \le \sum_{i=1}^{\infty} i^2\,p(1-p)^{i-1} = \frac{2-p}{p^2} < \frac{2}{p^2},$$

so $\mathrm{Var}(I) < 2(n-1)^2$, and the standard deviation is $\sigma(I) < \sqrt{2}\,(n-1)$.

Note that our bound applies also in cases where the algorithm stops earlier because there is no path from s to t.

In practice we may assume that the number of edges required to reach the target t is linear in the number of vertices, since both the average and the standard deviation are linear in n. This result is interesting since it is independent of the density of the graph. Although we cannot calculate the difference between BFS and DFS, it is clear that BFS is slightly more efficient on the average. This may be expected, since BFS finds a shortest path while DFS does not necessarily do so.

A related result was obtained by Bloniarz, Fischer and Meyer [1] in their average analysis of a transitive closure algorithm, which essentially applies BFS from every vertex in the graph. They find that BFS requires scanning n ln n edges on the average in order to scan all the vertices reachable from s. Actually their proof is also valid for DFS.
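The quantity I can be computed directly. The Python sketch below (our own rendering of the two procedures, not code from the report) counts the edges scanned until the target is first reached; adj is the adjacency-list representation from the earlier sketch, and the use_bfs flag merely switches the scanning order between FIFO (BFS) and LIFO (DFS).

```python
from collections import deque

def edges_scanned_until_target(adj, s, t, use_bfs=True):
    """Scan edges out of already-visited vertices (FIFO order for BFS,
    LIFO for DFS) and return the number of edges scanned when t is first
    reached, or None if t is unreachable from s."""
    visited = {s}
    frontier = deque([s])
    scanned = 0
    while frontier:
        v = frontier.popleft() if use_bfs else frontier.pop()
        for u, _length in adj[v]:
            scanned += 1
            if u == t:
                return scanned          # t reached; backtracking via pointers
            if u not in visited:
                visited.add(u)
                frontier.append(u)
    return None
```

Averaging the returned count over many random instances should approach the O(n) bound derived above, for either scanning order.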
Our result shows that while there is no difference in the worst case between scanning for one target vertex and scanning all vertices, there is a difference in the average case.

Now, what is the average number of vertices scanned in BFS until reaching t? Let d denote the degree of the target vertex t. The probability of reaching t while scanning the source vertex s is p = d/(n-1), since this is the probability that s is one of the d neighbours of t. The probability of reaching t while scanning the i-th vertex, assuming t was not reached before, is d/(n-i) > p, since the i vertices already scanned are not possible neighbours of t. Again we shall use p as a lower bound for the probabilities, obtaining an upper bound for the average number of vertices scanned, assuming degree d for t:

$$E(I,d) \le \sum_{i=1}^{\infty} i\,p(1-p)^{i-1} = \frac{1}{p} = \frac{n-1}{d},$$

and the variance

$$V(I,d) \le \frac{2}{p^2} = 2\left(\frac{n-1}{d}\right)^2 .$$

The probability that the degree of t is d is $\binom{n-1}{d} q^d (1-q)^{n-1-d}$, where $q = e/\binom{n}{2}$ is the probability of an edge connecting two given vertices. If t is an isolated vertex then at most n-1 vertices are scanned. Thus the average number of vertices scanned until reaching t is

$$E(I) \le (n-1)(1-q)^{n-1} + \sum_{d=1}^{n-1} E(I,d) \binom{n-1}{d} q^d (1-q)^{n-1-d} \le (n-1)(1-q)^{n-1} + \sum_{d=1}^{n-1} \frac{n-1}{d} \binom{n-1}{d} q^d (1-q)^{n-1-d}.$$

Let us simplify the right hand side of the inequality. Denoting m = n-1 and using the identity $\binom{m}{d} = \sum_{i=d}^{m} \binom{i-1}{d-1}$ we obtain

$$m(1-q)^m + m \sum_{d=1}^{m} \frac{1}{d} \binom{m}{d} q^d (1-q)^{m-d} = m(1-q)^m + m \sum_{d=1}^{m} \frac{1}{d}\, q^d (1-q)^{m-d} \sum_{i=d}^{m} \binom{i-1}{d-1} = m(1-q)^m + m \sum_{i=1}^{m} \sum_{d=1}^{i} q^d (1-q)^{m-d}\, \frac{1}{d}\binom{i-1}{d-1},$$

and by the identity $\frac{1}{d}\binom{i-1}{d-1} = \frac{1}{i}\binom{i}{d}$,

$$= m(1-q)^m + m \sum_{i=1}^{m} \frac{1}{i} \sum_{d=1}^{i} q^d (1-q)^{m-d} \binom{i}{d} \le m\left(1 + \frac{1}{H_m}\right) \sum_{i=1}^{m} \frac{1}{i} \sum_{d=0}^{i} q^d (1-q)^{m-d} \binom{i}{d},$$

where $H_m = \sum_{i=1}^{m} 1/i = \ln m + O(1)$. (The inequality holds since the added d = 0 terms contribute $H_m(1-q)^m$ to the double sum, so the first term $m(1-q)^m$ is at most $1/H_m$ times m times the enlarged sum.) Writing $(1-q)^{m-d} = (1-q)^{m-i}(1-q)^{i-d}$ and applying the binomial theorem,

$$= m\left(1 + \frac{1}{H_m}\right) \sum_{i=1}^{m} (1-q)^{m-i}\, \frac{1}{i} \sum_{d=0}^{i} \binom{i}{d} q^d (1-q)^{i-d} = m\left(1 + \frac{1}{H_m}\right) \sum_{i=1}^{m} \frac{(1-q)^{m-i}}{i} = m\left(1 + \frac{1}{H_m}\right) \sum_{i=0}^{m-1} \frac{(1-q)^{i}}{m-i}.$$

No closed formula was found for this series; thus

$$E(I) \le (n-1)\left(1 + \frac{1}{H_{n-1}}\right) \sum_{i=0}^{n-2} \frac{\left(1 - e/\binom{n}{2}\right)^{i}}{n-1-i}.$$

3. Prim's Minimum Spanning Tree Algorithm

Consider Prim's nearest neighbour algorithm [13] for finding an MST of a given graph. We begin with a subtree T containing one arbitrary vertex u. In each step we choose a vertex v ∉ T with minimum distance D(v) to a vertex of the subtree T. Then v is added to T and the distances D(u_i) of the neighbours u_i ∉ T of v to the subtree T are updated if possible:

$$D(u_i) = \min(D(u_i),\, l(v,u_i)).$$

The straightforward implementation of this algorithm requires O(n^2) time. Thus Prim's algorithm is efficient for complete graphs, since O(n^2) edges must be scanned to obtain an MST. On the other hand, for sparse graphs, where e ≪ n^2, the straightforward implementation of Prim's algorithm is inferior to Kruskal's MST algorithm [11], which requires O(e lg e) time. But for sparse graphs there exists an O((e+n) lg n) implementation [12] of Prim's algorithm, using a heap (see for example [14]) as a priority queue for choosing the nearest neighbour to the subtree T. The vertices v ∉ T are stored in a heap according to their distances D(v) from T. Thus either choosing the nearest neighbour or updating D(v) requires at most O(lg n) time. (For every vertex v in the heap we keep a pointer to its place in the heap, which is updated while v moves in the heap.) There are at most e updates throughout the algorithm.
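A compact rendering of this heap-based implementation is sketched below in Python. Instead of the position-pointer decrease-key just described, it uses the common lazy-deletion variant, in which an update pushes a fresh heap entry and stale entries are skipped on extraction; this substitution and all names are ours, not the report's.

```python
import heapq

def prim_mst(adj, root=0):
    """Prim's nearest-neighbour MST with a binary heap and lazy deletion.
    Returns the MST edges of root's component as (parent, child) pairs."""
    INF = float("inf")
    D = {v: INF for v in adj}        # current distance of v to the subtree T
    parent = {v: None for v in adj}
    D[root] = 0.0
    heap = [(0.0, root)]
    in_tree = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in in_tree or d > D[v]:
            continue                  # stale entry: v already added or updated
        in_tree.add(v)                # v is the nearest neighbour of T
        for u, length in adj[v]:
            if u not in in_tree and length < D[u]:
                D[u] = length         # the update D(u) = min(D(u), l(v,u))
                parent[u] = v
                heapq.heappush(heap, (length, u))
    return [(parent[v], v) for v in adj if parent[v] is not None]
```

Lazy deletion trades the position pointers for at most e extra heap entries, so the worst-case bound stated next is unaffected.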
Thus the priority queue implementation of Prim's algorithm requires at most O((e+n) lg n) time.

Consider now the average complexity of this implementation. We assume that the lengths of the edges are independently chosen from a non-negative distribution. Let N(v) denote the number of times D(v) is updated in the algorithm; then N(v) ≤ d(v), where d(v) denotes the degree of v. $M = \sum_{v} N(v)$ is the total number of updates in the algorithm.

We refer to the edges of the vertex v in the order of their scanning from T in the algorithm. The variable $X_i(v)$, i = 1,2,...,d(v), has value 1 if D(v) was updated through its i-th edge and 0 otherwise. Clearly $N(v) = \sum_{i=1}^{d(v)} X_i(v)$.

The distance D(v) is updated through the i-th edge from T to v if this edge is shorter than the i-1 previous edges from T to v, and the probability of this is 1/i. Thus

$$E(X_i(v)) = \frac{1}{i}, \qquad \mathrm{Var}(X_i(v)) = E(X_i^2(v)) - E^2(X_i(v)) = \frac{1}{i} - \frac{1}{i^2}.$$

The distance D(v) is updated through some of the edges of v until v is added to T. Hence

$$E(N(v)) \le \sum_{i=1}^{d(v)} E(X_i(v)) = \sum_{i=1}^{d(v)} \frac{1}{i} = H_{d(v)} = \ln d(v) + O(1).$$

The event that D(v) was updated through the i-th edge is independent of the event that D(v) was updated through the j-th edge. Thus the variables $X_i(v)$, i = 1,2,...,d(v), are independent, and

$$\mathrm{Var}(N(v)) = \sum_{i=1}^{d(v)} \mathrm{Var}(X_i(v)) = \sum_{i=1}^{d(v)} \left(\frac{1}{i} - \frac{1}{i^2}\right) = H_{d(v)} - H^{(2)}_{d(v)} = \ln d(v) + O(1),$$

so Var(N(v)) ≤ E(N(v)).

The average of the total number of updates in the algorithm, $M = \sum_v N(v)$, is

$$E(M) = \sum_{v} E(N(v)) \le O\left(\sum_{v} \ln d(v)\right) = O\left(\ln \prod_{v} d(v)\right).$$

The maximum of $\prod_v d(v)$, where $\sum_v d(v) = 2e$, is obtained when all d(v) are equal (up to a difference of 1), i.e., d(v) = 2e/n. Hence

$$E(M) = O\left(\ln \prod_{v} d(v)\right) \le O\left(\ln (2e/n)^{n}\right) = O(n \ln (2e/n)).$$

Using the heap, each update requires at most O(lg n) time. Therefore the average time required by the priority queue implementation of Prim's algorithm is bounded by O(n lg n ln(e/n)).

There is a difficulty in computing the variance of $M = \sum_v N(v)$, since the variables N(v), v ∈ V, are dependent; for example, if N(u) = n-1 then N(v) < n-1 for every v ≠ u. Now

$$\mathrm{Var}(M) = \sum_{v \in V} \mathrm{Var}(N(v)) + 2 \sum_{u,v \in V} \mathrm{Cov}(N(u),N(v)),$$

where

$$\mathrm{Cov}(N(u),N(v)) = \mathrm{Cov}\left(\sum_{i=1}^{d(u)} X_i(u),\ \sum_{j=1}^{d(v)} X_j(v)\right) = E\left(\sum_{i=1}^{d(u)}\sum_{j=1}^{d(v)} \bigl(X_i(u) - E(X_i(u))\bigr)\bigl(X_j(v) - E(X_j(v))\bigr)\right) = \sum_{i=1}^{d(u)}\sum_{j=1}^{d(v)} \Bigl[E\bigl(X_i(u)\,X_j(v)\bigr) - E(X_i(u))\,E(X_j(v))\Bigr].$$

If the i-th edge to u is not actually the j-th edge to v, then $X_i(u)$ and $X_j(v)$ are independent and thus

$$E(X_i(u)\,X_j(v)) - E(X_i(u))\,E(X_j(v)) = 0.$$

Thus the only contribution to Cov(N(u),N(v)) is in the case (u,v) ∈ E. But in such a case it cannot happen that both u was updated from v and v was updated from u. Hence at most one of $X_i(u)$ and $X_j(v)$ equals 1, and $X_i(u) \cdot X_j(v) = 0$ in all cases. Thus if (u,v) ∈ E, say as the i-th edge of u and the j-th edge of v, then

$$\mathrm{Cov}(N(u),N(v)) = -E(X_i(u)) \cdot E(X_j(v)) = -\frac{1}{i} \cdot \frac{1}{j} \le -\frac{1}{n^2}.$$

Hence

$$\mathrm{Var}(M) \le \sum_{v} \mathrm{Var}(N(v)) - 2 \sum_{(u,v) \in E} \frac{1}{n^2} = \sum_{v} \mathrm{Var}(N(v)) - \frac{2e}{n^2}.$$

The fact that Var(M) is slightly smaller than it would be if the variables N(v) were independent is not surprising, since the dependency is of a "negative" nature. By this we mean that the only information we have is that if some N(v)'s are high enough then the values of the others are bounded, but if some are small then we have no information about the values of the others.
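The bound E(M) = O(n ln(2e/n)) can be checked empirically. The sketch below is our own experiment, not from the report; it reuses the hypothetical random_graph helper from the earlier sketch, and the instance sizes are arbitrary.

```python
import heapq
import math
import statistics

def count_updates(adj, root=0):
    """Run heap-based Prim and return M, the total number of times some
    D(v) is decreased through an edge (the quantity analyzed above)."""
    INF = float("inf")
    D = {v: INF for v in adj}
    D[root] = 0.0
    heap, in_tree, updates = [(0.0, root)], set(), 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in in_tree or d > D[v]:
            continue
        in_tree.add(v)
        for u, length in adj[v]:
            if u not in in_tree and length < D[u]:
                updates += 1          # X_i(u) = 1: the i-th edge improves D(u)
                D[u] = length
                heapq.heappush(heap, (length, u))
    return updates

# Hypothetical experiment: the sample mean of M should track n * ln(2e/n).
n, e, trials = 200, 2000, 50
samples = [count_updates(random_graph(n, e)) for _ in range(trials)]
print(statistics.mean(samples), n * math.log(2 * e / n))
```

With the covariance bound above, the variance of M now follows.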
$$\mathrm{Var}(M) \le \sum_{v} \mathrm{Var}(N(v)) \le \sum_{v} E(N(v)) = E(M) = O(n \ln (2e/n)).$$

Since the standard deviation satisfies σ(M) ≤ √E(M), the time required by the algorithm in practice is quite concentrated near the average O(n lg n ln(e/n)).

For dense graphs, where e = Θ(n^2), we obtain O(n lg^2 n) average behavior of the algorithm. But better results are obtained for sparse graphs, because of the careful analysis of the complexity. For example, in case e = Θ(n lg n) the average time required is O(n lg n lg lg n) = O(e lg lg n), which is the worst case behavior of the very efficient algorithms of Yao [17] and Cheriton and Tarjan [2]. Hence, in practice, it might sometimes be reasonable to apply the simpler algorithm of Prim. For graphs which are even sparser, as in planar graphs for example, where e = O(n), the algorithm requires O(n lg n) time on the average.

Our result has some influence on Johnson's O(e) MST algorithm [8]. He shows that for graphs which are dense enough, where e ≥ n^{1+1/k} for a constant positive integer k, Prim's algorithm can be implemented as an O(e) algorithm by using as a priority queue a heap of constant height k (allowing n^{1/k} sons for every vertex in the heap). Johnson's main idea is that the e possible updates may only decrease the value of the updated element, and thus require only climbing up in the heap, which takes at most k operations per update. Eliminating a minimum element of the heap requires at most O(k n^{1/k}) operations, but only n minimum elements are eliminated in the algorithm. Thus the algorithm requires at most ke + k n^{1/k} n = O(e) operations.

Our result shows that even for complete graphs the average number of updates is small enough to make Johnson's implementation inefficient. This consequence is strengthened by the small variance obtained. Thus, from the average point of view, Johnson's implementation is inferior to the binary heap implementation.

4. Dijkstra's Shortest Path Algorithm

Dijkstra's very efficient algorithm [4] for finding a shortest path from a source vertex s to all other vertices in the graph is another example of a nearest neighbour algorithm. A set of vertices S contains all vertices for which the shortest distance from the source s has already been computed; initially S = {s}. For every vertex v, D(v) is the length of a shortest path from s to v through vertices of S. In each step we choose a vertex v ∉ S with minimum distance D(v) from s to be added to S, and the distances D(u_j) of the neighbours u_j ∉ S of v are updated if possible.

As for Prim's algorithm, there is an O((e+n) lg n) implementation [10] of Dijkstra's algorithm using a heap as a priority queue for the distances D(v) of the vertices v ∉ S. The analysis of the average complexity of this implementation is very similar to the analysis of Prim's algorithm. The difference is in the probability of an update of D(v) through the i-th edge. In Prim's algorithm, D(v) is updated through the i-th edge if it is shorter than the i-1 previous edges, and since the lengths of the edges are drawn independently from the same distribution this probability is 1/i. In Dijkstra's algorithm, D(v) is updated through the i-th edge (u,v), where u is the last vertex added to S, if

$$D(u) + l(u,v) < \min_{u_j} \bigl(D(u_j) + l(u_j,v)\bigr),$$

where the u_j are the other neighbours of v in S. The vertices are added to S in non-decreasing order of their distances from s. Thus D(u) ≥ D(u_j), and the probability of an update of D(v) through the i-th edge is bounded by 1/i.
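For concreteness, here is a heap rendering of Dijkstra's algorithm in the same style as the Prim sketch above, again our own illustrative code with lazy deletion substituted for the pointer-based decrease-key.

```python
import heapq

def dijkstra(adj, s):
    """Heap-based Dijkstra: returns D, the shortest distances from s.
    Each update D(v) = D(u) + l(u,v) pushes a new heap entry; stale
    entries are skipped on extraction."""
    INF = float("inf")
    D = {v: INF for v in adj}
    D[s] = 0.0
    heap, done = [(0.0, s)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)                        # u enters S with final distance d
        for v, length in adj[u]:
            if v not in done and d + length < D[v]:
                D[v] = d + length          # the update analyzed above
                heapq.heappush(heap, (D[v], v))
    return D
```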
Now we may apply the same analysis as for Prim's algorithm and obtain that the average time and the variance of Dijkstra's algorithm are bounded by the corresponding results for Prim's algorithm, i.e., O(n lg n ln(e/n)).

Applying Dijkstra's algorithm n times, once for each vertex of the graph, yields an algorithm for all shortest paths in a graph for which the average time and the variance are bounded by O(n^2 lg n ln(e/n)). Spira [15] gives an algorithm for this problem with O(n^2 lg^2 n) average time. The difference of ln(e/n) instead of lg n comes from the careful use of the possible sparseness of the graph. Our result for the variance was obtained by using the "negative" nature of the dependency of the variables and is much lower than the corresponding result in Spira's analysis. Also, our analysis is slightly simpler than Spira's.

There is an important difference between these two algorithms. Spira's algorithm applies n times a heap-priority-queue implementation of a one-source shortest path algorithm, which is actually due to Dantzig [3]. But this last algorithm requires an initial sorting of the edges of the graph, which might take O(n^2 lg n) time, higher than the time required by the straightforward implementations of the main part of the algorithm and of Dijkstra's algorithm, both of which require O(n^2) time. Therefore Spira's algorithm is suggested only for finding all shortest paths in the graph, while Dijkstra's algorithm is efficient for both problems.

As in Prim's MST algorithm, Johnson [9] uses a constant height heap to obtain an O(e) implementation of Dijkstra's algorithm. Our observation in Section 3 holds also for this case.

5. Kruskal's Minimum Spanning Tree Algorithm

In a previous section we analyzed the average behavior of Prim's MST algorithm. Let us now analyze the competing MST algorithm of Kruskal [11]. This algorithm first sorts all the edges of the graph in non-decreasing order. Then we scan the edges in this order, adding to the tree every edge which does not close a cycle with the edges already inserted into the tree. Another implementation of Kruskal's algorithm uses a heap as a priority queue for the edges instead of the initial sorting. Both implementations clearly require O(e lg e) time in the worst case, since checking whether an edge closes a cycle is performed very efficiently using the Union-Merge algorithm [7], [16].

Let us calculate the average behavior of this priority queue implementation by finding the average number of edges taken out of the priority queue. The first two edges, i.e. those with the smallest lengths, must be inserted into the tree. Assuming k edges have already been inserted into the tree, it is difficult to calculate the exact probability that an edge is the next edge of the tree, since it depends on the numbers of vertices in the subtrees of the forest containing the first k edges. But this probability is lowest in case all k edges are in the same subtree. Thus we can obtain an upper bound for the average by assuming that the k edges generate only one subtree of k+1 vertices. The number of edges closing a cycle is then $\binom{k+1}{2} - k = \binom{k}{2}$, and the probability that an edge can be inserted into the tree is bounded below by $p_k = 1 - \binom{k}{2}/\binom{n}{2}$. Note that this is the probability even in case the graph is not complete. Actually, even in the case of one subtree the probability is higher, since some edges connecting vertices of the tree might already have been scanned before.
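The priority-queue implementation whose average we are bounding can be sketched as follows; the code is ours, the union-find structure stands in for the Union-Merge algorithm of [7], [16], and the extraction count returned alongside the tree is the total number of edges taken out of the priority queue, which is analyzed next.

```python
import heapq

def kruskal_mst(adj):
    """Heap-based Kruskal: repeatedly extract the shortest remaining edge
    and insert it unless it closes a cycle (checked with union-find).
    Returns the tree edges and the number of heap extractions."""
    parent = {v: v for v in adj}

    def find(v):                       # union-find with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    heap = [(l, u, v) for u in adj for v, l in adj[u] if u < v]
    heapq.heapify(heap)                # O(e) heap construction
    tree, extracted = [], 0
    while heap and len(tree) < len(adj) - 1:
        l, u, v = heapq.heappop(heap)
        extracted += 1
        ru, rv = find(u), find(v)
        if ru != rv:                   # edge does not close a cycle
            parent[ru] = rv
            tree.append((u, v, l))
    return tree, extracted
```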
Let $I_k$ denote the number of edges scanned, while the tree already has k edges, until the (k+1)-th edge is inserted into the tree. The total number of edges taken out of the priority queue is

$$I = \sum_{k=0}^{n-2} I_k ,$$

and

$$E(I) = \sum_{k=0}^{n-2} E(I_k) \le \sum_{k=0}^{n-2} \sum_{i=1}^{\infty} i\, p_k (1-p_k)^{i-1} = \sum_{k=0}^{n-2} \frac{1}{p_k} = \sum_{k=0}^{n-2} \frac{\binom{n}{2}}{\binom{n}{2} - \binom{k}{2}} = \sum_{k=0}^{n-2} \frac{1}{1 - \frac{k^2-k}{n^2-n}} = \sum_{k=0}^{n-2} \sum_{i=0}^{\infty} \left(\frac{k^2-k}{n^2-n}\right)^{i} = \sum_{i=0}^{\infty} n^{-i}(n-1)^{-i} \sum_{k=0}^{n-2} (k^2-k)^{i} \le \sum_{i=0}^{\infty} n^{-i}(n-1)^{-i} \sum_{k=0}^{n-2} k^{2i}.$$

Now let us use the following approximation (see for example [14]):

$$\sum_{k=1}^{n} k^{t} = \frac{n^{t+1}}{t+1} + O(n^{t}).$$

We shall use the first term and later show that the second term contributes a lower order term. Thus

$$E(I) \le \sum_{i=0}^{\infty} \frac{(n-2)^{2i+1}}{(2i+1)\, n^{i}(n-1)^{i}} = n \sum_{i=0}^{\infty} \frac{1}{2i+1}\cdot\frac{n-2}{n}\left(\frac{(n-2)^2}{n(n-1)}\right)^{i} \le n \sum_{i=0}^{\infty} \frac{1}{i+1}\left(\frac{n-2}{n}\right)^{i+1}.$$

Denote $q = \frac{n-2}{n}$. Then

$$E(I) \le n \sum_{i=0}^{\infty} \frac{q^{i+1}}{i+1} = n \sum_{i=0}^{\infty} \int_0^q x^{i}\, dx = n \int_0^q \sum_{i=0}^{\infty} x^{i}\, dx = n \int_0^q \frac{dx}{1-x} = -n \ln(1-q) = -n \ln\left(1 - \frac{n-2}{n}\right) = n \ln \frac{n}{2}.$$

The contribution of the second term of the approximation is

$$O\left(\sum_{i=0}^{\infty} n^{-i}(n-1)^{-i}(n-2)^{2i}\right) \le O\left(\sum_{i=0}^{\infty} \left(\frac{n-2}{n}\right)^{i}\right) = O\left(\frac{1}{1 - \frac{n-2}{n}}\right) = O(n),$$

which is of lower order.

Each extraction of a shortest edge from the priority queue takes at most O(lg e) time. Thus the average time required by Kruskal's algorithm is bounded by O(n ln n lg e). Prim's MST algorithm's average behavior was of the same order, so these two algorithms are competitive in both worst case and average behavior.

Acknowledgment

I wish to thank Shmuel Zaks for simplifying the bound on the average number of vertices scanned in BFS.

References

[1] Bloniarz, P. A., M. J. Fischer and A. R. Meyer, "A note on the average time to compute transitive closures," Proc. of the 3rd Int. Colloquium on Automata, Languages and Programming, S. Michaelson and R. Milner (eds.), July 1976.

[2] Cheriton, D. and R. E. Tarjan, "Finding minimum spanning trees," SIAM J. Comput., 5(1976), 724-742.

[3] Dantzig, G. B., Linear Programming and Extensions, Princeton University Press, Princeton, 1963, 363-366.

[4] Dijkstra, E. W., "A note on two problems in connexion with graphs," Numer. Math., 1(1959), 269-271.

[5] Dinic, E. A., "Algorithm for solution of a problem of maximum flow in a network with power estimation," Sov. Math. Dokl., 11(1970), 1277-1280.

[6] Erdos, P. and A. Renyi, "On random graphs I," Publicationes Mathematicae, 6(1959), 290-297.

[7] Hopcroft, J. E. and J. D. Ullman, "Set merging algorithms," SIAM J. Comput., 2(1973), 294-303.

[8] Johnson, D. B., "Priority queues with update and finding minimum spanning trees," Info. Proc. Let., 4(1975), 53-57.

[9] Johnson, D. B., "Algorithms for shortest paths," Ph.D. Thesis, Cornell University, 1973.

[10] Johnson, E. L., "On shortest paths and sorting," Proc. ACM 25th Annual Conference, August 1972, Boston, Vol. 1, 510-517.

[11] Kruskal, J. B., "On the shortest spanning subtree of a graph and the traveling salesman problem," Proc. Amer. Math. Soc., 7(1956), 48-50.

[12] Kerschenbaum, A. and R. Van Slyke, "Computing minimum spanning trees efficiently," Proc. ACM 25th Annual Conference, August 1972, Boston, Vol. 1, 518-527.

[13] Prim, R. C., "Shortest connection networks and some generalizations," Bell Sys. Tech. J., 36(1957), 1389-1401.

[14] Reingold, E. M., J. Nievergelt and N. Deo, Combinatorial Algorithms: Theory and Practice, Prentice Hall, Englewood Cliffs, N.J., 1977.

[15] Spira, P. M.,
"A new algorithm for finding all shortest paths in a graph of positive arcs in average time O(n^2 log^2 n)," SIAM J. Comput., 2(1973), 28-32.

[16] Tarjan, R. E., "Efficiency of a good but not linear set union algorithm," JACM, 22(1975), 215-225.

[17] Yao, A. C. C., "An O(|E| log log |V|) algorithm for finding minimum spanning trees," Info. Proc. Let., 4(1975), 21-23.