UIUCDCS-R-74-632

Bounds on Scheduling Algorithms for Heterogeneous Computing Systems

by Jane W. S. Liu and C. L. Liu

June 1974

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois

This work was supported in part by the National Science Foundation under Grants No. US NSF GJ-36265 and GJ-41538.

1. INTRODUCTION

Recent progress in hardware technology and computer architecture has led to the design and construction of computer systems that contain a large number of processors. Because such systems can execute several tasks simultaneously, the problem of job scheduling in a multiprocessor system is of both theoretical and practical interest. Several authors have designed scheduling algorithms to produce schedules which minimize the total execution time of a given set of tasks and thus achieve optimal utilization of the processors [1-3]. Unfortunately, such algorithms are known only for some special cases. Furthermore, in many instances, the algorithm that produces optimal schedules is so complex that the reduction in the total execution time of a set of tasks is offset by the computation time required to determine an optimal schedule.
For this reason, simple algorithms that produce only sub-optimal schedules are often used in practice. Such a choice becomes even more appealing when the performance of the simple algorithms producing sub-optimal schedules can be compared quantitatively with that of algorithms producing optimal schedules. For this reason, lower bounds on the performance of simple algorithms have been studied extensively [4, 5].

In previous works on job scheduling, a multiprocessor computing system is modelled as one containing identical processors. We introduce here a more general model in which different processors have different computation speeds. This model is a realistic one when we consider the possibilities of replacing one or more of the processors in an existing system by faster processors and of interconnecting different computers in an installation. It will be described precisely in Section 2. In Section 3, lower bounds on the performance of a class of simple nonpreemptive algorithms are derived. These results will also provide us with information concerning the relative merit of different computing systems and the trade-off between the speeds and the number of processors in a multiprocessor system.

Algorithms which produce preemptive schedules with minimum total execution time have been found in some special cases [3] for systems containing identical processors. Algorithms which produce optimal schedules of independent tasks when the multiprocessor systems contain different processors are described in Section 4. The performance of preemptive scheduling algorithms is compared to that of the nonpreemptive scheduling algorithms studied in Section 3. An algorithm which produces schedules with minimum mean flow time is described in Section 5. Performances of different computing systems, using mean flow time as a criterion, are compared.

2. GENERAL MODEL

2.1 A Model of Heterogeneous Computing System

By a heterogeneous computing system, we mean a multiprocessor system in which processors have different computing speeds. We measure the speed of a processor against that of a "standard" processor whose speed is considered as 1. The speed of a processor is said to be $b$ if it is $b$ times as fast as a standard processor. Without loss of generality, we shall assume that $b \ge 1$ throughout our discussion.

Let us denote a multiprocessor system which contains $n_1$ processors of speed $b_1$, $n_2$ processors of speed $b_2$, ..., $n_k$ processors of speed $b_k$ by $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$. Furthermore, the $N$ processors will be referred to individually as processors $P_i$ for $i = 1, 2, \ldots, N$ where $N = n_1 + n_2 + \cdots + n_k$. In particular, the $n_1$ processors of speed $b_1$ are referred to as $P_1, P_2, \ldots, P_{n_1}$; the $n_2$ processors of speed $b_2$ are referred to as $P_{n_1+1}, P_{n_1+2}, \ldots, P_{n_1+n_2}$, etc. For example, the system $\mathcal{P} = (1, 3; 2, 1)$ contains four processors $P_1$, $P_2$, $P_3$, and $P_4$ whose speeds are 2, 1, 1, and 1, respectively.

2.2 Definitions and Notations

Let $\mathcal{T} = \{T_1, T_2, \ldots, T_m\}$ denote a set of tasks to be processed by a system $\mathcal{P}$. The execution time of a task is defined as the time required to complete the task on a standard processor. We shall denote the execution time of the task $T_i$ by $\mu(T_i)$ where $\mu$ is a function from $\mathcal{T}$ to the reals. In other words, $\mu$ is a function that specifies the execution times of the tasks in $\mathcal{T}$. Furthermore, we suppose that there is a precedence relation $<$ defined over $\mathcal{T}$. That $T_i < T_j$ (read: $T_i$ precedes $T_j$, or $T_j$ follows $T_i$) shall mean that $T_j$ cannot begin to be executed before the execution of $T_i$ is completed. A task is said to be executable at a certain time if the execution of all tasks preceding it has been completed. Formally, a set of tasks is specified by an ordered triple $(\mathcal{T}, \mu, <)$.
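The model of Sections 2.1 and 2.2 can be written down directly as data. The following is a minimal sketch, not part of the original report: the class names `System` and `TaskSet`, the field names, and the three-task example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class System:
    """P = (n_1, ..., n_k; b_1, ..., b_k): counts[i] processors of speed speeds[i]."""
    counts: Tuple[int, ...]
    speeds: Tuple[float, ...]

    def processor_speeds(self) -> List[float]:
        """Speeds of P_1, ..., P_N, listed group by group as in Section 2.1."""
        return [b for n, b in zip(self.counts, self.speeds) for _ in range(n)]


@dataclass
class TaskSet:
    """The triple (T, mu, <): mu maps task names to execution times on a
    standard processor; precedes holds the pairs (T_i, T_j) with T_i < T_j."""
    mu: Dict[str, float]
    precedes: Set[Tuple[str, str]]

    def executable(self, task: str, completed: Set[str]) -> bool:
        """A task is executable once every task preceding it has completed."""
        return all(i in completed for i, j in self.precedes if j == task)


# The example system of Section 2.1.
example_system = System(counts=(1, 3), speeds=(2, 1))

# A hypothetical three-task set with T1 < T3 (not taken from Figure 2-1).
example_tasks = TaskSet(mu={"T1": 3, "T2": 2, "T3": 1}, precedes={("T1", "T3")})
```

Here `example_system.processor_speeds()` lists the four speeds of the system $(1, 3; 2, 1)$, and `example_tasks.executable("T3", {"T1"})` holds because the only task preceding `T3` has completed.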
We also describe a set of tasks $(\mathcal{T}, \mu, <)$ by a directed graph whose vertices correspond to the tasks and are labeled by the names, $T_j$, and their execution times, $\mu(T_j)$. There is a directed edge from $T_i$ to $T_j$ if $T_i < T_j$. For example, a set of tasks $(\mathcal{T}, \mu, <)$ is shown in Figure 2-1 where $\mathcal{T} = \{T_1, T_2, T_3, T_4, T_5\}$ and $T_i < T_j$ for each directed edge from $T_i$ to $T_j$.

By scheduling a set of tasks on a multiprocessor system, we mean to specify for each task the time interval within which it is to be executed and the processor on which execution will take place. A schedule can be described by a timing diagram (also known as a Gantt chart) such as that shown in Figure 2-2. In this timing diagram, each horizontal line is a time axis and its subdivisions give the sequence of tasks executed on a processor together with the idle periods of the processor. The idle periods are the time intervals within which the processor is not executing any tasks. We use $\varphi_1, \varphi_2, \ldots$ to denote idle periods of the processors as shown in Figure 2-2. With a slight abuse of notation, we use $\mu(\varphi_1), \mu(\varphi_2), \ldots$ to denote the lengths of the idle periods.

The completion time of a schedule is the total time it takes to execute all the tasks according to the schedule. For example, the completion time of the set of tasks $(\mathcal{T}, \mu, <)$ shown in Figure 2-1 according to the schedule shown in Figure 2-2 is 6.5. When the completion time is used as a criterion for comparing different scheduling algorithms, an optimal schedule for a given set of tasks $(\mathcal{T}, \mu, <)$ is one with the minimal completion time. Throughout this report, we assume that the execution of a set of tasks begins at $t = 0$.

[Figure 2-1. A set of tasks; each vertex is labeled $T_j/\mu(T_j)$.]

[Figure 2-2. Timing diagram of a schedule.]

[...]

[...] (3-2)

Also

$$U_2 + I_2 = n\,\omega \qquad (3\text{-}3)$$

To bound the magnitudes of $I_1$ and $I_2$, let us consider an idle period $\varphi_m$ of a certain processor, $P_k$. Let $t_1$ and $t_2$ denote the times at which $\varphi_m$ begins and ends.
We have the following observations: (i) If the execution of a task $T_j$ on another processor begins at $t$ where $t_1 \le t < t_2$, then there must exist a task $T_i$ such that $T_i < T_j$ and the execution of $T_i$ ends at $t$. Otherwise, $T_j$ would have been executed prior to $t$ on processor $P_k$. (ii) If the execution of a task $T_j$ begins at $t$ where $t < t_1$, then there must exist a task $T_i$ such that $T_i < T_j$ and the execution of $T_i$ ends at $t$. Combining these two observations, we conclude that if $\varphi_1, \varphi_2, \ldots, \varphi_l$ are the idle periods of a certain processor, there exists a set of tasks $\mathcal{C} = (T_{i_1}, T_{i_2}, \ldots, T_{i_j})$ such that [...]. It follows that

$$I_1 \le b\,\omega' \qquad (3\text{-}4)$$

Moreover, [...]. Hence,

$$n \sum_{T_i \in \mathcal{C}} \mu(T_i) \ge I_1 + I_2$$

[...]

[...] yields a schedule whose completion time is $n + nb + b^2$ for very small $\varepsilon$ as shown in Figure 3-1a. On the other hand, the priority list

$$L' = (T_{2n+1}, T_{n+1}, T_{n+2}, \ldots, T_{2n}, T_1, T_2, \ldots, T_n)$$

yields a schedule whose completion time is $n + b$ for very small $\varepsilon$ as shown in Figure 3-1b. □

Let us observe that the upper bound of the ratio $\omega/\omega'$ in Theorem 3.1 is approximately equal to $b + 1$ for large $n$. That is, even when there is only one fast processor in a multiprocessor system with many standard processors, the worst case performance of an arbitrary scheduling algorithm when compared to that of an optimal scheduling algorithm still depends primarily on the speed of the fast processor. For a fixed $n$, the upper bound of $\omega/\omega'$ approaches $b$ as a limit when $b$ increases. This fact indicates that a priority-driven schedule may become very inferior for large $b$. As a matter of fact, when $b$ is very large, better performance can often be assured by using only the fast processor to execute all the tasks in $\mathcal{T}$. In that case, the completion time approaches the minimum completion time as a limit for large $b$. We shall return to this point in Sec. 3.3.

The result in Theorem 3.1 can be extended immediately.
[Figure 3-1. (a) Schedule obtained from the first priority list. (b) Schedule obtained from the priority list $L'$.]

Theorem 3.2 Consider a multiprocessor system $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$. Let $\omega$ and $\omega'$ be the completion times of a set of tasks $(\mathcal{T}, \mu, <)$ when it is run on $\mathcal{P}$ according to a priority-driven schedule and an arbitrary schedule, respectively, where $b_1 > b_2 > \cdots > b_k$. We have

$$\frac{\omega}{\omega'} \le \frac{b_1}{b_k} + 1 - \frac{b_1}{\sum_{i=1}^{k} n_i b_i}$$

Moreover, the bound is the best possible.

Proof. In the priority-driven schedule specified by a priority list $L$, let $U_i$ denote the sum of the execution times of all tasks executed on the processors of speed $b_i$. Let $I_i$ denote the sum of the lengths of the idle periods on the processors of speed $b_i$. We have

$$\omega = \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{U_1}{b_1} + \frac{U_2}{b_2} + \cdots + \frac{U_k}{b_k} + I_1 + I_2 + \cdots + I_k\right]$$

$$= \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{1}{b_1}\sum_{i=1}^{k} U_i + \sum_{i=1}^{k}\frac{b_1 - b_i}{b_1}\left(\frac{U_i}{b_i} + I_i\right) + \frac{b_k}{b_1}\sum_{i=1}^{k} I_i + \sum_{i=1}^{k-1}\frac{b_i - b_k}{b_1}\, I_i\right] \qquad (3\text{-}6)$$

Clearly

$$\frac{\sum_{i=1}^{k} U_i}{\sum_{i=1}^{k} n_i b_i} \le \omega' \qquad (3\text{-}7)$$

and

$$\frac{U_i}{b_i} + I_i = n_i\,\omega \qquad (3\text{-}8)$$

Using the same argument presented in the proof of Theorem 3.1, we conclude that the total length of the idle periods of any processor must be equal to or less than $\frac{b_1}{b_k}\omega'$. Hence

$$\frac{b_k I_i}{n_i} \le b_1\,\omega' \qquad (3\text{-}9)$$

Moreover,

$$\sum_{i=1}^{k} I_i \le \left(\sum_{i=1}^{k} n_i - 1\right)\frac{b_1}{b_k}\,\omega'$$

Consequently,

$$\frac{b_k \sum_{i=1}^{k} I_i}{\sum_{i=1}^{k} n_i - 1} \le b_1\,\omega' \qquad (3\text{-}10)$$

Substituting (3-7)-(3-10) into eq. (3-6), we obtain

$$\omega \le \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{\sum_{i=1}^{k} n_i b_i}{b_1}\,\omega' + \sum_{i=1}^{k}\frac{b_1 - b_i}{b_1}\, n_i\,\omega + \frac{b_k}{b_1}\,\frac{b_1}{b_k}\left(\sum_{i=1}^{k} n_i - 1\right)\omega' + \sum_{i=1}^{k-1}\frac{b_i - b_k}{b_1}\,\frac{b_1}{b_k}\, n_i\,\omega'\right]$$

Multiplying both sides of this expression by $b_1 b_k \sum_{i=1}^{k} n_i$, we obtain

$$b_1 b_k \sum_{i=1}^{k} n_i\,\omega \le b_k \sum_{i=1}^{k} (b_1 - b_i)\, n_i\,\omega + \left[b_k \sum_{i=1}^{k} n_i b_i + b_1 b_k\left(\sum_{i=1}^{k} n_i - 1\right) + b_1 \sum_{i=1}^{k-1} (b_i - b_k)\, n_i\right]\omega'$$

which simplifies to

$$\left(b_k \sum_{i=1}^{k} n_i b_i\right)\omega \le \left(b_k \sum_{i=1}^{k} n_i b_i + b_1 \sum_{i=1}^{k} n_i b_i - b_1 b_k\right)\omega'$$

That is,

$$\frac{\omega}{\omega'} \le 1 + \frac{b_1}{b_k} - \frac{b_1}{\sum_{i=1}^{k} n_i b_i}$$

To show that the bound is best possible, we consider a set of tasks $(\mathcal{T}, \mu, <)$ where the precedence relation $<$ is empty and

$$\mathcal{T} = \{T_{rpq} \mid r = 1, 2, \ldots, k;\ q = 1, 2, \ldots, N;\ p = 1, 2, \ldots, n_1 - 1 \text{ for } r = 1 \text{ and } p = 1, 2, \ldots, n_r \text{ for } r \ne 1\} \cup \{T_s \mid s = 1, 2, \ldots, \textstyle\sum_{i=1}^{k-1} n_i + 1\}$$

The execution times of the tasks $T_{rpq}$ are

$$\mu(T_{rpq}) = \begin{cases} b_r b_k b_1 & 1 \le q \le n_1 \\ b_r b_k b_2 & n_1 + 1 \le q \le n_1 + n_2 \\ \quad\vdots \\ b_r b_k b_{k-1} & \sum_{i=1}^{k-2} n_i + 1 \le q \le \sum_{i=1}^{k-1} n_i \\ b_r b_k^2 & \sum_{i=1}^{k-1} n_i + 1 \le q \le \sum_{i=1}^{k} n_i \end{cases}$$

for $r = 1, 2, \ldots, k$ and $p = 1, 2, \ldots, n_1 - 1$ if $r = 1$ and $p = 1, 2, \ldots, n_r$ if $r \ne 1$. The execution times of the tasks $T_s$ are

$$\mu(T_s) = \varepsilon \qquad s = 1, 2, \ldots, \sum_{i=1}^{k-1} n_i$$

$$\mu(T_S) = b_1 b_k \sum_{i=1}^{k} n_i b_i$$

where $\varepsilon$ is a small positive number and $S = \sum_{i=1}^{k-1} n_i + 1$. Let us consider a priority-driven schedule according to a priority list which assigns higher priorities to the tasks $T_{rpq}$ than the tasks $T_s$. Priorities are assigned to tasks $T_{rpq}$ according to the lexicographical order of their subscripts. (That is, $T_{111}$ has the highest priority and $T_{112}$ has the next highest priority, etc.) The priority assignment of the tasks $T_s$ is $(T_1, T_2, \ldots, T_S)$. We obtain the schedule in Figure 3-2a where

$$\omega = b_1 b_k (n_1 - 1) + b_2 b_k n_2 + \cdots + b_k b_k n_k + b_1 \sum_{i=1}^{k} n_i b_i = (b_k + b_1)\sum_{i=1}^{k} n_i b_i - b_1 b_k$$

However, this set of tasks can be scheduled as shown in Figure 3-2b where

$$\omega' = b_k \sum_{i=1}^{k} n_i b_i \qquad \square$$

Let $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$ be a multiprocessor system. The quantity $\sum_{i=1}^{k} n_i b_i$ can be considered as a measure of the total computational capacity of the system. Indeed, the sum $\sum_{i=1}^{k} n_i b_i$ is the maximal throughput of the system. According to the result in Theorem 3.2, the worst case performance of a priority-driven schedule depends mainly on the ratio of the speeds of the fastest and the slowest processors in the system. For systems of large maximal throughput, the bound in Theorem 3.2 approaches $\frac{b_1}{b_k} + 1$.
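The bound of Theorem 3.2 and a small worst-case instance can be checked numerically. The following is a sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, tasks are independent, and when several processors are free at the same instant the one listed first (the fastest) is assumed to take the next task on the priority list. The instance is a two-processor example for $\mathcal{P} = (1, 1; 2, 1)$ with task lengths 2, 1, $\varepsilon$ and 6, in the spirit of the construction above.

```python
from fractions import Fraction


def theorem_3_2_bound(counts, speeds):
    """1 + b_1/b_k - b_1 / sum(n_i * b_i), with b_1 the fastest and b_k the slowest speed."""
    b1, bk = Fraction(max(speeds)), Fraction(min(speeds))
    capacity = sum(n * Fraction(b) for n, b in zip(counts, speeds))
    return 1 + b1 / bk - b1 / capacity


def list_schedule(proc_speeds, exec_times):
    """Completion time of a priority-driven schedule of independent tasks.

    proc_speeds lists processors fastest first; exec_times is in priority
    order. Each task goes to the processor that becomes free earliest
    (ties broken by position in proc_speeds, an assumption of this sketch).
    """
    free_at = [Fraction(0)] * len(proc_speeds)
    for mu in exec_times:
        p = min(range(len(proc_speeds)), key=lambda i: (free_at[i], i))
        free_at[p] += Fraction(mu) / proc_speeds[p]
    return max(free_at)


eps = Fraction(1, 1000)
# Bad priority list: the long task is dispatched last and lands on the slow processor.
omega = list_schedule([2, 1], [2, 1, eps, 6])
# Better list: the long task goes first, onto the fast processor.
omega_better = list_schedule([2, 1], [6, 2, 1, eps])
bound = theorem_3_2_bound((1, 1), (2, 1))
```

With the first list the long task starts at time 1 on the speed-1 processor, and the ratio `omega / omega_better` comes close to `bound` as `eps` shrinks, which is the behavior the best-possible claim describes.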
The limiting value $\frac{b_1}{b_k} + 1$ implies that in systems where the speed of the fastest processors is significantly larger than that of the slowest processors, the worst case performance of a priority-driven schedule may become very poor in comparison with that of an optimal schedule.

[Figure 3-2. (a) The priority-driven schedule with completion time $\omega$. (b) A schedule of the same set of tasks with completion time $\omega'$.]

3.2 Performance of Some Heuristic Scheduling Algorithms

Theorems 3.1 and 3.2 establish a lower bound on the performance of priority-driven scheduling algorithms. Between the two extremes of using an optimally chosen priority list and of using a completely arbitrary priority list, there is the possibility of using priority lists obtained by simple heuristic procedures. Such a possibility offers a good compromise between the performance of the resultant scheduling algorithm and the computational cost of determining the priority list. In this section, we shall consider the special case in which the precedence relation $<$ is empty. We denote such a set of tasks by $(\mathcal{T}, \mu)$.

Let us consider the problem of scheduling a set of tasks $(\mathcal{T}, \mu)$ on the multiprocessor system $\mathcal{P} = (1, n; b, 1)$. Suppose that we choose the $r$ longest tasks in the set $\mathcal{T}$ and schedule them according to a priority list $L_r$. Let $\omega_r$ denote the corresponding completion time when they are executed on $\mathcal{P}$. Let $L$ denote a priority list obtained by appending to $L_r$ an assignment of priorities to the remaining tasks. Let $\omega$ denote the corresponding completion time. When $\omega > \omega_r$, we have,

Theorem 3.3

$$\frac{\omega}{\omega'} \le 1 + \frac{1}{M} - \frac{1}{(b+n)M}$$

where

$$M = \max\left\{\min\left([\ldots]\right),\ \frac{r+1}{n+b}\right\}$$

and $\omega'$ is the completion time according to an arbitrary schedule. This bound is the best possible when $b$ is an integer and $n+b$ divides $r$.

Proof. Let $T_1, T_2, \ldots, T_r$ denote the $r$ longest tasks and $T_{r+1}$ denote the $(r+1)$st longest task in $\mathcal{T}$. Since $\omega > \omega_r$, tasks $T_1, T_2, \ldots, T_r$ are completed before $t = \omega$.
Moreover, since each processor can have at most one idle period, immediately prior to $t = \omega$, the idle period of the fast processor is equal to or less than $\mu(T_{r+1})$. The idle period of any slow processor is equal to or less than $\frac{1}{b}\mu(T_{r+1})$ if the fast processor is not idle at $t = \omega$ and is equal to or less than $\mu(T_{r+1})$ if the fast processor is idle at $t = \omega$. Now,

$$\omega = \frac{1}{n+1}\left[\frac{U_1}{b} + U_2 + I_1 + I_2\right] = \frac{1}{n+1}\left[\frac{1}{b}(U_1 + U_2) + \frac{b-1}{b}(U_2 + I_2) + \frac{1}{b} I_2 + I_1\right] \qquad (3\text{-}11)$$

where $I_1$, $I_2$, $U_1$ and $U_2$ are as defined in the proof of Theorem 3.1. We note that if $I_1 = 0$,

$$I_2 \le \frac{n}{b}\,\mu(T_{r+1})$$

and if $I_1 \ne 0$, [...]

[...] or

$$\frac{r+1}{n+b}\,\mu(T_{r+1}) \le \omega' \qquad (3\text{-}14)$$

Combining (3-12)-(3-14), we write

$$M\,\mu(T_{r+1}) \le \omega'$$

where

$$M = \max\left\{\min\left([\ldots]\right),\ \frac{r+1}{n+b}\right\}$$

Equation (3-11) becomes

$$\omega \le \frac{1}{n+1}\left[\frac{n+b}{b}\,\omega' + \frac{b-1}{b}\, n\,\omega + \frac{n+b-1}{bM}\,\omega'\right]$$

which simplifies to

$$\frac{\omega}{\omega'} \le 1 + \frac{1}{M} - \frac{1}{(n+b)M}$$

To show that the bound is best possible, let us note that for $b$ and $\frac{r}{n+b}$ being integers, the bound becomes

$$\frac{\omega}{\omega'} \le \frac{rb + n + b(n+b)}{rb + n + b} \qquad (3\text{-}15)$$

Let $\mathcal{T} = (T_1, T_2, \ldots, T_{r+n(n+b)+1})$ and

$$\mu(T_i) = (n+b)\,b \qquad i = 1, 2, \ldots, r, r+1$$
$$\mu(T_i) = 1 \qquad i = r+2, \ldots, r + n(n+b)$$
$$\mu(T_i) = 1 + \varepsilon \qquad i = r + n(n+b) + 1$$

where $\varepsilon$ is a small positive number. The tasks $T_1, T_2, \ldots, T_r$ can be scheduled by the priority list $(T_1, T_2, \ldots, T_r)$. If we use the priority list

$$(T_1, T_2, \ldots, T_r, T_{r+1}, T_{r+2}, \ldots, T_{r+n(n+b)+1})$$

we obtain the schedule in Figure 3-3a with completion time

$$\omega' = rb + n + b$$

If we use the priority list

$$(T_1, T_2, \ldots, T_r, T_{r+2}, T_{r+3}, \ldots, T_{r+n(n+b)}, T_{r+n(n+b)+1}, T_{r+1})$$

the completion time is

$$\omega = rb + n + b(n+b)$$

as shown in Figure 3-3b. □

[Figure 3-3. (a) Schedule with completion time $rb + n + b$. (b) Schedule with completion time $rb + n + b(n+b)$.]

When $b$ and $\frac{r}{n+b}$ are integers and when the maximal throughput of the system, $n+b$, is large, the upper bound of $\frac{\omega}{\omega'}$ given by eq. (3-15) becomes approximately

$$1 + \frac{b}{1 + \frac{rb}{n+b}}$$

Hence, the ratio of $\omega$ to the minimum completion time of the tasks in $\mathcal{T}$ is upper bounded by $1 + \frac{n+b}{r}$ when the minimum completion time of the $r$ longest tasks is less than $\omega$.

We now generalize Theorem 3.3 to the problem of scheduling a set of tasks $(\mathcal{T}, \mu)$ on a multiprocessor system $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$. Again, let us choose the $r$ longest tasks in $\mathcal{T}$. Let $\omega_r$ denote the completion time of these tasks when they are executed on $\mathcal{P}$ according to a priority list $L_r$. Let $L$ denote the priority list obtained by appending to $L_r$ an arbitrary assignment of priorities to the remaining tasks in $\mathcal{T}$. Let $\omega$ denote the corresponding completion time, and let $\omega'$ denote the completion time according to an arbitrary schedule. When $\omega > \omega_r$ and $b_k = 1$, we have,

Theorem 3.4

$$\frac{\omega}{\omega'} \le 1 + \frac{1}{Q} - \frac{1}{Q\sum_{i=1}^{k} n_i b_i}$$

where

$$Q = \max\left\{\min_{\ell}\left([\ldots]\right),\ \frac{r+1}{\sum_{i=1}^{k} n_i b_i}\right\}$$

[...]

Thus eq. (3-16) becomes

$$\omega \le \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{\sum_{i=1}^{k} n_i b_i}{b_1}\,\omega' + \sum_{i=1}^{k}\frac{b_1 - b_i}{b_1}\, n_i\,\omega + \frac{\sum_{i=1}^{k} n_i b_i - 1}{b_1}\,\mu(T_{r+1})\right] \qquad (3\text{-}19)$$

We note that in any schedule, there must be [...] of the $r+1$ tasks $T_1, T_2, \ldots, T_{r+1}$ being executed on a processor of speed $b_\ell$ for some $\ell$, $1 \le \ell \le k$. Consequently, we have

$$\omega' \ge [\ldots]\ \mu(T_{r+1}) \qquad (3\text{-}20)$$

for some $\ell = 1, 2, \ldots, k$. Since

$$(r+1)\,\mu(T_{r+1}) \le \left(\sum_{i=1}^{k} n_i b_i\right)\omega'$$

and eq. (3-20), we write

$$Q\,\mu(T_{r+1}) \le \omega'$$

where $Q$ is as defined in the statement of the theorem. Equation (3-19) becomes

$$\omega \le \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{\sum_{i=1}^{k} n_i b_i}{b_1}\,\omega' + \sum_{i=1}^{k}\frac{b_1 - b_i}{b_1}\, n_i\,\omega + \frac{\sum_{i=1}^{k} n_i b_i - 1}{b_1 Q}\,\omega'\right]$$

which simplifies to

$$\left(\sum_{i=1}^{k} n_i b_i\right)\omega \le \left[\sum_{i=1}^{k} n_i b_i + \frac{\sum_{i=1}^{k} n_i b_i - 1}{Q}\right]\omega'$$

That is

$$\frac{\omega}{\omega'} \le 1 + \frac{1}{Q} - \frac{1}{Q\sum_{i=1}^{k} n_i b_i}$$

For integer values of $b_i$ and when $\sum_{i=1}^{k} n_i b_i$ divides $r$, the bound becomes

$$\frac{\omega}{\omega'} \le \frac{r b_1 + (b_1 + 1)\sum_{i=1}^{k} n_i b_i - b_1}{r b_1 + \sum_{i=1}^{k} n_i b_i} \qquad (3\text{-}21)$$

and is best possible. That the bound in eq. (3-21) is best possible can be demonstrated by an example similar to the one shown in Figure 3-3.

Another simple heuristic for assigning priorities to the tasks in $(\mathcal{T}, \mu)$ is to assign higher priorities to longer tasks. Let $\omega$ denote the completion time when a set of tasks $(\mathcal{T}, \mu)$ is executed on the multiprocessor system $\mathcal{P} = (1, n; b, 1)$ using a priority assignment according to the lengths of the tasks. Let $\omega'$ be the completion time of an arbitrary schedule. We have

Theorem 3.5

$$\frac{\omega}{\omega'} \le \frac{2(n+b)}{b+2}, \qquad b \le 2$$

and

$$\frac{\omega}{\omega'} \le \frac{n+b}{2}, \qquad b > 2$$

Proof. Consider the priority-driven schedule corresponding to the priority assignment according to the lengths of the tasks. Let $U_1$ and $U_2$ be the sums of the execution times of the tasks run on the fast processor and the standard processors, respectively. Let $I_1$ and $I_2$ denote the sums of the lengths of the idle periods of the fast processor and the standard processors, respectively. The completion time is

$$\omega = \frac{1}{n+1}\left[\frac{U_1}{b} + U_2 + I_1 + I_2\right] \qquad (3\text{-}22)$$

We consider two cases:

Case 1. $I_1 = 0$, that is, the fast processor is never idle.† In this case,

$$\omega = \frac{1}{n+1}\left[\frac{U_1}{b} + U_2 + I_2\right] = \frac{1}{n+1}\left[\frac{1}{b}(U_1 + U_2) + \frac{b-1}{b}(U_2 + I_2) + \frac{1}{b} I_2\right] \qquad (3\text{-}23)$$

where

$$\frac{1}{b}(U_1 + U_2) \le \frac{n+b}{b}\,\omega' \qquad (3\text{-}24)$$

and

$$U_2 + I_2 = n\,\omega \qquad (3\text{-}25)$$

† Note that the case $I_1 = I_2 = 0$ need not be of concern to us since when $I_1 = I_2 = 0$, we have $\omega = \omega'$.

Let us note that if the fast processor executes only one job according to this schedule, then $\omega \le \omega'$. Therefore, we can assume that the fast processor executes two or more jobs.
Let $T_r$ be the last job executed in the fast processor and $\varphi_s$ be the longest idle period among the idle periods in the slow processors as illustrated in Figure 3-4. Since

$$\frac{1}{b}\,\mu(T_r) \ge \mu(\varphi_s)$$

and

$$\mu(T_r) \le \omega - \mu(\varphi_s)$$

we have

$$\omega \ge \mu(T_r) + \mu(\varphi_s) \ge (b+1)\,\mu(\varphi_s)$$

Moreover, $I_2 \le n\,\mu(\varphi_s)$. Hence

$$n\,\omega \ge (b+1)\, I_2$$

Substituting this expression and eqs. (3-24) and (3-25) into eq. (3-23), we obtain

$$\omega \le \frac{1}{n+1}\left[\frac{n+b}{b}\,\omega' + \frac{b-1}{b}\, n\,\omega + \frac{n\,\omega}{b(b+1)}\right]$$

which simplifies to

$$\frac{\omega}{\omega'} \le \frac{(b+1)(n+b)}{b(n+b+1)} \qquad (3\text{-}26)$$

[Figure 3-4. (a) Case 1: $T_r$ is executed on the fast processor. (b) Case 2: $T_r$ is executed on a slow processor.]

Case 2. $I_1 \ne 0$, that is, at least one of the slow processors is never idle. Suppose that a processor that is never idle is $P_j$. We consider the two possibilities, $b > 2$ and $b \le 2$. For $b \le 2$, $I_1 > \frac{\omega}{2}$ implies that at most one task is executed on processor $P_j$. Consequently, the first task executed in the fast processor is not the longest task, which contradicts the assumption that higher priorities are assigned to longer tasks. Therefore, for $b \le 2$, $I_1 \le \frac{\omega}{2}$. Thus, eq. (3-22) becomes

$$\omega = \frac{1}{n+1}\left[\frac{1}{b}(U_1 + U_2) + \frac{b-1}{b}(U_2 + I_2) + \frac{1}{b} I_2 + I_1\right] \le \frac{1}{n+1}\left[\frac{n+b}{b}\,\omega' + \frac{b-1}{b}\, n\,\omega + \frac{n-1}{b}\,\omega + \frac{\omega}{2}\right]$$

That is

$$\frac{\omega}{\omega'} \le \frac{2(n+b)}{b+2}$$

Similarly, we claim that $I_1 \le \frac{b-1}{b}\,\omega$ for $b > 2$. (If $I_1 > \frac{b-1}{b}\,\omega$, each slow processor executes at most one job. Let $T_j$ be the task whose execution time $\mu(T_j)$ is equal to $\omega$. Let $T_1$ be the longest task in $\mathcal{T}$. Since $T_1$ is executed in the fast processor and [...])

[...]

Since for $n \ge 1$ and $b \le 2$,

$$\frac{(b+1)(n+b)}{b(n+b+1)} \le \frac{2(n+b)}{b+2}$$

the bound for $b \le 2$ holds in both cases. Similarly, we have for $b > 2$

$$\frac{\omega}{\omega'} \le \frac{n+b}{2}$$

Since for $n \ge 1$ and $b > 2$

$$\frac{(b+1)(n+b)}{b(n+b+1)} \le \frac{n+b}{2}$$

the theorem follows. □

3.3 Scheduling on Different Systems

We now extend our discussion to the case of executing a set of tasks on different multiprocessor systems. Let $(\mathcal{T}, \mu, <)$ be a given set of tasks and $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$ and $\mathcal{P}' = (n'_1, n'_2, \ldots, n'_{k'}; b'_1, b'_2, \ldots, b'_{k'})$ be two multiprocessor systems. Let $\omega$ be the completion time when $(\mathcal{T}, \mu, <)$ is executed on $\mathcal{P}$ according to a priority-driven schedule. Let $\omega'$ be the completion time when $(\mathcal{T}, \mu, <)$ is executed on $\mathcal{P}'$ according to an arbitrary schedule. Extending the result in Theorem 3.2, we obtain

Theorem 3.6

$$\frac{\omega}{\omega'} \le \frac{b'_1}{b_k} + \frac{\sum_{i=1}^{k'} n'_i b'_i}{\sum_{i=1}^{k} n_i b_i} - \frac{b'_1}{\sum_{i=1}^{k} n_i b_i}$$

Moreover, the bound is the best possible.

Proof. The proof of this theorem is similar to that of Theorem 3.2 and will only be outlined here. As was discussed in the proof of Theorem 3.2, the completion time of the priority-driven schedule is given by eq. (3-6). Again, we have

$$\frac{U_i}{b_i} + I_i = n_i\,\omega$$

and

$$\frac{\sum_{i=1}^{k} U_i}{\sum_{i=1}^{k'} n'_i b'_i} \le \omega' \qquad (3\text{-}27)$$

Similar to eqs. (3-9) and (3-10), the sum of the lengths of the idle periods of any processor must be such that

$$\frac{b_k I_i}{n_i} \le b'_1\,\omega' \qquad (3\text{-}28)$$

and

$$\frac{b_k \sum_{i=1}^{k} I_i}{\sum_{i=1}^{k} n_i - 1} \le b'_1\,\omega' \qquad (3\text{-}29)$$

Thus, eq. (3-6) becomes

$$\omega \le \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{\sum_{i=1}^{k'} n'_i b'_i}{b_1}\,\omega' + \sum_{i=1}^{k}\frac{b_1 - b_i}{b_1}\, n_i\,\omega + \frac{b_k}{b_1}\,\frac{b'_1}{b_k}\left(\sum_{i=1}^{k} n_i - 1\right)\omega' + \sum_{i=1}^{k-1}\frac{b_i - b_k}{b_1}\,\frac{b'_1}{b_k}\, n_i\,\omega'\right]$$

That is

$$b_k \sum_{i=1}^{k} n_i b_i\,\omega \le \left(b_k \sum_{i=1}^{k'} n'_i b'_i + b'_1 \sum_{i=1}^{k} n_i b_i - b'_1 b_k\right)\omega'$$

or

$$\frac{\omega}{\omega'} \le \frac{b'_1}{b_k} + \frac{\sum_{i=1}^{k'} n'_i b'_i}{\sum_{i=1}^{k} n_i b_i} - \frac{b'_1}{\sum_{i=1}^{k} n_i b_i}$$

The example in Figure 3-5 demonstrates that this bound is best possible. □

Let us note that if $\sum_{i=1}^{k'} n'_i b'_i = \sum_{i=1}^{k} n_i b_i$, the bound in Theorem 3.6 becomes

$$\frac{\omega}{\omega'} \le \frac{b'_1}{b_k} + 1 - \frac{b'_1}{\sum_{i=1}^{k} n_i b_i} \qquad (3\text{-}30)$$

That is, for two multiprocessor systems with the same maximal throughput, the bound is mainly a function of $b'_1$ and $b_k$. Thus, if we hold the values of $b'_1$ and $b_k$ fixed, we can trade a smaller number of fast processors for a larger number of slow processors or conversely without changing the bound on the worst case performance of priority-driven scheduling algorithms. For example, suppose $b'_1$ is approximately equal to $b_k$ and $\sum_{i=1}^{k} n_i b_i$ is very large; then the ratio $\frac{\omega}{\omega'}$ is approximately upper bounded by 2. In this case, $\mathcal{P}$ is a system containing a small number of fast processors and $\mathcal{P}'$ is a system containing a large number of slow processors.
The result in (3-30) says that any arbitrary priority-driven schedule for $\mathcal{P}$ is never worse than the best possible schedule for $\mathcal{P}'$ by more than a factor of 2. This indeed is a quantitative confirmation of the fact that a processor of speed $b$ is more desirable than $b$ slow processors of speed 1.

[Figure 3-5. (a), (b) Schedules showing that the bound of Theorem 3.6 is best possible.]
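The bound of Theorem 3.6 is simple to evaluate for concrete pairs of systems. The following is a minimal sketch, assuming the bound reads $b'_1/b_k + \sum n'_i b'_i / \sum n_i b_i - b'_1 / \sum n_i b_i$ as in the derivation above; the function names and the two example systems are hypothetical and not taken from the report.

```python
from fractions import Fraction


def capacity(counts, speeds):
    """Maximal throughput sum(n_i * b_i) of a system (n_1..n_k; b_1..b_k)."""
    return sum(n * Fraction(b) for n, b in zip(counts, speeds))


def theorem_3_6_bound(counts, speeds, counts_p, speeds_p):
    """Bound on omega/omega', where omega is a priority-driven schedule on
    P = (counts; speeds) and omega' is any schedule on P' = (counts_p; speeds_p)."""
    cap, cap_p = capacity(counts, speeds), capacity(counts_p, speeds_p)
    b1_p = Fraction(max(speeds_p))   # fastest speed in P'
    bk = Fraction(min(speeds))       # slowest speed in P
    return b1_p / bk + cap_p / cap - b1_p / cap


# Equal throughput (16 each): a few fast processors versus many slow ones.
few_fast = theorem_3_6_bound((2, 8), (4, 1), (16,), (1,))
# With P' = P the expression reduces to the bound of Theorem 3.2.
same_system = theorem_3_6_bound((1, 3), (2, 1), (1, 3), (2, 1))
```

In the first example $b'_1 = b_k = 1$ and both systems have throughput 16, so the value falls just below 2, which matches the remark following eq. (3-30).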
There is another interesting interpretation of the result in Theorem 3.6. Suppose that we are to execute a set of tasks on the system $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$. Instead of using all the processors in the system, one might choose only to use the $n_1$ fastest processors. Let $\omega_1$ denote the completion time when a set of tasks $(\mathcal{T}, \mu, <)$ is executed on the $n_1$ fastest processors according to a priority-driven schedule and let $\omega'$ be the completion time when the set $(\mathcal{T}, \mu, <)$ is executed on $\mathcal{P}$ according to an arbitrary schedule. According to Theorem 3.6, we have

$$\frac{\omega_1}{\omega'} \le \frac{1}{n_1 b_1}\sum_{i=1}^{k} n_i b_i + 1 - \frac{1}{n_1}$$

To compare the upper bound of $\frac{\omega_1}{\omega'}$ with the upper bound of $\frac{\omega}{\omega'}$ given in Theorem 3.2, we plot in Figure 3-6 the values of $b_1$ for which the upper bound of $\frac{\omega}{\omega'}$ becomes larger than that of $\frac{\omega_1}{\omega'}$ for fixed values of $n_1$ and $\sum_{i=1}^{k} n_i b_i$. Without loss of generality, assume that $b_k = 1$. When $b_1$ is larger than that shown in Figure 3-6, one is assured of an improvement in the worst case performance of a priority-driven scheduling algorithm by using only the fastest processors in the system.

[Figure 3-6. Horizontal axis: maximal system throughput $\sum_{i=1}^{k} n_i b_i$.]

Another interesting problem is to compare the execution of a set of tasks on two different multiprocessor systems using a priority-driven schedule specified by the same priority list. This is a realistic situation when some of the processors in an existing system are replaced by processors of different speed yet the priorities assigned to the tasks to be executed are not changed. To illustrate the point, we investigate a special case in which a set of tasks $(\mathcal{T}, \mu)$ is to be executed on two multiprocessor systems $\mathcal{P} = (n+1; 1)$ and $\mathcal{P}' = (1, n; b, 1)$ according to the same priority list $L$. Let $\omega$ and $\omega'$ denote the respective completion times.

Theorem 3.7 The ratio $\frac{\omega}{\omega'}$ is bounded by

$$\frac{\omega}{\omega'} \le \begin{cases} \dfrac{2(n+b)}{n+2} & b \le 2 \\[2mm] b & b > 2 \end{cases}$$

Proof. Suppose that there are $n+1$ or fewer tasks in $\mathcal{T}$.
Clearly, " = n(T x ) and W >^ u(T i ) where T- is a task with the longest execution time. It follows that w -. < b U) 1 — Suppose that there are more than n+1 tasks in J . Let U denote the sum of the execution times of the tasks in 7 and let I denote the sum of the lengths of idle periods of all processors when {tF, u) is executed on the system (P = (n+1, l). Clearly, CO = FT1 ^ U+I ) (3-3D Let T be the last task on a processor which is never idle in the time interval (0, co). The length of the idle period of any processor must be equal to or less than |i(T ). Hence I < n ti(T r ) But U H (T ) < "r / - n + 2 Combining these two inequalities with eq. (30), we have , 1 /tt nU \ U) < (U + ;r.) - n + 1 v n+2' nU n + 2 Since go the ratio — , is bound by <*• > u - n + b Since u 2 ( n +b ) u» - n + 2 2 (n+b) 1 < -, < 2 -co' — for any priority list L and the bound is best possible. iH for b > 2, the theorem follows immediately. □ Corollary 1. For the case of b = 2 and n = 2, we have k2 k. PERFORMANCE OF PREEMPTIVE SCHEDULING ALGORITHMS In this section, we study algorithms which produce preemptive schedules with minimum completion time. By a preemptive schedule, we mean one in which the execution of a task may he interrupted before its completion. We assume that the cost associated with task preemption is negligible throughout our discussion. This assumption is justified in systems where task preemption does not require the reloading and removal of tasks in and out of the main memory. k.l Performance Improvement Obtainable by Preemptive Scheduling The bound on improvement in completion time of a set of tasks executed on a system (P = (n , n , . .., ri ; b , b , . .., b ) according to a preemptive schedule over that using an arbitrary priority-driven schedule can be obtained directly from Theorem 3.2. Indeed, this theorem can be restated as Theorem ^t-.l Let w be the completion time when a set of tasks {^7 , u, <) is executed on the system (P = (n.,, n , . .., ri ; b , b ? 
$, \ldots, b_k)$ according to a priority-driven schedule and $\omega_P$ be the completion time when the set is executed on $\mathcal{P}$ according to an optimal preemptive schedule. Then

$$\frac{\omega}{\omega_P} \le \frac{b_1}{b_k} + 1 - \frac{b_1}{\sum_{i=1}^{k} n_i b_i}$$

where $b_1 > b_2 > \cdots > b_k$. Moreover, the bound is best possible.

As noted in Section 3, for systems of large maximal throughput ($\sum_{i=1}^{k} n_i b_i$), this bound approaches $\frac{b_1}{b_k} + 1$. Hence, when the speed of the fastest processor in the system is very large compared to that of the slowest processor, a significant reduction in completion time of the set $(\mathcal{T}, \mu, <)$ can be achieved by allowing task preemption. Let us note again that in a priority-driven schedule, processors are never idle when there are tasks ready to be executed. We expect the improvement in completion time attained by preemption to be less for non-preemptive schedules in which processors are allowed to idle when there are executable tasks.

4.2 Performance of Preemptive Schedule for Independent Tasks

Let us for the moment consider the special case in which the precedence relation $<$ over the set of tasks is empty. That is, the tasks in $\mathcal{T} = \{T_1, T_2, \ldots, T_m\}$ are independent. Suppose that $\mu(T_1) \ge \mu(T_2) \ge \cdots \ge \mu(T_m)$. Let $\omega_P$ denote the completion time of the set $(\mathcal{T}, \mu)$ when it is executed on the system $\mathcal{P} = (1, n;\ b, 1)$ according to an optimal preemptive schedule. We have

Lemma 4.1

$$\omega_P \ge \max\left\{ \max_{1 \le j \le n} \frac{1}{b+j-1} \sum_{i=1}^{j} \mu(T_i),\ \ \frac{1}{n+b} \sum_{i=1}^{m} \mu(T_i) \right\}$$

Proof. It is clear that

$$\omega_P \ge \sum_{i=1}^{m} \frac{\mu(T_i)}{n+b}$$

since no schedule can have a completion time shorter than the one which keeps all of the processors busy. To show that

$$\omega_P \ge \max_{1 \le j \le n} \frac{1}{b+j-1} \sum_{i=1}^{j} \mu(T_i)$$

we note that $\mu(T_1)/b$ is equal to the time required to execute task $T_1$ on the fast processor; $\omega_P$ is equal to or larger than this quantity since there is no way to complete the execution of $T_1$ in shorter time. Similarly, $\frac{1}{b+1}[\mu(T_1) + \mu(T_2)]$ is equal to the shortest possible time to complete the execution of tasks $T_1$ and $T_2$, and
j [|i(T 1 ) + u(Tp) + ... + \i(T. )] is equal to the shortest possible time to complete the execution of tasks T, , T , . .., T. . Clearly, w is lower bounded by the maximum of these quantities since no schedule can have a completion time shorter than the time required to execute the i longest task on i processors. □ Indeed, the lower bound on the optimal completion time in Lemma h.l. can be achieved. In other words, we have Theorem k.2. co = max / max ^ l- w p) and "to the processor p in the time interval (0, t'). Similarly, we let the integer p, be such that where T„ = t' + E u(T ^ . ) and We assign the tasks T ,, T , . . . , T to thp P-L+. • -P^_! +1 Pl +P 2 +P i-1 +2 ' p^.. . tp^_! t0 tiie processor P and the task T to the processor P. in the time i> p l l> & interval (i g > w p ) and to the processor P„ 1 in the time interval (0, t«). We proceed in this manner until the m tasks are all assigned. The resultant schedule is as shown in Figure k-1 where T is completed at the time w^ on the processor P _ . P ntl Suppose that the execution times of some of the tasks are larger than w In particular, suppose that u(T-,) > u(Tp) > ... > \x(T ) > co We now show that m w p < Z p.(T i ) < hw p i=n+l hi >+8 Since and we obtain That m Z Ji(T.) i=i - = CO b+n P n Z H(T.) > nw i=l m Z u(T ) n and ^ v n+l y P \+l ~ b-1 Since bA. + u _ - A. = n(T. ) i = 1, 2, ..., n l P l^i h9 and n+1 E A. = w„ . -, i P the schedule in Figure k-2 is a valid one. Suppose that there is only j tasks with execution time longer than Up, and j < n. That is, ^(T-) > u(T„) > ... > u(T.) > u but u(T. t) < u In this case, we let T! , be a new task such that ^ v 3+l J - P 0+1 m u(T' ) = Z u(T.). We can use the schedule shown in Figure h-3> where ' J i=j+l X H(T ± ) - ^ p \ = £T[ i = 1* 2 , . . ., j and 3 t 3+1 P i=1 i Since b^ + w p - /\ ± = n(T ± ) i = 1, 2, ..., j f This can be done because . Z u(T ) - ju i=l A j+1 = W P " b^l J (b+j-1) Up - Z n(T ± ) _ i=l b-1 3 2 [i(T ) i=l follows from u > — . 
— P - b+j-1 > 51 3 ft 52 and V + w p"V + (n " J) w p = (la-1) A. +1 + (n-d+1) w p 2 n(T ± ) - jw p = (b-1) [«p - — ^1 3 + ( n "J +1 ) w t = (b+n) w p - Z n(T ± ) i=l m Z H(T.) i=j+l X the schedule in Figure k~3 is a valid one, Case 2. Suppose that hCf-l) "P ~ ~" b w. From eq. (k-l), we note that (l^) + n(T 2 ) u(T 1 ) b+1 < It follows that ■ (To) H(TJ 2' - b (M) Also, m u(T. ) Z — ±- i=2 n m u(T. ) ,, x n+b i=2 n+b n n+b n m u(T ) Z \l(\) Li=l n+b n+b 53 n+b — n ' k i(T 1 ) k i(T 1 )' - b n+b _ h(T-l) (WO Therefore, the schedule shown in Figure k-h can be used. In this schedule, T-, is assigned to the fast processor P and the. remaining tasks are assigned to P p , V , ..., P , . The relation in (h-3) and (h-k) guarantee the validity of the schedule. Case 3. 3 H(T.) for some j, 1 < j < n+1. It is clear that ^(T.)<03 p 1=1, 2, . .., j 0*-5) However, since which can be rewritten as j-l fi(T.) j-l n(T.) ifi b+ J- 2 b+j-l b+j-2 < «p (i(T.) oj - P b+j-1 we have w p < u(Tj) (U-6) ^ ef ef b/" On the other hand it follows from 55 0+1 n(T.) i=l b+o ~ P (That is, b+j-1 b+j 3 z Li=l ^ T i) n+b b+,1-1 < — < J P that Moreover, from H ^ «p m |i(T. ) E — < w . _ n+b - P i=l (W7) we have b+j-1 n+b J [i (Tj i=l b+j-1 n-j+1 n+b m Z i=J+l H(T. ) — r=r < w t, n-j+1 - P or v • n . , m ^(T. ) EStl u + n=fl+l z ULil < a, n+b P n+b . . . n-j+1 - P 1=0+1 which simplifies to m i-J+1 n-j+1 f,i (h-Q) Therefore, the schedule shown in Figure U-5 can be used. In this schedule, the tasks T,, T , ..., T. are assigned to the first j processors. The 1. d. J execution time of the portion of T. assigned to P. , A., is given by J P n(T. ) - t *1 b-1 56 u(T ) ... > u(T ). Let r be an integer such that [rbj + rn < m < [ (r-i-l)bj + (r+l) n (5-2) 59 where [xj denotes the integer part of x. We have, Theorem 5«1 When a set of tasks {f7, u) is executed in the system (P = (l, n; b, l), a set of weights {w,, w p , . .., w } that minimizes the mean flow time is 1 b w. 
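$$w_i = \begin{cases} \dfrac{i}{b}, & i = 1, 2, \ldots, \lfloor b \rfloor \\[2mm] 1, & i = \lfloor b \rfloor + 1, \lfloor b \rfloor + 2, \ldots, \lfloor b \rfloor + n \\[2mm] \dfrac{i - sn}{b}, & i = \lfloor sb \rfloor + sn + 1, \lfloor sb \rfloor + sn + 2, \ldots, \lfloor (s+1)b \rfloor + sn, \ \text{and } s = 1, 2, \ldots, r-1 \\[2mm] s, & i = \lfloor sb \rfloor + (s-1)n + 1, \lfloor sb \rfloor + (s-1)n + 2, \ldots, \lfloor sb \rfloor + sn, \ \text{and } s = 2, 3, \ldots, r \end{cases} \tag{5-3}$$

and

$$w_i = \frac{i - rn}{b}, \qquad i = \lfloor rb \rfloor + rn + 1, \lfloor rb \rfloor + rn + 2, \ldots, m$$

if $m - \lfloor rb \rfloor - rn \le \lfloor (r+1)b \rfloor - \lfloor rb \rfloor$, and

$$w_i = \begin{cases} \dfrac{i - rn}{b}, & i = \lfloor rb \rfloor + rn + 1, \lfloor rb \rfloor + rn + 2, \ldots, \lfloor (r+1)b \rfloor + rn \\[2mm] r+1, & i = \lfloor (r+1)b \rfloor + rn + 1, \lfloor (r+1)b \rfloor + rn + 2, \ldots, m \end{cases} \tag{5-4}$$

otherwise.

Proof. We show that there is a schedule for which the weights $w_i$ are those given by eqs. (5-3) and (5-4). To construct this schedule, we partition the tasks in $\mathcal{T}$ into $r+1$ blocks, $B_1, B_2, \ldots, B_{r+1}$, where

$$\begin{aligned}
B_1 &= \{T_1, T_2, \ldots, T_{\lfloor b \rfloor},\ T_{\lfloor b \rfloor + 1}, T_{\lfloor b \rfloor + 2}, \ldots, T_{\lfloor b \rfloor + n}\} \\
B_2 &= \{T_{\lfloor b \rfloor + n + 1}, T_{\lfloor b \rfloor + n + 2}, \ldots, T_{\lfloor 2b \rfloor + n},\ T_{\lfloor 2b \rfloor + n + 1}, \ldots, T_{\lfloor 2b \rfloor + 2n}\} \\
&\ \ \vdots \\
B_r &= \{T_{\lfloor (r-1)b \rfloor + (r-1)n + 1}, T_{\lfloor (r-1)b \rfloor + (r-1)n + 2}, \ldots, T_{\lfloor rb \rfloor + (r-1)n},\ T_{\lfloor rb \rfloor + (r-1)n + 1}, \ldots, T_{\lfloor rb \rfloor + (r-1)n + n - 1}, T_{\lfloor rb \rfloor + rn}\} \\
B_{r+1} &= \{T_{\lfloor rb \rfloor + rn + 1}, T_{\lfloor rb \rfloor + rn + 2}, \ldots, T_{\lfloor (r+1)b \rfloor + rn},\ T_{\lfloor (r+1)b \rfloor + rn + 1}, \ldots, T_m\}
\end{aligned}$$

In the schedule, the tasks $T_{\lfloor (r+1)b \rfloor + rn}, T_{\lfloor (r+1)b \rfloor + rn - 1}, \ldots, T_{\lfloor rb \rfloor + rn + 1}$ in $B_{r+1}$ are assigned to the fast processor in that order and the remaining tasks in $B_{r+1}$ are assigned to any $m - \lfloor (r+1)b \rfloor - rn$ of the standard processors, one in each processor. Next, tasks $T_{\lfloor rb \rfloor + (r-1)n}, T_{\lfloor rb \rfloor + (r-1)n - 1}, \ldots, T_{\lfloor (r-1)b \rfloor + (r-1)n + 1}$ in $B_r$ are assigned to the fast processor in that order while the remaining $n$ tasks in $B_r$ are assigned to the $n$ standard processors, one in each processor. This procedure is repeated until the tasks $T_{\lfloor b \rfloor}, T_{\lfloor b \rfloor - 1}, \ldots, T_2, T_1$ in $B_1$ are assigned to the fast processor and the remaining $n$ tasks in $B_1$ are assigned to the $n$ standard processors. The resultant schedule is shown in Figure 5-1a. We note that when $b = 1$, this schedule is just a shortest execution time first schedule.

To show that the set of weights $w_i$ in eqs.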
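(5-3) and (5-4) indeed minimizes the mean flow time $t$, we note that for any $q$ tasks assigned to the fast processor, the associated weights are

$$\frac{1}{b}, \frac{2}{b}, \ldots, \frac{q}{b}$$

Similarly, for any $q$ tasks assigned to a standard processor, the associated weights are

$$1, 2, \ldots, q$$

Hence, the smallest $m$ weights are

$$\frac{1}{b}, \frac{2}{b}, \ldots, \frac{\lfloor b \rfloor}{b},\ \underbrace{1, 1, \ldots, 1}_{n},\ \frac{\lfloor b \rfloor + 1}{b}, \frac{\lfloor b \rfloor + 2}{b}, \ldots, \frac{\lfloor 2b \rfloor}{b},\ \underbrace{2, 2, \ldots, 2}_{n},\ \ldots,\ \underbrace{r, r, \ldots, r}_{n},\ \frac{\lfloor rb \rfloor + 1}{b}, \frac{\lfloor rb \rfloor + 2}{b}, \ldots, \frac{\lfloor (r+1)b \rfloor}{b},\ (r+1), \ldots, (r+1)$$

which are exactly those given by eqs. (5-3) and (5-4). That the weighted sum in eq. (5-1) is minimized follows from the assignment of smaller weights to longer tasks.

† We assume that $m - \lfloor rb \rfloor - rn > \lfloor (r+1)b \rfloor - \lfloor rb \rfloor$. When we have $m - \lfloor rb \rfloor - rn \le \lfloor (r+1)b \rfloor - \lfloor rb \rfloor$, the tasks in $B_{r+1}$ are assigned to the fast processor. The resultant schedule is as shown in Figure 5-1b.

[Figure 5-1: (a) and (b), the schedules referred to in the proof of Theorem 5.1.]

The result in Theorem 5.1 can be generalized to the case when the set of tasks is executed on the system $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$. Suppose that $b_1 > b_2 > \cdots > b_k$, and let

$$d_i = \frac{b_i}{b_{i+1}}, \qquad i = 1, 2, \ldots, k-1 \tag{5-5}$$

We have

Theorem 5.2 When a set of tasks $(\mathcal{T}, \mu)$ is executed on the system $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$, the weights $w_1, w_2, \ldots, w_m$ that minimize the mean flow time are given by

$$\begin{aligned}
&\underbrace{\tfrac{1}{b_1}, \ldots, \tfrac{1}{b_1}}_{n_1},\ \underbrace{\tfrac{2}{b_1}, \ldots, \tfrac{2}{b_1}}_{n_1},\ \ldots,\ \underbrace{\tfrac{\lfloor d_1 \rfloor}{b_1}, \ldots, \tfrac{\lfloor d_1 \rfloor}{b_1}}_{n_1},\ \underbrace{\tfrac{1}{b_2}, \ldots, \tfrac{1}{b_2}}_{n_2}, \\
&\underbrace{\tfrac{\lfloor d_1 \rfloor + 1}{b_1}, \ldots, \tfrac{\lfloor d_1 \rfloor + 1}{b_1}}_{n_1},\ \underbrace{\tfrac{\lfloor d_1 \rfloor + 2}{b_1}, \ldots, \tfrac{\lfloor d_1 \rfloor + 2}{b_1}}_{n_1},\ \ldots,\ \underbrace{\tfrac{\lfloor 2d_1 \rfloor}{b_1}, \ldots, \tfrac{\lfloor 2d_1 \rfloor}{b_1}}_{n_1},\ \underbrace{\tfrac{2}{b_2}, \ldots, \tfrac{2}{b_2}}_{n_2},\ \ldots, \\
&\underbrace{\tfrac{\lfloor d_2 d_1 \rfloor}{b_1}, \ldots, \tfrac{\lfloor d_2 d_1 \rfloor}{b_1}}_{n_1},\ \underbrace{\tfrac{\lfloor d_2 \rfloor}{b_2}, \ldots, \tfrac{\lfloor d_2 \rfloor}{b_2}}_{n_2},\ \underbrace{\tfrac{1}{b_3}, \ldots, \tfrac{1}{b_3}}_{n_3},\ \ldots
\end{aligned} \tag{5-6}$$

The proof of this theorem, being similar to that of Theorem 5.1, is not repeated here.

5.2 Comparison of Systems

We observed in Sec. 3.4 that on the basis of the worst case performance of priority-driven schedules, a processor of speed $b$ is more desirable than $b$ processors of speed 1.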
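The weight argument behind Theorems 5.1 and 5.2 can be tried out numerically. The Python sketch below is an illustration only, not the authors' procedure. It assumes that a task executed k-th from the end on a processor of speed s carries the weight k/s (the pattern 1/b, 2/b, ... and 1, 2, ... used in the proof of Theorem 5.1), and that the quantity being minimized is the weighted sum of eq. (5-1) without normalization by m. The function names `smallest_weights` and `min_weighted_flow_sum` and the sample task lengths are hypothetical.

```python
import heapq
from fractions import Fraction


def smallest_weights(m, speeds):
    """Return the m smallest weights in nondecreasing order.

    Assumption: a processor of speed s offers the weights 1/s, 2/s, 3/s, ...
    (the k-th task from the end of its queue has weight k/s).
    Speeds are expected to be ints or Fractions so arithmetic stays exact.
    """
    # Heap entries: (next unused weight, position from the end, speed).
    heap = [(Fraction(1) / s, 1, s) for s in speeds]
    heapq.heapify(heap)
    weights = []
    for _ in range(m):
        w, k, s = heapq.heappop(heap)
        weights.append(w)
        # The same processor's next slot is one position further from the end.
        heapq.heappush(heap, (Fraction(k + 1) / s, k + 1, s))
    return weights


def min_weighted_flow_sum(times, speeds):
    """Pair the smallest weights with the longest tasks and sum the products."""
    longest_first = sorted(times, reverse=True)
    weights = smallest_weights(len(times), speeds)
    return sum(w * t for w, t in zip(weights, longest_first))


# Hypothetical example: six tasks, n = 2 standard processors, fast speed b = 2.
times = [5, 4, 3, 2, 1, 1]
n, b = 2, 2
t_fast = min_weighted_flow_sum(times, [b] + [1] * n)   # system (1, n; b, 1)
t_unit = min_weighted_flow_sum(times, [1] * (n + b))   # system (n+b; 1)
print(t_fast, t_unit)  # 15 18
```

For these six tasks the system with one processor of speed 2 and two of speed 1 yields a weighted sum of 15, against 18 for four processors of speed 1, which agrees with the direction of the comparison made in this section. Exact rational arithmetic via `Fraction` is a design choice of the sketch, so that ties between weights such as 2/2 and 1 are resolved without rounding.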
We now show that the same conclusion can be reached when the performance of two multiprocessor systems is compared on the basis of minimum mean flow time. Let us consider two multiprocessor systems $\mathcal{P} = (1, n;\ b, 1)$ and $\mathcal{P}' = (n+b;\ 1)$ where $b$ is an integer. Let $t$ and $t'$ denote the minimum mean flow times when a set of tasks $(\mathcal{T}, \mu)$ is executed on the systems $\mathcal{P}$ and $\mathcal{P}'$, respectively. Again, we suppose that $\mu(T_1) \ge \mu(T_2) \ge \cdots \ge \mu(T_m)$. We have

Theorem 5.3

$$t \le t' - \frac{b-1}{2} \left\lfloor \frac{m}{n+b} \right\rfloor \mu(T_m)$$

[...] (5-9)

Similarly, an optimal schedule for the set of tasks $(\mathcal{T}, \mu)$ on the system $\mathcal{P}'$ is shown in Figure 5-2b. Let

$$t' = \sum_{i=1}^{r+1} f_i'$$

where

$$f_i' = i\,[\mu(T_{(i-1)(n+b)+1}) + \mu(T_{(i-1)(n+b)+2}) + \cdots + \mu(T_{i(n+b)})], \qquad i = 1, 2, \ldots, r$$

and

$$f_{r+1}' = (r+1)\,[\mu(T_{r(n+b)+1}) + \mu(T_{r(n+b)+2}) + \cdots + \mu(T_m)]$$

The expressions for $f_i$ in eqs. (5-8) and (5-9) can be written as

$$f_i = f_i' - \sum_{j=1}^{b} \left(1 - \frac{j}{b}\right) \mu(T_{(i-1)(b+n)+j}), \qquad i = 1, 2, \ldots, r \tag{5-10}$$

Similarly,

$$f_{r+1} = f_{r+1}' - \sum_{j=1}^{\alpha} \left(1 - \frac{j}{b}\right) \mu(T_{r(b+n)+j}) \tag{5-11}$$

where

$$\alpha = \min[b,\ m - r(n+b)]$$

Substituting eqs. (5-10) and (5-11) into eq. (5-7), we obtain

$$t = t' - \sum_{i=1}^{r} \sum_{j=1}^{b} \left(1 - \frac{j}{b}\right) \mu(T_{(i-1)(b+n)+j}) - \sum_{j=1}^{\alpha} \left(1 - \frac{j}{b}\right) \mu(T_{r(b+n)+j})$$

Since $\mu(T_m) \le \mu(T_i)$ for $i = 1, 2, \ldots, m-1$, we have

$$t \le t' - \sum_{i=1}^{r} \sum_{j=1}^{b} \left(1 - \frac{j}{b}\right) \mu(T_m) - \sum_{j=1}^{\alpha} \left(1 - \frac{j}{b}\right) \mu(T_m) \le t' - r\,\frac{b-1}{2}\,\mu(T_m)$$

But

$$r = \left\lfloor \frac{m}{n+b} \right\rfloor$$

Therefore [...]

[...] 1972, pp. 200-213.

[2] T. C. Hu, "Parallel Sequencing and Assembly Line Problems," Operations Research, Vol. 9, No. 6, 1961, pp. 841-848.

[3] R. R. Muntz and E. G. Coffman, Jr., "Preemptive Scheduling of Real-Time Tasks on Multiprocessor Systems," Journal of the ACM, Vol. 17, No. 2, 1970, pp. 324-338. Also, "Optimal Preemptive Scheduling on Two-Processor Systems," IEEE Transactions on Computers, Vol. C-18, No. 11, 1969, pp. 1014-1020.

[4] R. L. Graham, "Bounds on Multiprocessing Timing Anomalies," SIAM J. on Applied Math, Vol. 17, No. 2, 1969, pp. 416-429.
Also, "Bounds for Certain Multiprocessing Anomalies," Bell System Tech. Journal, 1966, pp. 1563-1581.

[5] M. R. Garey and R. L. Graham, "Bounds for Multiprocessor Scheduling with Resource Constraints," (to appear).

BIBLIOGRAPHIC DATA

Report No.: UIUCDCS-R-74-632
Title and Subtitle: Bounds on Scheduling Algorithms for Heterogeneous Computing Systems
Report Date: June 1974
Author(s): Jane W. S. Liu and C. L. Liu
Performing Organization Name and Address: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
Contract/Grant No.: GJ-36265, GJ-41538
Sponsoring Organization Name and Address: National Science Foundation, Washington, D.C.

Abstracts: The problem of job scheduling in a computing system containing processors of different operation speeds is studied. In particular, bounds on the worst case performance of some scheduling algorithms that can be implemented easily are obtained. Such bounds also provide us with information concerning the effect of the speeds of the processors and the maximal throughput of the system on the performance of these scheduling algorithms. The trade-off between the speed and the number of processors in the system is also discussed. Optimal scheduling algorithms which produce preemptive schedules with minimal completion times and schedules with minimal mean flow times for independent tasks are described.

Key Words and Document Analysis. Descriptors: multiprocessor scheduling, non-preemptive scheduling, preemptive scheduling, priority-driven scheduling

Security Class (This Report): UNCLASSIFIED
Security Class (This Page): UNCLASSIFIED