Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/loadregulationdi537fitz 
 
 
 LOAD REGULATION AND DISPATCHING IN A NETWORK OF COMPUTERS 
 
 BY 
 JAMES T. FITZGERALD 
 
 August 1972 
 
 
UIUCDCS-R-72-537 
 
 
 Department of Computer Science 
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
 
 *This work was supported in part by Contract No. NSF GJ 28289 and was 
 submitted in partial fulfillment of the requirements for the Master 
 of Science degree, August 1972. 
 
 ACKNOWLEDGMENT 
 
 I would like to express my sincere gratitude to the 
 National Science Foundation whose Grant No. GJ 28289 financially 
 supported this work and to my adviser, Professor Edward K. Bowdon, Sr., 
 whose friendship and guidance supported me in other ways. A special 
thanks also goes to Mrs. Gayanne Carpenter, who typed this paper so
beautifully. But my biggest thanks must go to my family for whom this
 work was done. First, to my wife, Debbie, without whose encouragement, 
 I would never have finished, and secondly, to my daughter, Shannon, who 
 encouraged me just by being here. 
 
 PREFACE 
 
 This paper is aimed at developing tools to control efficiently 
 the flow of jobs and job traffic in a network of computers. Input of 
 jobs to each center is controlled by predetermined information based 
 on probabilities and stored in table form. These probabilities are 
 developed mathematically, predicated on the fact that we consider the 
 input rate to be a random variable capable of assuming any size. The 
table is then extended to handle the dispatching of jobs that must be
rerouted between different centers in the network and an efficient
 controller is thus developed. 
 
TABLE OF CONTENTS

ACKNOWLEDGMENT

PREFACE

1. INTRODUCTION

2. THE LOAD REGULATOR

2.1. Description of the Network and Load Regulator

2.2. The Probability With a Known $\lambda$

2.3. The Probabilities

2.4. Calculation of the $\alpha$ Change Factor

3. THE DISPATCHER

3.1. An Introduction to the Problem

3.2. The Communication Links

3.3. General Considerations for Rerouting

3.4. The Algorithm

3.5. The Load-Regulator-Dispatcher

4. CONCLUSION

APPENDIX

LIST OF REFERENCES
 
1. INTRODUCTION

"For which of you, wishing to build a tower, does not sit
down first and calculate the outlays that are necessary, whether he
has the means to complete it? Lest, after he has laid the foundation
and is not able to finish, all who behold begin to mock him, saying,
'This man began to build and was not able to finish!'"

St. Luke 14:28-29
 
 The Objectives 
 
 Two thousand years ago the importance of scheduling was 
 recognized. Having enough material to start and finish a job while not 
 having too much unused upon completion introduced the art of resource 
 allocation. Since then, much attention has been paid to the area of 
 schedules and to the algorithms which alleviate or at least somewhat 
 subdue the problems associated with scheduling jobs, or tasks, in a 
 particular environment. The amount of attention is due to the fact that 
 an ever increasing complexity in the type and number of jobs demanded 
 by a mechanically expanding society necessitates their fast and efficient 
 completion. One environment which by its very nature demands painfully 
 high levels of efficiency and the one towards which we turn our attention 
is the area of network computers.
 
The term network computers can mean different things to 
 different people. It could be the connection of two or more processors 
 at a single computer installation. Or it could mean the connection 
 by telephone wires of many geographically distant and distinct single 
 machines. We will combine both of these ideas in our definition. We 
 will use the term network computer to mean the connection by some 
 communication facility of geographically distant and distinct computing 
 centers, each of which has a number of processors and peripheral devices. 
 
 In recent years, schedulers have turned their attention to the 
 area of network computers in an attempt to more efficiently use the 
 tremendous computing power of such a system. Efforts, on their part, 
 have generated many priority assignment rules and scheduling algorithms 
 effecting fast and efficient throughput of jobs at a particular center. 
 They have had a tendency, though, to concentrate on just one center, 
forgetting, perhaps, that this is only a small part of the entire picture.
While we concede the importance of this work, we feel that some
scheduling problems as they relate to the entire network deserve
attention.
 
One area which has been all but neglected is that of load
leveling, or load regulation, for the entire network. It is not
unreasonable to expect that all jobs once accepted at a center will be run
to completion at that center. It may happen, however, that because of
mismanagement at or failure of a center, some jobs already accepted
into the network cannot be run at the center intended. Should the user
then be called and told that his job will be delayed or not run at all?
We think not. Management would frown if every effort were not made to
meet our obligations. We would like then to have these types of jobs
run at another center in our network so as still to meet the deadlines.
We would, however, still like to enjoy the benefits of making some profit
even though, through our own fault, it will be reduced.
 
Towards this goal, then, we develop in Chapter 2 a load regulation
scheme meeting our objectives, the most important of which is to minimize,
as much as possible, the chances of having this overloaded condition.
Upon discovering that an overloaded condition exists at one center, we
must choose another center that is capable of running the excess jobs.
This is the object of Chapter 3.
 
2. THE LOAD REGULATOR 
 
 2.1. Description of the Network and Load Regulator 
 
We speak of load regulation in a computer network as the
scheduling and routing of jobs so that available resources are used
efficiently to implement fast completion of all jobs. In a multiprocessor
system such as we deal with, each processor should be loaded all
the time to be used to best advantage. A processor sitting idle while
another has a full load or is even overloaded is not only sub-optimal
but also a gross mismanagement of an expensive asset, machine time.
Jobs originally scheduled to be run on a machine now overloaded should
be rerouted. We could ask, at this point, what happens to incoming
jobs when the entire system is already running at full capacity, but we
will leave this as a management consideration. Our concern will be with
the efficiency of our full system. The tool that we will use in this
controlling function we will dub, oddly enough, the load regulator.

We wish, then, to construct a feasible and workable load
regulator for the entire network: a regulator that will periodically
check each of the nodes in our system and immediately be able to determine
if that particular center is in danger of being overloaded or is, in fact,
already overloaded. After this status check, we would like our load
regulator to take the proper action, if necessary, to alleviate the
impending or present congestion. We will allow our regulator to make
the decision either to inhibit or reduce further input to the center or
to let the input flow continue at the present rate. We will not, at this
point, force our regulator to be concerned with the rerouting of jobs
whose entry it has refused.
 
Our network will consist of n computing centers with a varying
number of processors and peripheral equipment. Each processor in a
center draws its work from a common queue relevant only to that center
(i.e. an idle processor in another center may not draw work from the
queue of an overloaded center but rather must wait for it to be rerouted
to it). We will assume that the input to the queue of a particular center
is a Poisson process with an unknown random mean $\lambda$. Due to the
randomness, we will allow $\lambda$ to range from zero to infinity; despite
the unbounded input rate, we never really expect an infinite number
of jobs to be present at one time. Our load regulation will consist,
then, of periodically sampling the random queue size and, based
on predetermined information stored in table form, making the proper decision.
 
Again, since our queue size is a random variable, the information
stored in the decision table must be based on an estimate that is as
accurate as possible. Toward this goal, then, we will develop analytical tools
based on the probabilities of the queues reaching critical threshold
levels and then bound these probabilities by some arbitrary criterion
$\epsilon$, which will be small.
 
We begin by characterizing our computer network of n centers
by the following:

(1) Assume that the input to any center is a Poisson process with
unknown random mean $\lambda$.

(2) Assume that the service time in each center is an exponential
random variable with service rate $\mu$, so that the average service
time is $1/\mu$.

(3) Assume that each center has a finite storage capacity $c$.

(4) The queue size of each center is sampled at discrete instants of
time $K\delta$, $K = 0, 1, 2, \ldots$

(5) A delay time $\Delta$ is associated with the load regulator,
where $\Delta$ is the time elapsed between the issuing of a control order
and its implementation.

(6) Each center has a common queue of work relevant only to that center.

(7) Let $q(t)$ be the number of jobs in the queue at time $t$; then the
criterion of estimation is to keep

$$\text{Prob}[q(t) > c] < \epsilon \quad \text{for all } t > 0,$$

where $\epsilon$ is a given real number rather than zero.
 
For ease of computation, we will assume that each node is
of equal stature (i.e. each center in the network has approximately
equal computing power, equal speed, and on the average equal work loads).
Therefore, we can speak of the relationship between $\lambda$, $\mu$, $c$,
and $\Delta$ as being the same for each center. Since computing centers
and their effect on the entire network, unless identical, should not be
considered so, this equating of the $\lambda$, $\mu$, $c$, and $\Delta$
may seem like an unrealistic approach. Intuitively it is, until we realize
that a center far below the capacity, speed, and overall computing power
of the others would have a smaller queue and, therefore, accept fewer jobs,
jobs that were shorter, and jobs that required less sophisticated computing
power. In this sense, then, the $\lambda$, $\mu$, $c$, and $\Delta$ of a
large center in the network would be relatively and proportionally equal
to those of the smaller center. Since we are speaking of probabilities
relative to each separate center, this approach suits our purposes. For
each center, then, to use just the one decision table in the load regulator,
it is just a matter of a normalization factor with respect to the different
centers. To dissolve the problem of a consistent $\Delta$,
we will assume that the load regulator occupies roughly the geographic
center of our network and, therefore, that the communication delay times
are equal. Our network, therefore, will roughly assume the shape of a
ring (see Figure 2.1).
 
2.2. The Probability With a Known $\lambda$

Before proceeding with the development of the probabilities
for our system, we may do well to look at the probabilities for a similar
system when $\lambda$ is known. Predicating our system on an unknown
$\lambda$, as will be shown in Section 2.3, necessitates using complicated
mathematics to solve long and messy equations. Morse [1] shows us that
these messy equations are not needed if the input rate is defined or can be
approximated closely enough to suit the purpose.

[Figure 2.1. Diagram: a LOAD REGULATOR surrounded by six QUEUE nodes.]

When $\lambda$ is known, the probability of the queue at a center being
overloaded can be defined as follows. Let $P_n$ be the probability of n
units in the system. Then,

$$P_n = \frac{1 - \lambda/\mu}{1 - (\lambda/\mu)^{c+1}} \left(\frac{\lambda}{\mu}\right)^{n}, \qquad n = 1, 2, \ldots, c,$$

where $c$ is the capacity of the queue. In our case, the probability
of the queue being overloaded is just the probability that there are
$c+1$ in the queue, which is the same expression with the capacity taken
as $c+1$. Therefore,

$$P_{c+1} = \frac{1 - \lambda/\mu}{1 - (\lambda/\mu)^{c+2}} \left(\frac{\lambda}{\mu}\right)^{c+1},$$

which will simplify to

$$P_{c+1} = \frac{\lambda^{c+1}(\mu - \lambda)}{\mu^{c+2} - \lambda^{c+2}}.$$

We will see that this is much simpler to work with as we now move on to
describe the probabilities when $\lambda$ is unknown.
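As a small illustration of the closed form above, the following Python sketch evaluates the overload probability for a known input rate. The function name and the sample value of the input rate are assumptions made for this example only; the service rate 0.2 and capacity 5 are borrowed from the illustration later in this chapter.

```python
def overload_probability(lam: float, mu: float, c: int) -> float:
    """Probability of c+1 jobs in the system for a known input rate.

    lam: known arrival rate, mu: service rate, c: storage capacity.
    Implements lam^(c+1) (mu - lam) / (mu^(c+2) - lam^(c+2)).
    """
    if lam <= 0 or mu <= 0 or c < 0:
        raise ValueError("lam and mu must be positive and c non-negative")
    if lam == mu:
        # Limit of the closed form as lam -> mu: all c+2 states equally likely.
        return 1.0 / (c + 2)
    return lam ** (c + 1) * (mu - lam) / (mu ** (c + 2) - lam ** (c + 2))


# Hypothetical input rate 0.1 with mu = 0.2 and c = 5.
p = overload_probability(0.1, 0.2, 5)
```

With these values the ratio of rates is one half, so the result is (1/2)(1/64)/(1 - 1/128) = 1/127.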
 
2.3. The Probabilities

We begin our analysis by developing the probabilities which
we will need. The following approach to the derivations is due mainly
to Bailey [2] and Saaty [3]. Assume that initially $q(0) = 0$ and that
our first discrete sampling time ($K\delta$, $K = 0, 1, \ldots$) which will
have any meaning is $K = 1$. Therefore, our first sample is taken at time
$1 \cdot \delta$, or $\delta$, and the size of the sample is $m$. We wish
then to define the probability that there are $m$ jobs in the queue at
time $\delta$ given that there were none in the queue at time 0. We can
state this formally as

$$\text{Prob}[q(\delta) = m \mid q(0) = 0].$$

Remembering that the input to each center is a Poisson process with
unknown random mean $\lambda$ and that the service time is an exponential
random variable with an average service time of $1/\mu$, we define the
following:

$$\text{Prob}[1 \text{ arrival in time } \Delta t] = \lambda \Delta t \tag{1}$$

$$\text{Prob}[>1 \text{ arrivals in time } \Delta t] = 0 \tag{2}$$

$$\text{Prob}[1 \text{ service completion in time } \Delta t] = \mu \Delta t \tag{3}$$

$$\text{Prob}[>1 \text{ service completions in time } \Delta t] = 0 \tag{4}$$

where $\Delta t$ is a very small time interval and (2) and (4) tend to zero
in the limit as $\Delta t \to 0$, since the actual probabilities are
$O((\Delta t)^2)$ (read: on the order of $(\Delta t)^2$), which is too small
to be of any significance.
 
If we let

$$P_m(t) = \text{Prob}[m \text{ jobs in the system at time } t]$$

then

$P_m(t + \Delta t)$ = Prob [m in the system at time t and during
the interval t to t + $\Delta t$ no jobs arrived and no jobs were serviced] +
Prob [m-1 in the system at time t and during the interval t to t + $\Delta t$
one job arrived] + Prob [m+1 jobs in the system at time t and during
the interval t to t + $\Delta t$ one job was serviced].

Therefore,

$$P_m(t+\Delta t) = [1 - (\lambda+\mu)\Delta t]\, P_m(t) + P_{m-1}(t)\, \lambda \Delta t + P_{m+1}(t)\, \mu \Delta t + O((\Delta t)^2), \qquad m \ge 1 \tag{5}$$

and

$$P_0(t+\Delta t) = P_0(t)\,(1 - \lambda \Delta t) + P_1(t)\, \mu \Delta t + O((\Delta t)^2). \tag{6}$$

From the first equation (5), neglecting the $O((\Delta t)^2)$ terms,

$$P_m(t+\Delta t) = P_m(t) - (\lambda+\mu)\Delta t\, P_m(t) + P_{m-1}(t)\, \lambda \Delta t + P_{m+1}(t)\, \mu \Delta t$$

$$P_m(t+\Delta t) - P_m(t) = -(\lambda+\mu)\Delta t\, P_m(t) + P_{m-1}(t)\, \lambda \Delta t + P_{m+1}(t)\, \mu \Delta t$$

$$\frac{P_m(t+\Delta t) - P_m(t)}{\Delta t} = -(\lambda+\mu) P_m(t) + P_{m-1}(t)\, \lambda + P_{m+1}(t)\, \mu, \qquad m \ge 1$$

$$\lim_{\Delta t \to 0} \frac{P_m(t+\Delta t) - P_m(t)}{\Delta t} = P_m'(t) = \frac{dP_m(t)}{dt}.$$
 
Therefore,

$$\frac{dP_m(t)}{dt} = -(\lambda+\mu) P_m(t) + P_{m-1}(t)\, \lambda + P_{m+1}(t)\, \mu, \qquad m \ge 1 \tag{7}$$

and for $m = 0$ we get, from

$$P_0(t+\Delta t) = P_0(t)\,(1 - \lambda \Delta t) + P_1(t)\, \mu \Delta t,$$

$$\frac{dP_0(t)}{dt} = -\lambda P_0(t) + \mu P_1(t). \tag{8}$$

For $P_m(t)$ the corresponding probability generating function is

$$\Pi(z,t) = \sum_{m=0}^{\infty} P_m(t)\, z^m. \tag{9}$$

Multiplying (7) and (8) by $z^{m+1}$ and $z$, respectively, summing over all
values of $m$ and then using (9), shows that $\Pi(z,t)$ satisfies the partial
differential equation

$$z \frac{\partial \Pi(z,t)}{\partial t} = (1-z)\left\{(\mu - \lambda z)\, \Pi(z,t) - \mu P_0(t)\right\}. \tag{10}$$
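Equations (7) and (8) can also be integrated numerically as a cross-check on the analytical development that follows. The sketch below is not part of the derivation in this paper: it uses a plain forward-Euler step and truncates the state space at a hypothetical `max_jobs`, both of which are assumptions introduced for illustration, as are the function name and the sample rates.

```python
def integrate_queue(lam, mu, t_end, dt=0.01, max_jobs=60, initial_jobs=0):
    """Forward-Euler integration of equations (7) and (8).

    Returns the list [P_0(t_end), ..., P_max_jobs(t_end)].
    The state space is cut off at max_jobs (an assumption of this sketch):
    arrivals that would exceed max_jobs are simply not counted.
    """
    p = [0.0] * (max_jobs + 1)
    p[initial_jobs] = 1.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        new = p[:]
        for m in range(max_jobs + 1):
            inflow = 0.0
            outflow = 0.0
            if m > 0:
                inflow += lam * p[m - 1]   # an arrival moves m-1 -> m
                outflow += mu * p[m]       # a service completion moves m -> m-1
            if m < max_jobs:
                inflow += mu * p[m + 1]    # a service completion moves m+1 -> m
                outflow += lam * p[m]      # an arrival moves m -> m+1
            new[m] = p[m] + dt * (inflow - outflow)
        p = new
    return p


# Hypothetical rates: lam = 0.05, mu = 0.2, starting from an empty queue.
probs = integrate_queue(0.05, 0.2, t_end=200.0, dt=0.05, max_jobs=30)
```

Because every unit of probability leaving one state enters a neighboring state, the entries of the returned list keep summing to one, and for an input rate below the service rate the first entry settles near $1 - \lambda/\mu$ for large times.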
 
We will continue the derivation under the assumption that initially there
are $i$ jobs in the system we are looking at, where $i \ge 0$.

If initially we have $t = 0$ and a total of $i$ jobs in the system,
we have the initial condition

$$\Pi(z,0) = z^i. \tag{11}$$
 
 
Using the Laplace transform with respect to time

$$\phi^*(s) = \int_0^{\infty} e^{-st}\, \phi(t)\, dt, \qquad R(s) > 0 \tag{12}$$

and its inverse

$$\phi(t) = \frac{1}{2\pi i} \int_{C - i\infty}^{C + i\infty} e^{st}\, \phi^*(s)\, ds, \tag{13}$$

applying (12) to (10) and using (11) gives

$$\Pi^*(z,s) = \frac{z^{i+1} - \mu (1-z) P_0^*(s)}{sz - (1-z)(\mu - \lambda z)}. \tag{14}$$

It follows from the definition of the Laplace transform of
$\Pi(z,t)$ that $\Pi^*(z,s)$ must converge everywhere within the unit circle
$|z| = 1$, provided $R(s) > 0$. Thus in this region, zeros of both the
numerator and the denominator on the right-hand side of (14) must
coincide. The zeros of the denominator, $\alpha_K(s)$, are obtained from the
equation

$$\alpha_K(s) = \frac{(\lambda + \mu + s) \pm \left[(\lambda + \mu + s)^2 - 4\lambda\mu\right]^{1/2}}{2\lambda}, \qquad K = 1, 2, \tag{15}$$

with the plus sign taken for $K = 1$ and the minus sign for $K = 2$.
By Rouché's theorem the denominator $[sz - (1-z)(\mu - \lambda z)]$ has only
one zero in the unit circle, and it can be seen from (15) that
$|\alpha_2(s)| < |\alpha_1(s)|$. Thus the numerator on the right-hand side of
(14) must vanish when $z = \alpha_2(s)$.

Hence,

$$P_0^*(s) = \frac{\alpha_2^{i+1}}{\mu (1 - \alpha_2)}$$
 
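and (14) can be written as

$$\Pi^*(z,s) = \frac{z^{i+1} - (1-z)\, \alpha_2^{i+1} / (1 - \alpha_2)}{-\lambda\, (z - \alpha_1)(z - \alpha_2)}. \tag{16}$$

Multiplying the numerator and denominator by $(1 - \alpha_2)$ and factoring, we get

$$\Pi^*(z,s) = \frac{(z - \alpha_2)(z^i + \alpha_2 z^{i-1} + \cdots + \alpha_2^i) - z \alpha_2 (z - \alpha_2)(z^{i-1} + \alpha_2 z^{i-2} + \cdots + \alpha_2^{i-1})}{\lambda\, \alpha_1 (z - \alpha_2)(1 - z/\alpha_1)(1 - \alpha_2)}$$

and since $(1 - z/\alpha_1)^{-1} = \sum_{K=0}^{\infty} (z/\alpha_1)^K$, adding and
subtracting $\alpha_2^{i+1}$, we get

$$\Pi^*(z,s) = \frac{1}{\lambda \alpha_1} \left(z^i + \alpha_2 z^{i-1} + \cdots + \alpha_2^i\right) \sum_{K=0}^{\infty} (z/\alpha_1)^K + \frac{\alpha_2^{i+1}}{\lambda \alpha_1 (1 - \alpha_2)} \sum_{K=0}^{\infty} (z/\alpha_1)^K.$$

$P_m^*(s)$ is the coefficient of $z^m$ in the expansion; hence for $m \ge i$,
using $\alpha_1 \alpha_2 = \mu/\lambda$ and $|\alpha_2| < 1$,

$$P_m^*(s) = \frac{1}{\lambda} \left[ \frac{1}{\alpha_1^{m-i+1}} + \frac{\mu/\lambda}{\alpha_1^{m-i+3}} + \cdots + \frac{(\mu/\lambda)^i}{\alpha_1^{m+i+1}} + \left(\frac{\lambda}{\mu}\right)^{m+1} \sum_{K=m+i+2}^{\infty} \left(\frac{\mu}{\lambda}\right)^{K} \frac{1}{\alpha_1^{K}} \right].$$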
 
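Now, taking the inverse

$$P_m(t) = \frac{1}{2\pi i} \int_{C - i\infty}^{C + i\infty} e^{st}\, P_m^*(s)\, ds$$

we get

$$P_m(t) = \frac{e^{-(\lambda+\mu)t}}{\lambda} \Big[ \left(\sqrt{\lambda/\mu}\right)^{m-i+1} (m-i+1)\, t^{-1}\, I_{m-i+1}\!\left(2\sqrt{\lambda\mu}\, t\right) + (\mu/\lambda) \left(\sqrt{\lambda/\mu}\right)^{m-i+3} (m-i+3)\, t^{-1}\, I_{m-i+3}\!\left(2\sqrt{\lambda\mu}\, t\right) + \cdots + (\mu/\lambda)^{i} \left(\sqrt{\lambda/\mu}\right)^{m+i+1} (m+i+1)\, t^{-1}\, I_{m+i+1}\!\left(2\sqrt{\lambda\mu}\, t\right) + (\lambda/\mu)^{m+1} \sum_{K=m+i+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} K\, t^{-1}\, I_K\!\left(2\sqrt{\lambda\mu}\, t\right) \Big]$$

where $I_\nu(\cdot)$ is a modified Bessel function of the first kind, and
substituting

$$\frac{2\nu}{z}\, I_\nu(z) = I_{\nu-1}(z) - I_{\nu+1}(z)$$

and simplifying, we get

$$P_m(t) = e^{-(\lambda+\mu)t} \Big[ \left(\sqrt{\mu/\lambda}\right)^{i-m} I_{m-i}\!\left(2\sqrt{\lambda\mu}\, t\right) + \left(\sqrt{\mu/\lambda}\right)^{i-m+1} I_{m+i+1}\!\left(2\sqrt{\lambda\mu}\, t\right) + (1 - \lambda/\mu)(\lambda/\mu)^{m} \sum_{K=m+i+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left(2\sqrt{\lambda\mu}\, t\right) \Big] \tag{17}$$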
 
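which is the equation we sought. Finally, substituting $i = 0$
(initially we have zero) and $\delta$ for our time, we get

$$\text{Prob}[q(\delta) = m \mid q(0) = 0] = e^{-(\lambda+\mu)\delta} \Big[ \left(\sqrt{\mu/\lambda}\right)^{-m} I_m\!\left(2\sqrt{\lambda\mu}\, \delta\right) + \left(\sqrt{\mu/\lambda}\right)^{1-m} I_{m+1}\!\left(2\sqrt{\lambda\mu}\, \delta\right) + (1 - \lambda/\mu)(\lambda/\mu)^{m} \sum_{K=m+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left(2\sqrt{\lambda\mu}\, \delta\right) \Big] \tag{18}$$

The above probability (18) defines our chances of finding
$m$ jobs in the queue at any of the centers in our network given that
there were none to start with. We have shown, however, that $q(0) = 0$
is not a necessary restriction and that we may start with any number in
the system.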
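To make the transient probability concrete, the following Python sketch evaluates the general form (17) of the result, of which (18) is the case i = 0, for a fixed, known input rate. It is only an illustration under stated assumptions: the Bessel function is computed from its power series with a fixed number of terms, the infinite sum is truncated at a hypothetical `tail_terms`, the order of the first Bessel function is taken in absolute value (relying on $I_{-n} = I_n$ for integer n), and the function names and sample rates are not from the paper.

```python
import math


def bessel_i(order, x, terms=80):
    """Modified Bessel function of the first kind, integer order >= 0.

    Power series: sum over k of (x/2)^(2k+order) / (k! (k+order)!).
    Adequate for the small-to-moderate arguments used in this sketch.
    """
    half = x / 2.0
    term = half ** order / math.factorial(order)
    total = 0.0
    for k in range(terms):
        total += term
        term *= half * half / ((k + 1) * (k + 1 + order))
    return total


def transient_prob(m, t, lam, mu, i=0, tail_terms=80):
    """Probability of m jobs at time t given i jobs at time 0, per (17)."""
    x = 2.0 * math.sqrt(lam * mu) * t
    r = math.sqrt(mu / lam)
    first = r ** (i - m) * bessel_i(abs(m - i), x)
    second = r ** (i - m + 1) * bessel_i(m + i + 1, x)
    start = m + i + 2
    tail = sum(r ** k * bessel_i(k, x) for k in range(start, start + tail_terms))
    third = (1.0 - lam / mu) * (lam / mu) ** m * tail
    return math.exp(-(lam + mu) * t) * (first + second + third)


# Hypothetical rates lam = 0.1, mu = 0.2; distribution of the queue at t = 5.
dist = [transient_prob(m, 5.0, 0.1, 0.2) for m in range(41)]
```

A quick sanity check on such an implementation is that the probabilities over all m add up to one, and that for very small t the probability of one job is close to the input rate times t.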
 
We now know the probability of finding $m$ jobs in the queue
at time $\delta$. What is even more important for us to know, at this point,
is whether or not $m$ is greater than $c$ (i.e. are there more jobs in the
queue than the system can handle). If $m > c$ we would want either to
prohibit further input of jobs to that node or at least to reduce it
until the overloaded condition was no longer present. We are, therefore,
interested in knowing the probability, upon finding $m$ jobs in the queue
at time $\delta$, that the queue exceeds the capacity for the center. Formally,
we want

$$\text{Prob}[q(t) > c \mid q(\delta) = m]$$
 
Since $\lambda$ is a random variable it is allowed to vary over the range
$\lambda \in (0, \infty)$, and even the best means to estimate $\lambda$ will
often be grossly in error. Rather than depend on this somewhat unstable
statistic, an average value of the quantity under consideration,
Prob $[q(t) > c]$, will be derived, and this estimate will minimize, though
by no means eliminate, the error inherent in this calculation. Let
$P_{q(\delta,m,0)}(\lambda)$ be the normalized
Prob $[q(\delta) = m \mid q(0) = 0]$ such that

$$\int_0^{\infty} P_{q(\delta,m,0)}(\lambda)\, d\lambda = 1;$$

then from the previous probability derivation with substitution it should
be clear that for $t > \delta$, we have

$$\text{Prob}[q(t) > c \mid q(\delta) = m] = \sum_{n=c}^{\infty} \int_0^{\infty} P_{q(\delta,m,0)}(\lambda)\, e^{-(\lambda+\mu)(t-\delta)} \Big\{ \left(\sqrt{\mu/\lambda}\right)^{m-n} I_{n-m}\!\left[2\sqrt{\lambda\mu}\,(t-\delta)\right] + \left(\sqrt{\mu/\lambda}\right)^{m-n+1} I_{n+m+1}\!\left[2\sqrt{\lambda\mu}\,(t-\delta)\right] + (1 - \lambda/\mu)(\lambda/\mu)^{n} \sum_{K=n+m+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left[2\sqrt{\lambda\mu}\,(t-\delta)\right] \Big\}\, d\lambda \tag{19}$$

We might note that $1 - \text{Prob}[q(t) > c \mid q(\delta) = m]$ is the
probability that the queue size is within efficient limits.
 
2.4. Calculation of the $\alpha$ Change Factor

Before continuing with the probabilities we will look at some
general considerations that must be taken into account. Remembering
that the load regulator has a decision delay time of $\Delta$, we note that no
change in the input rate (either a blocking or reducing change) can be
achieved before $\Delta + \delta$. This is because our first sampling time of
any importance is $K\delta$, where $K = 1$. Thus if the probability in
equation (19) exceeds $\epsilon$ at any time in the range
$(\delta, \Delta + \delta)$, the load regulator will not change $\lambda$ in
time to correct it. Furthermore, it should be noted that if no change is
ordered at time $\delta$, then the earliest possible time for effecting a
change would be $\Delta + 2\delta$ ($\Delta$ time units later than the next
observation). Therefore, the present input rate $\lambda$ should be maintained
if and only if Prob $[q(t) > c \mid q(\delta) = m] < \epsilon$ for all $t$ in
the range $(\delta, \Delta + 2\delta)$. If a change is necessary it should be
made so as to guarantee that the criterion $\epsilon$ is met only for $t$ in
the range $(\delta, \Delta + 2\delta)$, since at time $2\delta$ we may order
another change if needed. We conclude, therefore, that we wish to order a
change in the input rate $\lambda$ if and only if

$$\text{Prob}[q(t) > c \mid q(\delta) = m] > \epsilon, \quad \text{for } t \in (\Delta + \delta,\ \Delta + 2\delta).$$

If by chance

$$\text{Prob}[q(t) > c \mid q(\delta) = m] > \epsilon, \quad \text{for } t \in (\delta,\ \Delta + \delta),$$

which could happen, then our criterion will be violated.
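The decision rule above can be sketched in a few lines of Python. This is an illustrative sketch only: `prob_exceeds` is a hypothetical callable standing in for the probability of equation (19) as a function of t, and scanning the open interval on a fixed grid of `samples` points is an assumption of the example, not a procedure given in the paper.

```python
def needs_change(prob_exceeds, delta, Delta, eps, samples=200):
    """Return True if a change in the input rate should be ordered.

    prob_exceeds: hypothetical callable t -> Prob[q(t) > c | q(delta) = m].
    The open interval (Delta + delta, Delta + 2*delta) is scanned on an
    evenly spaced grid; a change is ordered if the bound eps is exceeded
    at any grid point.
    """
    lo = Delta + delta
    hi = Delta + 2.0 * delta
    step = (hi - lo) / (samples + 1)
    for j in range(1, samples + 1):
        if prob_exceeds(lo + j * step) > eps:
            return True
    return False


# Toy stand-in for equation (19): a constant overload probability of 0.02,
# checked against the bound 0.01 with delta = 0.05 and Delta = 0.045.
order_change = needs_change(lambda t: 0.02, delta=0.05, Delta=0.045, eps=0.01)
```

A violation that occurs only before $\Delta + \delta$ is deliberately ignored by this check, since no order issued at time $\delta$ could take effect in time to correct it.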
 
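If a change is necessary and ordered, we will then change
$\lambda$ to $\alpha\lambda$, for $0 \le \alpha \le 1$, where $\alpha = 1$ means
no change, $\alpha = 0$ means a complete halt to all input to that center, and
$\alpha$ in between is some reduction factor. Assume that
Max Prob $[q(t) > c \mid q(\delta) = m]$, for $t$ in the range
$(\Delta + \delta, \Delta + 2\delta)$, occurs at time $t = t_0$ and that we need
a change (i.e. the probability is greater than $\epsilon$). We then compute
the change factor $\alpha$ in the following way.

For $q(0) = 0$:

Let $\theta = t_0 - (\Delta + \delta)$; then

$$\text{Prob}[q(t_0) > c \mid q(\delta) = m,\ \lambda \text{ was changed to } \alpha\lambda \text{ at } \Delta + \delta] = \sum_{n=c}^{\infty} \int_0^{\infty} \sum_{m_1=0}^{\infty} \text{Prob}[q(\Delta+\delta) = m_1 \mid q(\delta) = m] \cdot \text{Prob}[q(t_0) = n \mid q(\Delta+\delta) = m_1]\, P_{q(\delta,m,0)}(\lambda)\, d\lambda \tag{20}$$

From this we get

$$\sum_{n=c}^{\infty} \int_0^{\infty} \sum_{m_1=0}^{\infty} \Big\{ e^{-(\lambda+\mu)\Delta} \Big[ \left(\sqrt{\mu/\lambda}\right)^{m-m_1} I_{m_1-m}\!\left(2\sqrt{\lambda\mu}\, \Delta\right) + \left(\sqrt{\mu/\lambda}\right)^{m-m_1+1} I_{m_1+m+1}\!\left(2\sqrt{\lambda\mu}\, \Delta\right) + (1 - \lambda/\mu)(\lambda/\mu)^{m_1} \sum_{K=m+m_1+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left(2\sqrt{\lambda\mu}\, \Delta\right) \Big] \Big\} \cdot \Big\{ e^{-(\alpha\lambda+\mu)\theta} \Big[ \left(\sqrt{\mu/\alpha\lambda}\right)^{m_1-n} I_{n-m_1}\!\left(2\sqrt{\alpha\lambda\mu}\, \theta\right) + \left(\sqrt{\mu/\alpha\lambda}\right)^{m_1-n+1} I_{n+m_1+1}\!\left(2\sqrt{\alpha\lambda\mu}\, \theta\right) + (1 - \alpha\lambda/\mu)(\alpha\lambda/\mu)^{n} \sum_{K=n+m_1+2}^{\infty} \left(\sqrt{\mu/\alpha\lambda}\right)^{K} I_K\!\left(2\sqrt{\alpha\lambda\mu}\, \theta\right) \Big] \Big\}\, P_{q(\delta,m,0)}(\lambda)\, d\lambda = \epsilon \tag{21}$$

the derivation being similar to what we did before.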
 
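For $q(0) \ne 0$:

Here we wish to make our decision at some time $K\delta$, with some
number of jobs $i$ already present in the center when we checked at time
$[K-1]\delta$. Let $q(K\delta) = m$, and let $q([K-1]\delta) = i$. Then if no
change was ordered at time $[K-1]\delta$, from (18) we have that

$$\text{Prob}[q(K\delta) = m \mid q([K-1]\delta) = i] = e^{-(\lambda+\mu)\delta} \Big\{ \left(\sqrt{\mu/\lambda}\right)^{i-m} I_{m-i}\!\left(2\sqrt{\lambda\mu}\, \delta\right) + \left(\sqrt{\mu/\lambda}\right)^{i-m+1} I_{m+i+1}\!\left(2\sqrt{\lambda\mu}\, \delta\right) + (1 - \lambda/\mu)(\lambda/\mu)^{m} \sum_{K=m+i+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left(2\sqrt{\lambda\mu}\, \delta\right) \Big\} \tag{22}$$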
 
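If a change was ordered at time $[K-1]\delta$, we have

$$\text{Prob}[q(K\delta) = m \mid q([K-1]\delta) = i] = \sum_{m_1=0}^{\infty} \Big\{ e^{-(\lambda+\mu)\Delta} \Big[ \left(\sqrt{\mu/\lambda}\right)^{i-m_1} I_{m_1-i}\!\left(2\sqrt{\lambda\mu}\, \Delta\right) + \left(\sqrt{\mu/\lambda}\right)^{i-m_1+1} I_{m_1+i+1}\!\left(2\sqrt{\lambda\mu}\, \Delta\right) + (1 - \lambda/\mu)(\lambda/\mu)^{m_1} \sum_{k=m_1+i+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{k} I_k\!\left(2\sqrt{\lambda\mu}\, \Delta\right) \Big] \Big\} \cdot \Big\{ e^{-(\alpha(K-1)\lambda+\mu)(\delta-\Delta)} \Big[ \left(\sqrt{\mu/\alpha(K-1)\lambda}\right)^{m_1-m} I_{m-m_1}\!\left(2\sqrt{\alpha(K-1)\lambda\mu}\,(\delta-\Delta)\right) + \left(\sqrt{\mu/\alpha(K-1)\lambda}\right)^{m_1-m+1} I_{m+m_1+1}\!\left(2\sqrt{\alpha(K-1)\lambda\mu}\,(\delta-\Delta)\right) + \left(1 - \alpha(K-1)\lambda/\mu\right)\left(\alpha(K-1)\lambda/\mu\right)^{m} \sum_{k=m+m_1+2}^{\infty} \left(\sqrt{\mu/\alpha(K-1)\lambda}\right)^{k} I_k\!\left(2\sqrt{\alpha(K-1)\lambda\mu}\,(\delta-\Delta)\right) \Big] \Big\} \tag{23}$$

where $\Delta < \delta$, $\alpha(K-1)$ is the change factor ordered at time
$[K-1]\delta$, and the summation index is written $k$ to distinguish it from
the sample index $K$. Again, because of the random $\lambda$, we will minimize
the error in computing Prob $[q(t) > c]$ by normalizing.

Let $P_{q(K\delta,m,i)}(\lambda)$ be the normalized
Prob $[q(K\delta) = m \mid q([K-1]\delta) = i]$ such that

$$\int_0^{\infty} P_{q(K\delta,m,i)}(\lambda)\, d\lambda = 1;$$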
 
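then for $t > K\delta$, we have

$$\text{Prob}[q(t) > c \mid q(K\delta) = m,\ q([K-1]\delta) = i] = \sum_{n=c}^{\infty} \int_0^{\infty} P_{q(K\delta,m,i)}(\lambda)\, e^{-(\lambda+\mu)(t-K\delta)} \Big\{ \left(\sqrt{\mu/\lambda}\right)^{m-n} I_{n-m}\!\left[2\sqrt{\lambda\mu}\,(t-K\delta)\right] + \left(\sqrt{\mu/\lambda}\right)^{m-n+1} I_{n+m+1}\!\left[2\sqrt{\lambda\mu}\,(t-K\delta)\right] + (1 - \lambda/\mu)(\lambda/\mu)^{n} \sum_{K=n+m+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left[2\sqrt{\lambda\mu}\,(t-K\delta)\right] \Big\}\, d\lambda \tag{24}$$

As before, we wish to change $\lambda$ if and only if

$$\text{Prob}[q(t) > c \mid q(K\delta) = m,\ q([K-1]\delta) = i] > \epsilon$$

for $t$ in the range $(K\delta + \Delta, [K+1]\delta + \Delta)$. Assume that the
maximum over this range occurs at $t = t_0$. Now let
$\theta = t_0 - (K\delta + \Delta)$. Then

$$\sum_{n=c}^{\infty} \int_0^{\infty} \sum_{m_1=0}^{\infty} \Big\{ e^{-(\lambda+\mu)\Delta} \Big[ \left(\sqrt{\mu/\lambda}\right)^{m-m_1} I_{m_1-m}\!\left(2\sqrt{\lambda\mu}\, \Delta\right) + \left(\sqrt{\mu/\lambda}\right)^{m-m_1+1} I_{m_1+m+1}\!\left(2\sqrt{\lambda\mu}\, \Delta\right) + (1 - \lambda/\mu)(\lambda/\mu)^{m_1} \sum_{K=m+m_1+2}^{\infty} \left(\sqrt{\mu/\lambda}\right)^{K} I_K\!\left(2\sqrt{\lambda\mu}\, \Delta\right) \Big] \Big\} \cdot \Big\{ e^{-(\alpha\lambda+\mu)\theta} \Big[ \left(\sqrt{\mu/\alpha\lambda}\right)^{m_1-n} I_{n-m_1}\!\left(2\sqrt{\alpha\lambda\mu}\, \theta\right) + \left(\sqrt{\mu/\alpha\lambda}\right)^{m_1-n+1} I_{n+m_1+1}\!\left(2\sqrt{\alpha\lambda\mu}\, \theta\right) + (1 - \alpha\lambda/\mu)(\alpha\lambda/\mu)^{n} \sum_{K=n+m_1+2}^{\infty} \left(\sqrt{\mu/\alpha\lambda}\right)^{K} I_K\!\left(2\sqrt{\alpha\lambda\mu}\, \theta\right) \Big] \Big\}\, P_{q(K\delta,m,i)}(\lambda)\, d\lambda = \epsilon \tag{25}$$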
 
We can see that solving (21), (23), and (25) for $\alpha$ is at best a tedious
and complicated operation. We can console ourselves by the fact that
the $\alpha$ change factors need only be computed once and then stored away in
the table of the load regulator; after this it is just a referencing
operation. It could also be noted, at this point, though it should be
obvious, that for each value of $m$ observed at $K\delta$ in the range $(0, c)$ a
decision with regard to the input rate $\lambda$ can be computed from the previous
value of $i$ observed at $[K-1]\delta$. We can then generate our table
in the following form:
 
q(K)     q(K-1)     $\alpha$ Change Factor

0        0          $\alpha(0)$
         1          $\alpha(1)$
         .          .
         .          .
         c          $\alpha(c)$
.        .          .
.        .          .
c        0          $\alpha(0)$
         .          .
         .          .
         c          $\alpha(c)$
 
For purposes of illustration, a table was generated according
to the following criteria:

(1) Storage capacity, $c = 5$

(2) Average service time = 5 units of time; therefore, $\mu = 0.2$

(3) Queue scan at $K\delta$, $K = 0, 1, 2, \ldots$, with $\delta = .05$ units of time

(4) Delay time $\Delta = .045$ units of time

(5) Keep the probability that the queue size exceeds $c$ bounded
below $\epsilon$, where $\epsilon = .01$.

The first column of the table, q(K), is the observed sample at
time $K\delta$; the second column of the table, q(K-1), is the observed sample
at $(K-1)\delta$. The third column of the table is the required multiplicative
input change factor $\alpha$.
 
We can see from the table that, with the parameters chosen as they were,
the load regulation scheme's decision is highly dependent on the present
observed queue size. With only a few exceptions, the first half of the
possible queue samples (0, 1, and 2) require an $\alpha$ change of 1 (i.e. no
change), while the last half (3, 4, and 5) require prohibition of all
further inputs. Upon expanding this table for bigger queues we will find
that this is also the case. We can reasonably expect that when the queue
is empty to approximately half full there will be no change, that around
half full the input will suffer some reduction, and that from approximately
half full to full the input will be completely prohibited.
 
q(K)     q(K-1)     $\alpha$ Change Factor

0        0          one
0        1          one
0        2          one
0        3          one
0        4          one
0        5          one

1        0          one
1        1          one
1        2          one
1        3          one
1        4          one
1        5          one

2        0          .54
2        1          .87
2        2          one
2        3          one
2        4          one
2        5          one

3        0          zero
3        1          zero
3        2          zero
3        3          zero
3        4          zero
3        5          .32

4        0          zero
4        1          zero
4        2          zero
4        3          zero
4        4          zero
4        5          zero

5        0          zero
5        1          zero
5        2          zero
5        3          zero
5        4          zero
5        5          zero

$c = 5$, $\mu = 0.2$, $\epsilon = .01$, $\Delta = .045$, $\delta = .05$

Table 2.1.
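Once the table is stored, the regulator's run-time work is a single lookup. The Python sketch below illustrates this with the values of Table 2.1; the dictionary layout and the function names `build_table` and `regulate` are assumptions made for the example and are not prescribed by the paper.

```python
def build_table():
    """Table 2.1 as a dictionary keyed by (q(K), q(K-1))."""
    table = {}
    for prev in range(6):
        table[(0, prev)] = 1.0   # "one": no change
        table[(1, prev)] = 1.0
        table[(2, prev)] = 1.0
        table[(3, prev)] = 0.0   # "zero": halt all input
        table[(4, prev)] = 0.0
        table[(5, prev)] = 0.0
    # The three exceptional entries of Table 2.1.
    table[(2, 0)] = 0.54
    table[(2, 1)] = 0.87
    table[(3, 5)] = 0.32
    return table


def regulate(lam, q_now, q_prev, table):
    """Return the regulated input rate alpha * lam for the observed samples."""
    return table[(q_now, q_prev)] * lam


table = build_table()
# Hypothetical current input rate of 10 jobs per unit time.
new_rate = regulate(10.0, q_now=2, q_prev=0, table=table)
```

In this example the observed pair (2, 0) selects the factor .54, so the input rate is reduced to a little over half of its former value.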
 
We can also see that for servicing an entire network of
multi-processing computer centers the size of the table (i.e. the
amount of memory storage in the load regulator) is not at all excessive.
With the addition of the zero capacity possibility, our table has
$(c + 1)^2$ entries. With the relative symmetry of one and zero around
the middle of the table, it is not unreasonable to expect that even
this figure could be reduced. We may reduce it in the following manner
using the data from Table 2.1.

We will let an entry of the form A,B, where A and B are real
numbers, define a range from the first q(K-1) value to the last for the
same q(K) that require the same $\alpha$ change factor. For example, in Table
2.1 for q(K) equal to zero we have for q(K-1) from zero to five an $\alpha$
change factor of one (no change). Then for q(K) equal to zero the table
entry of q(K-1) will be 0,5. The entire Table 2.1 could then be
decreased in size to
 
q(K)     q(K-1)     $\alpha$ Change Factor

0        0,5        one

1        0,5        one

2        0,0        .54
         1,1        .87
         2,5        one

3        0,4        zero
         5,5        .32

4        0,5        zero

5        0,5        zero

Table 2.2
 
A total of nine entries now define the entire Table 2.1, a seventy-five
percent reduction. A simple hashing algorithm can now be applied to find
the correct q(K-1) range. While we do not expect this kind of reduction
in all tables that would be generated, we would expect some. In fact,
all we can say is that the number of entries in the new table is bounded
by $(c + 1)$ and $(c + 1)^2$. Our intuition, however, tells us that the
number will be far less than $(c + 1)^2$. With these ideas in mind, we
will now move on to discuss what happens to the jobs that are refused
at some center by our load regulation scheme.
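The reduction from Table 2.1 to Table 2.2 can be sketched as a run-length compression over q(K-1). The Python below is an illustrative sketch only: the function names are hypothetical, the full table is rebuilt from the values of Table 2.1, and the lookup uses a plain linear scan over the compressed entries rather than the hashing algorithm mentioned in the text.

```python
def build_table():
    """Table 2.1 as a dictionary keyed by (q(K), q(K-1))."""
    table = {(q, prev): (1.0 if q <= 2 else 0.0)
             for q in range(6) for prev in range(6)}
    table[(2, 0)] = 0.54
    table[(2, 1)] = 0.87
    table[(3, 5)] = 0.32
    return table


def compress(table, c):
    """Collapse runs of equal factors into (q(K), first, last, alpha) entries."""
    entries = []
    for q_now in range(c + 1):
        start = 0
        for prev in range(1, c + 2):
            # Close the current run at the end of the row or when alpha changes.
            if prev == c + 1 or table[(q_now, prev)] != table[(q_now, start)]:
                entries.append((q_now, start, prev - 1, table[(q_now, start)]))
                start = prev
    return entries


def lookup(entries, q_now, q_prev):
    """Find alpha for the observed pair by scanning the range entries."""
    for row_q, first, last, alpha in entries:
        if row_q == q_now and first <= q_prev <= last:
            return alpha
    raise KeyError((q_now, q_prev))


entries = compress(build_table(), c=5)
```

For the data of Table 2.1 this produces the nine range entries of Table 2.2, and every lookup through the compressed form returns the same factor as the full table.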
 
3. THE DISPATCHER
 
 3.1. An Introduction to the Problem 
 
In speaking of our load regulation scheme, we touched lightly
on some of the problems inherent in a working network computer. We also
paid lip service to the fact that there are some problems which must be
solved by management and not by our scheme or any other (i.e. what happens
to jobs when the entire system is full and no center can accept them). One
problem which we chose to lay aside before, but which now requires our
attention, is the rerouting of jobs from a full or an overloaded center
to one less busy. We will handle this by incorporating into our load
regulation table other decision making material adequate to handle
this redistribution of the work load efficiently. We will then call our
scheme the load-regulator-dispatcher.
 
We note at this point that we could have made things considerably
easier on ourselves in the beginning. If all jobs were initially routed
to and dispatched from the load regulator instead of a common queue at
each center, our problem would already be solved. The load regulator
would know at each discrete sampling time the sizes of all the queues
in the network and would route new jobs to centers it knew could handle
them. But this would have been a costly if not an unrealistic approach;
the time and the money wasted to send a job, possibly hundreds of miles,
before it is even started negate all the advantages of this idea. It
would also restrict the type of jobs we would accept, as the cost of
transmitting and running and re-transmitting a short, fast job may
outweigh the reward we would receive from it. We will, therefore, use the
load-regulator-dispatcher to reroute only the jobs whose entry was
refused at one of the centers. We begin by discussing the communication
links that hold our network together.
 
 3.2. The Communication Links
 
 In constructing our network (Figure 2.1.), we omitted a
 description of the communication links connecting the centers in our
 network. We did this because the question of inter-center communications 
 was not relevant to our discussion of load regulation. It sufficed to 
 say, at the time, that jobs refused entry at one center would have to be 
 rerouted to another center. Since it is our intention in this chapter 
 to advocate a workable dispatcher to handle this rerouting problem, we 
 next consider the question of communication channels between our centers. 
 For purposes of clarification and completeness, we will talk about 
 three possible communication configurations, the last of which we choose 
 for our network. 
 
 The simplest way of forming a communications network is to
 provide each center with a communications line connecting it with every
 other center. Figure 3.1. shows our system if this approach is used.
 Assuming that each communications link is bidirectional (i.e. a link
 from center 1 to center 3 implies a link from center 3 to center 1),
 a network with n centers has n(n-1)/2 links. This configuration is
 optimal from a communications standpoint as it allows one center to
 communicate directly with all others but, unfortunately, it is only
 practical for a network with a very small number of centers. The cost of
 the many lines when n is large is prohibitive and forces us to seek a
 less optimal but more economical solution.

 Figure 3.1.
 
 Our obvious line of attack, therefore, is to eliminate as many
 connections as possible while still maintaining efficient communications
 in the network. The minimum number of communication links that will
 allow our network to function (i.e. one center has the capability
 to communicate with all others, though not necessarily by direct means)
 is achieved by one bidirectional line between geographically adjacent
 centers. This situation is portrayed in Figure 3.2.

 Figure 3.2.
 
 If a job is refused in a network that is connected in this
 manner, it cannot always be directly transmitted to the center that has
 accepted it, if any. The ith center has direct links with only the
 (i-1)st and the (i+1)st centers. If a job is to travel from center i to
 a center j that is not connected to center i by a bidirectional
 communication line, it must travel through the intermediary center or
 centers. When transmitting a job, our wish is to minimize the number of
 centers that we have to pass through to reach a particular accepting
 center. We, therefore, define the following.
 
 If we let negative traffic flow be the transmission of a
 job clockwise in Figure 3.2. and positive traffic flow be the transmission
 of a job counterclockwise in Figure 3.2., and state that a job to be
 rerouted from a refusing center i to an accepting center j may travel
 in either the positive or negative direction so as to minimize the
 number of transmissions, then center j can be reached in not more than
 n/2 transmissions according to the following rules (see Table 3.1.)*.

 For:                                        Direction of Job Travel

 $j > i$ and $(j-i)/n \le 1/2$               Positive

 $j > i$ and $(j-i)/n > 1/2$                 Negative

 $j < i$ and $-1/2 \le (j-i)/n < 0$          Negative

 $j < i$ and $-1 < (j-i)/n < -1/2$           Positive

 TABLE 3.1.

 This scheme applies itself well and efficiently (only n
 transmission lines) when n is a small number. When n gets larger,
 the cost to transmit a job the maximum number of times (n/2) may be
 more than the worth or the reward of the job itself.

 *Centers are numbered arbitrarily but consecutively 1,2,...,n in a
 ring formation with the nth node connected to the first node. If the
 centers are numbered in some other manner, these rules do not hold.
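The direction rules of Table 3.1 can be sketched in a few lines of Python. This is only an illustration, not part of the original scheme: the function name `ring_route` is hypothetical, and the sketch assumes that "positive" flow means travel toward higher-numbered centers (wrapping from n back to 1), which the text does not state explicitly.

```python
def ring_route(i: int, j: int, n: int) -> tuple[str, int]:
    """Pick the travel direction and hop count from center i to center j.

    Centers are numbered 1..n around the ring. Assumption: "positive" means
    moving toward higher-numbered centers (wrapping from n back to 1).
    """
    if not (1 <= i <= n and 1 <= j <= n):
        raise ValueError("center numbers must lie in 1..n")
    if i == j:
        raise ValueError("refusing and accepting center must differ")
    gap = abs(j - i)
    # 2 * gap <= n is the integer form of |j - i| / n <= 1/2.
    short_way = 2 * gap <= n
    if j > i:
        direction = "positive" if short_way else "negative"
    else:
        direction = "negative" if short_way else "positive"
    hops = gap if short_way else n - gap
    return direction, hops


print(ring_route(1, 3, 8))   # ('positive', 2)
print(ring_route(1, 7, 8))   # ('negative', 2)
```

Using the integer comparison `2 * gap <= n` avoids floating-point division, and the hop count never exceeds n/2, in line with the bound stated above.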
 

 
 We also encounter a question of reliability in this configuration.
 We want our network to be such that we can depend on it. If one node
 connector fails in this scheme, it may drastically hinder the running
 of the entire network. Since the sending of refused jobs depends so
 critically on finding the minimum transmission path, we cannot tolerate
 a breakdown in the communications between any two centers. To
 illustrate this point, picture the situation where the node between the
 refuser and acceptor fails. Instead of two transmissions using the
 fallen node, we now must use n-2 transmissions to reach the acceptor,
 at a much greater cost. And if two centers failed at the same time,
 unless they were adjacent, it would mean a complete isolation of at least
 one center from the rest. Therefore, we will turn our attention to a
 scheme that uses more transmission lines to effect faster and more reliable
 communications between the geographically distant centers. The scheme
 we deem feasible and propose to use is the connection, by one
 bidirectional line each, of a center with its two adjacent centers and,
 in turn, with their adjacent centers (see Figure 3.3.).
 
 Figure 3.3.

 As can easily be seen, we have added only n more lines, for
 a total in this configuration of 2n. This may not seem like an addition
 significant enough to increase our efficiency, but we have proposed this
 design for three reasons:

 (1) Reaching the jth center farthest from center i can
 be achieved twice as fast.

 (2) We are talking about computer centers with multi-processor
 capacities, not about inexpensive equipment. Each center is a large
 financial investment and, therefore, the number of centers is bounded
 by the availability of funds. We feel a network of sixteen centers
 to be sufficiently large for any purposes. Our scheme will work well
 with this number of centers.

 (3) Should one center fail, there is still a communication
 line maintaining a possible minimum path in the direction from the
 sender, past the failed node, to the receiver.
 
 In the light of this scheme, we find that the number of
 transmissions from center i to center j is at most $\lceil n/4 \rceil$*
 if n is not exactly divisible by 4 and at most n/4 otherwise. It may be
 noted that the rules governing the positive or negative flow of a job to
 minimize the number of transmissions are the same as in Table 3.1. We
 feel that this transmission linkage is adequate to handle the
 inter-center communications we desire. We will next discuss some general
 considerations and then describe an algorithm to be used for the
 rerouting.
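The link counts and transmission bounds of the three configurations can be checked with a small breadth-first search. The sketch below is an illustration only: the helper names (`circulant`, `link_count`, `max_hops`) are hypothetical, centers are numbered from 0 for convenience, and the network of Figure 3.3 is assumed to link each center to the centers one and two positions away on either side.

```python
from collections import deque
from math import ceil


def circulant(n, offsets):
    """Adjacency sets for n centers (0..n-1), each linked to v+d and v-d."""
    return {
        v: {(v + s * d) % n for d in offsets for s in (1, -1)} - {v}
        for v in range(n)
    }


def link_count(adj):
    """Number of bidirectional links (each link appears in two adjacency sets)."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2


def max_hops(adj, failed=frozenset()):
    """Largest shortest-path hop count between working centers.

    Returns None if some working center cannot reach another one.
    """
    working = [v for v in adj if v not in failed]
    worst = 0
    for src in working:
        dist = {src: 0}
        frontier = deque([src])
        while frontier:  # breadth-first search from src
            u = frontier.popleft()
            for v in adj[u]:
                if v not in failed and v not in dist:
                    dist[v] = dist[u] + 1
                    frontier.append(v)
        if len(dist) != len(working):
            return None
        worst = max(worst, max(dist.values()))
    return worst


n = 16
full_mesh = circulant(n, range(1, n))   # Figure 3.1: every pair linked
ring = circulant(n, [1])                # Figure 3.2: adjacent centers only
chordal = circulant(n, [1, 2])          # Figure 3.3: adjacent and next-adjacent

for name, net in [("full mesh", full_mesh), ("ring", ring), ("chordal", chordal)]:
    print(name, link_count(net), max_hops(net), max_hops(net, failed={0}))
print("bound for Figure 3.3:", ceil(n / 4))
```

For n = 16 the three layouts have 120, 16, and 32 links, and the longest route takes 1, 8, and 4 transmissions respectively. With one center out of service the ring's longest route grows to 14 (that is, n-2), while the Figure 3.3 layout still needs at most 4.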
 
 3.3. General Considerations for Rerouting 
 
 Before formally stating the rerouting algorithm, we shall
 consider the reasons why the job was refused entry to a center in the
 first place. We will also discuss under what conditions a center will
 accept work that another has turned down. It may happen that a job is
 refused at one center and no other center can accept it at the time; is
 the job then lost to the system entirely or should it be resubmitted at a
 later time? We will now focus our attention on these and other questions
 of interest.
 
 *Read: the ceiling of n/4, meaning the next integer higher than n/4.
 
 
 From Chapter 1. we see that the load regulation scheme used 
 refuses entry to the incoming job solely on the basis of the queue size 
 probabilities developed there. We feel, at this point, that a more 
 thorough discussion of the idea of critical queue size is in order. It 
 is implied, though not intended, that the decision to accept or reject 
 work rests strictly on the number of jobs that are held in the queue, 
 and when the queue reaches the critical number, the load regulator would 
 inhibit further inputs to the center. While the queue size is one of 
 the factors in determining whether or not to accept work, it is not the only 
 one. Another factor in the decision is the expected execution time of 
 the jobs waiting in the queues. If this expected execution time is large, 
 then incoming jobs joining this queue may have to wait for a long time. 
 It may be reasonable under these circumstances to assume that the center 
 has reached its saturation point and to inhibit any further input to it. 
 In this light, we would inhibit input even though the number of jobs is 
 less than the maximum queue size. 
 
 We, therefore, observe that the decision to accept work is a 
 function of the execution time as well as the number in the queue. We 
 must remember, however, that the critical queue size in our equations is 
 dependent only on the number of jobs. We must, therefore, express the 
 queue size in terms of the number of equivalent average jobs, rather 
 than just the number of jobs actually in the queue so that our load 
 regulator will still work effectively. We do this in Appendix A. In 
 summary, then, we have the equivalent number of average jobs, ENJ, 
 given by 
 

 $$ENJ = \frac{TET_i}{CRIT_i} \cdot CAP_i$$

 where

 $CAP_i$ = the maximum queue capacity of the ith center

 $TET_i$ = the total execution time for the ith center

 $CRIT_i$ = the critical execution time of the ith center
 
 Another reason for refusal of a job is that the initial center
 is just not equipped to handle it. In this case, the job must be
 rerouted to a center capable of running it. For ease of understanding
 in the formation of the rerouting algorithm, we will assume that this
 does not happen: any job entering the network is capable of being run
 at the center where it is input initially or at any other in the system.
 Again we will leave it to management to decide what happens to
 jobs that cannot be run at some center.
 
 With the discussion of the refusing center completed, we can
 talk about the center, if any, that will receive the job. It would be
 expedient, at this point, to say that the circumstances governing a center
 accepting a job from another center are exactly opposite to the reasons
 the first center refused it (i.e. it has room in the queue and plenty of
 time in which to run it). But it would also be incomplete. The biggest
 factor in the acceptance of a job another center refused is the cost
 involved. In particular, the receiver must be able to ascertain whether
 the network will still profit despite the cost of transmitting the job
 and retransmitting the results between the refuser and the acceptor. If
 one center refuses a job and no other center can accept it, one of two
 things can happen:

 (1) the refusing center can try to make room for the job
 by sending one or more jobs to another center (at a profit, of course),

 (2) or the job is lost to the network and must be forgotten
 or re-submitted at a later time.

 Money, here as in most areas of any interest, is the governing
 force. With these considerations in mind we will allow the user to place
 a time estimate on his job, and to assign to it a priority number between
 1 and some x (the higher the number the higher the priority), used to
 position his job in the queue. We assume here that an incoming job
 with some priority p, which causes an overload and forces the load
 regulator to inhibit further inputs, will not displace jobs of priority
 less than p unless it cannot be run elsewhere. Now, we will finally
 consider the algorithm.
 
 3.4. The Algorithm

 Let

 $CAP_i$, i = 1,2,...,n = the queue capacity at the ith center

 $AT_i$, i = 1,2,...,n = the average execution time of jobs
 that enter the ith center

 $CRIT_i$, i = 1,2,...,n = the critical execution time of the ith
 center ($CAP_i \cdot AT_i$)

 $T_{\ell i}$, $\ell$ = 1,2,...,k, i = 1,2,...,n = the expected execution
 times of the k entries in the ith center's queue

 $TET_i$ = the total expected execution time for the ith center

 $m_i$ = the present size of the ith queue

 $REW_{\ell i}$ = the reward for doing the $\ell$th job at the ith
 center (here reward is defined to be the profit
 for running this job. If a job is transmitted
 to another center, the reward is decreased by
 the cost of transmitting the job and getting
 back the results)

 $\ell$ = the position in the queue of the $\ell$th job, where
 the priority of the $\ell$th job is greater than or
 equal to the priority of the $(\ell+1)$th job and
 jobs are run in order 1,2,...,$\ell$,$\ell$+1,...,k

 $w_i = \left(\sum_{\ell=1}^{m_i} REW_{\ell i}\right)/m_i$ = weight in
 importance of the ith center, where $\sum_{\ell=1}^{m_i} REW_{\ell i}$
 is in the range $(m_i \cdot 1,\; m_i \cdot x)$ and $w_i$ is in the
 range (1, x)

 (Remember that the user specifies the worth of his job by giving it a
 number between 1 and x. We can, therefore, with the $w_i$ get an estimate
 of the kind of work a center is doing; a center with $w_i = 2$ would not
 seem to be doing useful work, while a center with $w_i \gg x/2$ would be
 doing very useful work. We will assume that x is the same for all
 centers. Heuristically it seems that $w_i = x/2$ for all i, i = 1,2,...,n,
 would be best for the network, since a center with $w_i \gg x/2$ is doing
 the more important work and is a bigger risk to the network if it should
 fail. If jobs are distributed equally in importance, then the risk is
 about the same for all centers in the network. We mention this here but
 we will not attempt to level the work loads so that the $w_i$'s are
 relatively equal.)

 $ET_{job}$ = the estimated time to run the refused job

 $P_{job}$ = the priority of the job (between 1 and x)

 $PROF_{job}$ = the profit on the job

 $C_{ij}$ = the cost to transmit a job from the ith
 center to some center j (this cost is computed
 by using the shortest path from center i to
 center j according to the communication paths
 described in section 3.2.)

 $TIME_j$ = the total estimated time to process all jobs
 already in center j, plus the estimated time to
 run the refused job ($TET_j + ET_{job}$)

 then the algorithm proceeds as follows for a job $\ell$ that must be
 rerouted from center i to center j:
 
 STEP I:

 Obtain the $w_i$, i = 1,2,...,n, from the table.

 STEP II:

 Choose the center j not yet considered with the smallest
 remaining $w_j$ and compute

 $$TIME_j = TET_j + ET_{job}$$

 If no more $w_j$ remain, GO TO STEP V.

 STEP III:

 If $TIME_j > CRIT_j$, RETURN TO STEP II; otherwise compute

 $$PROF_{job} = REW_{\ell i} - C_{ij}$$

 STEP IV:

 If $PROF_{job} \le 0$, RETURN TO STEP II; otherwise insert the
 job into the queue of this jth center in the following way: given
 that the last job already in the queue that has a priority equal to
 $P_{job}$ is found in the qth position, insert the job in the (q+1)th
 position and displace the lower priority jobs, if any, by
 $ET_{job}/AT_j + 1$* positions in the queue; then STOP.

 STEP V:

 If the job has a higher priority than some jobs in the queue,
 insert the job into the ith center as in STEP IV and then GO TO STEP II
 to re-route a job or jobs that had to be displaced.

 Generally, this algorithm tends to choose the center that is doing less
 useful work than the others, in the hope of increasing the $w_j$ for this
 center and making it more useful to the network.

 *Performed in integer arithmetic because we still want our queue size
 in terms of time if necessary.
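Steps I through IV of the algorithm can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's implementation: the class and function names (`Job`, `Center`, `insert_by_priority`, `dispatch`) are hypothetical, an empty queue is given weight 0.0 (the text does not define this case), the displacement bookkeeping of Step IV is reduced to a priority-ordered insert, and Step V is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    et: float       # ET_job: estimated execution time
    priority: int   # P_job: 1..x, higher runs first
    reward: float   # REW: worth of the job before transmission cost


@dataclass
class Center:
    cap: int                                    # CAP_j
    at: float                                   # AT_j
    queue: list = field(default_factory=list)   # jobs in non-increasing priority

    @property
    def crit(self):
        return self.cap * self.at               # CRIT_j = CAP_j * AT_j

    @property
    def tet(self):
        # TET_j: queued work plus AT_j for the job already in process
        return sum(job.et for job in self.queue) + self.at

    @property
    def weight(self):
        # w_j: mean reward of queued jobs; 0.0 for an empty queue (assumption)
        if not self.queue:
            return 0.0
        return sum(job.reward for job in self.queue) / len(self.queue)


def insert_by_priority(queue, job):
    """Place job behind every queued job of equal or higher priority."""
    position = sum(1 for queued in queue if queued.priority >= job.priority)
    queue.insert(position, job)


def dispatch(job, origin, centers, cost):
    """Steps I-IV: return the index of the accepting center, or None."""
    others = [j for j in range(len(centers)) if j != origin]
    for j in sorted(others, key=lambda j: centers[j].weight):   # Steps I-II
        center = centers[j]
        if center.tet + job.et > center.crit:                   # Step III
            continue
        if job.reward - cost[origin][j] <= 0:                   # Step IV
            continue
        insert_by_priority(center.queue, job)
        return j
    return None   # Step V (displacement at the origin) is left to the caller


centers = [
    Center(cap=2, at=10, queue=[Job(10, 4, 4), Job(10, 4, 4)]),   # refusing center
    Center(cap=10, at=10, queue=[Job(20, 5, 5)]),
    Center(cap=10, at=10, queue=[Job(20, 1, 1)]),
]
cost = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # hypothetical C_ij values
accepted_at = dispatch(Job(et=30, priority=3, reward=3), 0, centers, cost)
print(accepted_at)   # 2: the candidate with the smallest weight accepts
```

Sorting the candidates by weight once up front is equivalent to repeatedly choosing the smallest remaining $w_j$, as long as the weights do not change while a single job is being placed.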
 
 3.5. The Load-Regulator-Dispatcher
 
 We have thus far discussed the communications scheme for
 our network, some of the reasons why a job might be refused and need
 these communication paths, and finally the algorithm by which we
 accomplish the rerouting. In order to achieve our goal of a workable
 load-regulator-dispatcher, we need to finish construction of the decision
 table. We built the load regulator part of the table in Chapter 1 with
 $(c+1)^2$ entries; we need now to add the dispatcher part to the table.

 The following information is necessary for the algorithm and can
 be divided into two categories: information stored in the
 load-regulator-dispatcher permanently, and information local to each
 center that comes to the load-regulator-dispatcher as parameters of the
 refused job:
 
 LOCAL INFORMATION

 A. (1) $ET_{job}$ (2) $P_{job}$ (3) $REW_{\ell i}$

 These are job parameters carried by each refused job.

 B. (1) $CAP_i$ (2) $AT_i$ (3) $T_{\ell i}$

 These are local statistics used for updating the information stored in
 the load-regulator-dispatcher.

 LOAD-REGULATOR-DISPATCHER INFORMATION

 (1) $TET_i$ (2) $CRIT_i$ (3) $w_i$ (4) $C_{ij}$ (5) $m_i$

 This information is used by the algorithm and updated as the
 characteristics of a center change. Our load-regulator-dispatcher table
 then takes on the following form (see Table 3.2.).
 
 The number of entries in the table is now $2(c+1)^2 + 4c$, a
 number which is well within the realm of feasibility, especially when
 all centers, with the use of a normalizing factor, can use the same table.
 We also expect that, with the same analysis as in Chapter 1, we can reduce
 this number. Our load-regulation table, with the addition of this
 information, becomes a load-regulation-dispatching table, the generation
 of which was our goal.
 
 TABLE 3.2. Load-regulator-dispatcher table.
 
 4. CONCLUSION
 
 In the preceding chapters, we discussed briefly the ideas
 of load regulation and dispatching in a network of computers. We
 developed analytical tools which we used to form a decision table in
 memory, a decision table that adequately handles the inhibition or
 reduction of input to any center in our network. We then extended our
 load-regulator into a load-regulator-dispatcher. We, therefore, have
 the tool for control of our network that we sought to find. The only
 limitation to this scheme would be one of memory space, and how much
 of this precious resource management is willing to set aside for this
 purpose. We have shown that the memory needs of the
 load-regulator-dispatcher are not excessive in terms of the job that it
 does for us.
 
 It could be noted, at this point, that the load regulation part
 of our table could be used in a single center for control of jobs on
 the different processors, and future research may look at this. The
 algorithms that do the dispatching in our network could also be
 refined to a degree that management would have a hard time finding fault
 with these ideas for control. We have, in this paper, then, razed the
 wall that stood in the way of using this type of controller. It now
 remains only to sidestep or climb over the rubble and debris left behind
 to reach a usable load-regulator-dispatcher.
 
 
 APPENDIX 
 
 We wish to express the queue size in terms of the number of
 equivalent average jobs in the queue so that our load regulator will
 still work efficiently. If we let

 $m_i$ = the actual number of jobs in the ith center's queue

 $CAP_i$ = the capacity of the queue of the ith center

 $AT_i$ = the average processing time of jobs that are run
 at the ith center (a figure gathered over a
 representative period of time)

 $T_{\ell i}$, $\ell$ = 1,2,...,$m_i$ = the expected execution times of the
 $m_i$ entries in the ith queue

 $TET_i$ = the total execution time for the ith center

 $CRIT_i$ = the critical execution time of the ith center

 ENJ = the equivalent number of average jobs in the queue

 then

 $$CRIT_i = CAP_i \cdot AT_i \quad \text{(a reasonable upper bound)}$$

 and

 $$TET_i = \sum_{\ell=1}^{m_i} T_{\ell i} + AT_i \tag{A-1}$$

 where we add $AT_i$ to represent the execution time of jobs already in
 process at the ith center. Finally, the equivalent queue size, that is,
 the number of equivalent average jobs (ENJ), is given by

 $$ENJ = \frac{TET_i}{CRIT_i} \cdot CAP_i \tag{A-2}$$
 
 We can also see at this point that if

 $$TET_i / CRIT_i > 1$$

 we should inhibit further inputs to the ith center.
 
 EXAMPLE

 Let

 $CAP_i$ = 100

 $AT_i$ = 50 units of time

 then $CRIT_i = CAP_i \cdot AT_i$ = 5,000 units of time; given that the
 following m jobs (m = 20) are in the queue

 Job No.    Units of Expected Execution Time, $T_{\ell i}$

  1         150
  2          25
  3          40
  4          90
  5         110
  6          30
  7          10
  8          75
  9          50
 10         175
 11          45
 12          55
 13          20
 14          90
 15         200
 16          70
 17          60
 18         100
 19         220
 20          80
 
 then applying (A-1) we find

 $$TET_i = 1695 + 50 = 1745$$

 and from (A-2) we get

 $$ENJ = \frac{1745}{5000} \cdot 100 = 34.9 \approx 35$$

 This implies that, when our queue size is polled at a discrete sampling
 time KS, for some K, the equivalent queue size sent back to the load
 regulator should be representative of the number of average jobs (35)
 and not the actual figure (20).
 We should also note that if

 $$AT_i \cdot m_i \ge TET_i$$

 (i.e. the expected total execution time for the $m_i$ jobs is less than
 or equal to $m_i$ times the average), then the equivalent queue size sent
 back to the load regulator should be $m_i$. Therefore, we stipulate that
 the lowest number that can represent the equivalent queue size is the
 actual size of the queue in terms of the number of jobs
 (i.e. $CAP_i \ge ENJ \ge m_i$). We do this because it is too hard to
 change the hardware configuration of the queue, which specifies that each
 job should occupy a specific physical space (e.g. four words). It would,
 therefore, be unwise to attempt to cram four or five short jobs into the
 physical space generally occupied by one or two.
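The appendix calculation can be reproduced with a short sketch. The function names below are hypothetical, the job times are those of the example table, and rounding ENJ to the nearest integer is an assumption made here so that 34.9 is reported as 35.

```python
def equivalent_jobs(times, cap, at):
    """Equivalent number of average jobs (ENJ) reported for one center's queue."""
    crit = cap * at                          # CRIT_i = CAP_i * AT_i
    tet = sum(times) + at                    # (A-1)
    enj = round(tet / crit * cap)            # (A-2), rounded (assumption)
    return max(len(times), min(cap, enj))    # keep CAP_i >= ENJ >= m_i


def should_inhibit(times, cap, at):
    """True when TET_i / CRIT_i > 1, i.e. the center is saturated."""
    return sum(times) + at > cap * at


TIMES = [150, 25, 40, 90, 110, 30, 10, 75, 50, 175,
         45, 55, 20, 90, 200, 70, 60, 100, 220, 80]

print(equivalent_jobs(TIMES, cap=100, at=50))   # 35, as in the example
print(should_inhibit(TIMES, cap=100, at=50))    # False
```

Because $CRIT_i = CAP_i \cdot AT_i$, formula (A-2) reduces to $TET_i / AT_i$, so the clamp to $m_i$ only takes effect when the queued jobs are shorter than average.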
 
 