Report No. UIUCDCS-R-75-722

COMPARATIVE RESPONSE TIMES OF TIME-SHARING SYSTEMS ON THE ARPA NETWORK

by

Sandra Ann Mamrak

May 1975

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported in part by the Computing Services Office at the University of Illinois at Urbana-Champaign and in part by the Advanced Research Projects Agency under contract DAHC04-72-C-0001, and was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, May 1975.

ACKNOWLEDGMENT

The support, guidance, advice and criticisms of Professor Edward K. Bowdon, Sr. were essential to the success of this research project. His contribution to all substantive aspects of the thesis is greatly appreciated. Ms. Gayanne Carpenter's friendly and timely advice about departmental policies and various deadlines was also an invaluable help toward project completion and is likewise greatly appreciated.

COMPARATIVE RESPONSE TIMES OF TIME-SHARING SYSTEMS ON THE ARPA NETWORK

Sandra Ann Mamrak, Ph.D.
Department of Computer Science
University of Illinois at Urbana-Champaign, 1975

If, indeed, the ultimate aim of a computing network is resource sharing, then the human component as well as the technical component of networking must be fully investigated to achieve this goal. This research is a first step toward assisting the user in participating in the vast store of resources available on a network.
Analytical, simulation and statistical performance evaluation tools are employed to investigate the feasibility of a dynamic response time monitor that is capable of providing comparative response time information for users wishing to process various computing applications at some network computing node. In particular, the following areas are investigated:

1. The measurement and statistical analysis of response times of individual time-sharing systems on a computing network.
2. The comparison of response times of these same time-sharing systems as they process a set of benchmark jobs.
3. The development of a single analytical and a single simulation model able to explain and predict response times for all time-sharing systems under investigation.
4. The effect of heavy network traffic on the comparative response times of the individual time-sharing systems.

The research clearly reveals that sufficient system data is currently obtainable, at least for the five diverse ARPA network systems studied in detail, to describe and predict response time for network time-sharing systems as it depends on some measure of system busyness or load level.

TABLE OF CONTENTS

                                                              Page
ACKNOWLEDGMENT                                                 iii
1. INTRODUCTION                                                  1
   1.1. Computer Network Evaluation Trends                       1
   1.2. Computer Network Evaluation Deficiencies:
        A Problem Statement                                      4
   1.3. Time-Sharing System Evaluation                           6
2. COMPARATIVE RESPONSE TIMES ON THE ARPA NETWORK                9
   2.1. The ARPA Network                                        10
   2.2. System Variables                                        12
        2.2.1. The Computing Systems                            16
               2.2.1.1. TSS - IBM Time-Sharing System           16
               2.2.1.2. TENEX - PDP-10 Time-Sharing System      18
               2.2.1.3. TSO - IBM Time-Sharing Option           24
               2.2.1.4. MULTICS - MIT Time-Sharing System       27
               2.2.1.5. CANDE - University of California at
                        San Diego (UCSD) Time-Sharing System    30
        2.2.2. Benchmark Jobs                                   34
        2.2.3. Load Level                                       37
        2.2.4. Response Time                                    40
3. MEASURING TIME-SHARING SYSTEMS                               44
   3.1. Analysis of Individual System Data                      48
        3.1.1. AMES-TSS                                         48
        3.1.2. BBN-TENEX                                        54
        3.1.3. CCN-TSO                                          56
        3.1.4. MIT-MULTICS                                      60
        3.1.5. UCSD-CANDE                                       64
   3.2. Comparison of Computing Systems                         66
        3.2.1. Arithmetic Benchmark                             73
        3.2.2. Bit Manipulating Benchmark                       73
        3.2.3. I/O Bound Benchmark                              73
4. MODELING TIME-SHARING SYSTEMS                                79
   4.1. An Analytical Model for Time-Sharing Systems            79
   4.2. A Simulation Model for Time-Sharing Systems             93
   4.3. Analysis of Model Predictions                           99
        4.3.1. Individual System Results                       100
        4.3.2. Success of Model Generalization                 104
   4.4. Consideration of Network Queueing Delays               108
5. A DYNAMIC RESPONSE TIME MONITOR                             113
   5.1. Currently Feasible Monitor Features                    113
   5.2. Additional Desirable Monitor Features                  117
6. CONCLUSIONS                                                 119
   6.1. Implications for Future Network Development            120
   6.2. Suggested Further Research                             122
LIST OF REFERENCES                                             124
APPENDIX A                                                     127
APPENDIX B                                                     129
APPENDIX C                                                     131
VITA                                                           134

LIST OF TABLES

                                                              Page
2.1. Computing Systems Summary                                  13
2.2. Benchmark Jobs Run at Various Computing Centers            36
2.3. Load Level Definitions                                     39
2.4. Command Sequence for Systems' Measurement                  43
3.1. Systems' Saturation Level                                  46
3.2. Average Benchmark Processing Times                         47
3.3. Residual Mean Squares for AMES-TSS Curve Fits              53
4.1. Analytical Model Parameters                               102
4.2. Transmission Times for Illinois to Experimental Sites     109
4.3. Infinite Network Delays from U. of I. Node                112
5.1. Load Levels at AMES-TSS                                   114
C.1. Residual Mean Square (RMS) Statistics                     132
C.2. Individual System Best Curve Fit Data                     133

LIST OF FIGURES

                                                              Page
2.1. ARPA Network Configuration in Early 1974                   11
2.2. Generalized Time-sharing Scheduling                        15
2.3. The TENEX Scheduler                                        20
2.4. BBN-TENEX Scheduling                                       22
2.5. CCN-TSO Scheduling                                         27
2.6. MIT-MULTICS Scheduling                                     31
2.7. UCSD-CANDE Scheduling                                      33
3.1. Statistical Results - AMES-TSS                             49
3.2. Statistical Results - BBN-TENEX                            55
3.3. Statistical Results - CCN-TSO                              57
3.4. Statistical Results - MIT-MULTICS                          61
3.5. Statistical Results - UCSD-CANDE                           65
3.6. Arithmetic Benchmark Comparisons                           67
3.7. Bit String Benchmark Comparisons                           74
3.8. I/O Bound Benchmark Comparisons                            76
4.1. Comparison of Two Models                                   87
4.2. Simulation of MIT-MULTICS Time-Sharing Scheduler           94
4.3. Model Comparison - BBN-TENEX                              101
4.4. Model Comparison - CCN-TSO                                103
4.5. Model Comparison - MIT-MULTICS                            105
4.6. Generalized Simulation Model Results                      107

1. INTRODUCTION

Less than a decade ago, the time-sharing concept on single computer systems was one of the main objects of computer science inquiry. There existed a wide divergence of opinion on such issues as where the technology stood, key application possibilities, feasibility, future directions and economics. Today the resource sharing concept on networks of computer systems has moved into the spotlight and become the object of identical kinds of inquiries.

1.1. Computer Network Evaluation Trends

Although the computer network concept developed in an unrevolutionary manner, proceeding logically and in an orderly way from the development of highly sophisticated single processor systems, the performance evaluation techniques developed for single processor systems differ radically from those developed for geographically distributed multiple processor computer systems. Performance evaluation in single processor systems is characterized by a hodge-podge of performance goals and performance measurements. The most significant convergence of thought among single processor systems' analysts is agreement that what is required is a quantitative methodology on which to base analysis of real system data for model formulation and validation. Performance evaluation in networks, on the other hand, where it has been present, has been characterized by a careful development of analytic and simulation network models, generally supported by data analyzed using optimization and statistical techniques.
These evaluation techniques, as well as their specific applications in existing or proposed networks, are surveyed elsewhere [MAM74]. An examination of existing models and measurements in computer network systems reveals several trends. Analysis based on queueing theory has been anchored in a node-by-node approach, assuming independence of the various network nodes. This approach works very satisfactorily for a limited set of network phenomena. Simulation has been used successfully, but can become prohibitively expensive when detailed representations of the network system are required [KLE70, SAL73, WAR39]. Optimization techniques have been effectively transferred from network flow theory and are working well to yield specific design parameters [WHI72]. Actual system measurements, analyzed using statistical techniques and used to improve queueing and simulation models, have been relatively neglected [COL71]. (This neglect may be due in part to the unavailability of tools for making desired observations of dynamic systems and of statistically significant test environments.) Finally, although sophisticated performance evaluation tools are generally available, they have been applied almost solely to the ARPA (Advanced Research Projects Agency) network. Not the least important among the recent trends in computer network performance evaluation is research aimed at aiding the user in optimizing job routing and scheduling, and minimizing job cost. This trend has been spurred by a relatively stable network technology, coupled with an ever increasing number of general network users. From their embryonic days in the late 1960's until just recently, computer networks have been a subject of interest mainly to universities and research agencies.
As late as January of 1973, the ARPA network [ROB70] statistics were showing that even though the network was reliable and available, communication lines were used 3.5 percent on the average [MCQ73]. Also in 1973, the MERIT network [HER72] found itself in serious financial difficulties due to lack of interest by a sufficient number of users. However, over the last year, very substantial interest has been materializing in the wider university and research communities and in the commercial world as well.

The Distributed Computer System network concept developed by D. Farber [FAR72] at the University of California at Irvine has been a significant exception to the common mode of development of network systems, which provides interconnected computer resources but requires users to do their own unadvised job scheduling. The host sites on Farber's ring-structured network send bids for jobs back to the customers, thus providing them with some criteria by which to choose a particular host for job processing. The majority of operational networks, though, do not provide the user with formal information on comparative job costs or comparative job run times.

Marshall Abrams at the National Bureau of Standards should also be mentioned here as another unique contributor to user-oriented network performance evaluation research. He has developed a "stimulus-acknowledgment-response" model to describe the user-computer interaction and a data acquisition system called the Network Measurement Machine. He is using these tools to analyze network performance as perceived by a network user or the "consumer of computer services" [ABR74].

1.2. Computer Network Evaluation Deficiencies: A Problem Statement

There exists a need, then, for network performance evaluation efforts to be geared toward aiding the network user in the decision-making inherent in network interactions.
Network designers and managers have been the fortunate recipients of analytical, simulation and statistical tools useful in carrying out their network duties. These same tools of the network performance analysts must also be applied to answer questions of importance to the user. While cost-effectiveness is an important performance factor, response time is often the primary performance parameter of interest to users and, in particular, interactive or time-sharing system response time. Given a choice of different interactive computing systems with varying capabilities for handling particular types of computer applications, network users need to be advised of the comparative turnaround or response times of those systems.

More specifically, for a given network facility, let the system environment for a user at a particular time t be described by the set {i, j, k_i(t), T_i(s,j)}, where i is one of a set of n time-sharing computing systems accessible from the facility (presumably n is a constant over reasonably short periods of time), j is one of a set of m computing applications required by the user (presumably m is a constant over reasonably short periods of time), k_i(t) is the load level on the i-th computer system at time t (for convenience, k_i(t) is partitioned into ten equal-length intervals), and T_i(s,j) (called "response time") is the time required at load s, where s = k_i(t) at some time t, to complete the execution of a run command for the j-th application at the i-th facility. Within this system environment, answers to the following questions must be provided:

(1) For some particular system i, is it possible to describe and predict the behavior of T_i(s,j) as s varies with time? (Discussed in section 3.1.)

(2) At some time t, is it possible to meaningfully compare T_i(s,j) for a particular computing application j when run at the different time-sharing computing facilities? (Discussed in section 3.2.)
(3) Is there a single response time model (analytical, simulation or statistical) that will describe and predict T_i(s,j) for each i and each j with an acceptable level of accuracy? (Discussed in sections 4.1. - 4.3.)

(4) What is the effect of network traffic on T_i(s,j)? (Discussed in section 4.4.)

If the first three of these questions can be answered affirmatively, then it will be feasible to develop a dynamic response time monitor that users can query to gain up-to-the-minute, on-line, comparative response time data for a particular computing application to be run on one of a set of network time-sharing facilities.

1.3. Time-Sharing System Evaluation

The research required to answer the response time queries of the network user cuts across two distinct, but related, areas: comparison of independent computing systems and the investigation of response time parameters in time-shared systems. The work done to compare systems is sparse. One significant comparative study of computing machines has been published and one is presently in process. K. E. Knight has compared the performance capabilities of 318 general purpose computer systems in terms of computing power and cost [KNI66, KNI68]. His measurements spanned the evaluation of machines from 1944-1967 and distinguished machine capabilities in performing "scientific" computations from those in performing "commercial" computations. P. A. Alsberg from the University of Illinois at Urbana-Champaign has directed research aimed at producing comparative data for machine cost-effectiveness as it is measured across six interactive computing systems performing four different types of work: (1) numerical, (2) console, (3) input/output, and (4) bit/byte manipulation. All of the six computing systems either are on or will be added to the ARPA network. A third comparative study of computing systems was performed by P. E. Jackson and his associates [FUC70, JAC69]. This work will be discussed below.
Extensive measurements and performance evaluations of response time in time-sharing systems have been reported by several independent researchers. Kleinrock [KLE72] has produced a survey of these performance studies, with an emphasis on analytical results. Studies based mainly on system measurements rather than analytical models have also resulted in important contributions to the field. A. L. Scherr [SCH67], who analyzed a large set of measurements taken on the MIT Project MAC Compatible Time-Shared System (CTSS), concluded from his work that only mean think time, mean processor time and the number of users interacting with the system are of first-order effect in describing system behavior. R. A. Totschek's contribution [TOT65], resulting from his study of the SDC Q-32 system, was characterized by the classification of many of the empirical distributions associated with interactive usage as having density functions with long, slowly decreasing tails and standard deviations exceeding the mean value. Jackson and Stubbs [JAC69] studied a number of time-shared systems and determined average values for a variety of measurements relevant to interactive systems: think time, idle time, response time and so on. Later Jackson, along with Fuchs [FUC70], estimated the distribution of many of these random variables. This study reiterated Totschek's finding in that Jackson and Fuchs found that for all the continuous random variables, the gamma distribution was an excellent fit and that the shape parameter of the gamma distribution ranged between 1.0 and 1.8. At 1.0 the distribution becomes exponential, and even at 1.8 its tail is still definitely exponential.

The essential elements of the research methodologies associated with comparing computer systems and those associated with describing the behavior of time-shared systems can be abstracted from the work reported above.
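As an aside, the gamma-distribution finding of Fuchs and Jackson is easy to illustrate with a short sketch. The sample below is synthetic (drawn with a made-up shape of 1.4 purely for illustration), and the shape is recovered by the simple method-of-moments estimate shape = mean^2/variance:

```python
# Sketch: estimating the gamma shape parameter of a response-time sample.
# The data here are synthetic; a real study would use measured response times.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical interactive response times (seconds) with a long right tail.
samples = rng.gamma(shape=1.4, scale=2.0, size=10_000)

mean, var = samples.mean(), samples.var()
shape_est = mean**2 / var   # method-of-moments estimate of the gamma shape
scale_est = var / mean      # corresponding scale estimate

print(f"shape ~ {shape_est:.2f}, scale ~ {scale_est:.2f}")
# A shape near 1.0 would indicate an essentially exponential distribution;
# Fuchs and Jackson observed shapes between 1.0 and 1.8.
```

A shape estimate well above 1.8 or below 1.0 on real data would argue against the gamma characterization reported above.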
The comparative system studies are characterized by (1) running benchmark jobs with specified properties and (2) measuring well-defined quantities obtainable from all of the machines involved. The time-shared studies also have two essential characteristics: (1) the conception of all interactions with the time-shared system (compiles, edit commands, run commands, etc.) as being of equal significance and the measurement of them as such, and (2) the development of models to describe and predict system behavior.

The research methodology required to compare response times for different job applications on different machines is similar to the methodology already used in comparing systems, but somewhat different from the methodologies used to date in studying response time data. Since comparative results are required, running jobs with identical characteristics on each system and measuring well-defined quantities obtainable from each system is an appropriate and useful procedure. On the other hand, while former studies on time-shared systems considered all interactions to be of equal importance and measured and modeled under this assumption, we are concerned here only with job execution interactions. Furthermore, our concern is with run command response time measurements and models for specific computing applications.

2. COMPARATIVE RESPONSE TIMES ON THE ARPA NETWORK

The task of providing the network user with information to facilitate decisions concerning job routing must be accomplished within the framework of the present network technology. Theoretically, such aids may be as sophisticated as a "black box" environment in which users need merely indicate the type of job and special resources required, and jobs are automatically scheduled to run with minimum response time.
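In such an idealized "black box" scheme, routing reduces to choosing the host with the minimum predicted response time for the job's type. The following is only a minimal sketch of that idea; the host names and timing figures are entirely hypothetical, and a real aid would derive its predictions from live load data as developed later in this thesis:

```python
# Hypothetical predicted response times (seconds) per host and job type.
PREDICTED = {
    "hostA": {"cpu_bound": 6.0, "io_bound": 2.5},
    "hostB": {"cpu_bound": 4.5, "io_bound": 3.0},
}

def route(job_type: str) -> str:
    """Pick the host with the minimum predicted response time for this job type."""
    return min(PREDICTED, key=lambda host: PREDICTED[host][job_type])

print(route("cpu_bound"))  # hostB (4.5 s beats 6.0 s)
print(route("io_bound"))   # hostA (2.5 s beats 3.0 s)
```

The sketch makes plain why the choice of host depends on the job's characterization (I/O bound versus CPU bound) and not on a single system-wide ranking.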
Given the present configuration of even the most advanced networks, however, the scheme best able to be readily implemented would be one in which the network interface processors contained sufficient information to indicate current expected response times for all time-sharing systems on the network. Such a scheme would require that both the user and the computing hosts input relevant information with which the interface processor can make its predictions. Generally, users are able to indicate the expected execution time required for a job and characterize the job as basically I/O bound or CPU bound. Generally, time-sharing systems provide some measure of load, such as number of users. One of the major purposes of this research is to develop a process for the network interface processor (a dynamic response time monitor) that, given the user and system input described, can predict response time within some confidence interval. A combination of statistical, analytical and simulation tools will be used to produce this result.

The Advanced Research Projects Agency (ARPA) network has been chosen as the environment for the research project, and five different time-sharing systems accessible from the network will be investigated.* A brief description of the ARPA network and a definition of the system variables--the computing systems, the benchmark jobs each representing a particular computing application, response time and load level--are given below.

2.1. The ARPA Network

The ARPA network shown in Figure 2.1 is generally recognized as the pioneering effort in computer networking and resource sharing research. The initial objective of the network was to provide a system research environment in which the technical problems of networks could be explored by allowing persons and programs at one computing center to interactively access data and programs at other computer centers attached to the network.
A packet switching store-and-forward network** whose nodes consist of interface message processor computers (IMPs) was set up, interconnected by 50 kilobit/second synchronous communication lines. The host computers ranged from large-scale general purpose systems such as PDP-10s and IBM 360/370s to specialized computers such as the Illiac IV and the Massachusetts Institute of Technology (MIT) Multics System.

*Both experimental design and practical considerations influenced the decision to limit this investigation to just five systems. The systems themselves were diverse enough to represent a wide range of time-sharing scheduling philosophies, and research funds were available for this specific set of computing nodes.

**Definitions for such technical terms as "packet switching" and "store-and-forward network" are given in Appendix A along with a ready-reference list of frequently used abbreviations.

Figure 2.1. ARPA Network Configuration in Early 1974

The network is considered to be a technological and resource sharing success, as is evidenced by its operational accomplishments, which include:

- remote use of computers either from a terminal on a host or a terminal interface processor (TIP)
- file movement and printing
- communication of personal messages by way of "mailboxes"
- machine-to-machine subroutine communication
- access to large common data bases.

2.2. System Variables

Five interactive operating systems currently available on the ARPA network were chosen for comparison. The performance of these operating systems is essentially tied to the computing installation supporting and maintaining them. All references to interactive systems, therefore, will include the computing site as well as the name of the system itself. A summary of the five systems and their basic scheduling characteristics is presented in Table 2.1. A more detailed discussion of each of the systems follows. Throughout the discussion, reference is made to
the "working set" of a process in describing its paging behavior. This concept was first defined by Denning and is explained in an article written by him [DEN68].

Table 2.1. Computing Systems Summary

AMES67 - NASA Ames Research Center, Moffett Field, CA (TSS)
  Hardware configuration*: IBM 360/67; 1000K-2000K core, 5 disks, 3 drums, 10 tape drives, 3 printers, 2 card readers, 64 terminals
  Scheduler characteristics**: Table driven; frequency and duration of processor time-slices determined by paging behavior
  Memory management: Estimations of working set and working set size characteristics are the heart of the scheduler; "balanced core time principle" used to determine time-slicing
  Remarks: Distinguished by a scheduler that is primarily concerned with core demands rather than CPU demands

BBN - Bolt, Beranek and Newman, Inc., Cambridge, MA (TENEX)
  Hardware configuration*: PDP-10; 193K core, 9 disks, 1 drum, 4 tape drives, 1 printer, 64 terminals, 1 display processor, 1 plotter, 1 paper tape punch, 1 paper tape reader, 1 teletype scanner
  Scheduler characteristics**: Five priority queues with SXFS processing among queues, LXFS processing within queues, RR processing in the last queue
  Memory management: Balance set control module in scheduler regulates running processes so as to minimize the probability of an idle CPU due to too frequent page faults
  Remarks: Most sophisticated of the schedulers; embodies all three scheduling disciplines of SXFS, LXFS, and RR

CCN - Campus Computing Network, Los Angeles, CA (TSO)
  Hardware configuration*: IBM 360/91; 4000K core, 5 disks, 1 drum, 8 tape drives, 4 printers, 85 terminals
  Scheduler characteristics**: Series of priority queues, each with lower dispatching priority and effectively a longer time-slice than the former; each queue served FIFO
  Memory management: Fixed (virtual) region size allotted to each virtual machine; the single process currently on a virtual machine has access to the entire region
  Remarks: Distinguished by binding processes to one of a fixed number of virtual machines within which no multiprogramming occurs

MIT - Massachusetts Institute of Technology, Cambridge, MA (MULTICS)
  Hardware configuration*: Honeywell 645; 384K core, 11 disks, 1 drum, 10 tape drives, 2 printers, 1 I/O controller, 1 card reader, 1 card punch, "several hundred" terminals
  Scheduler characteristics**: Series of priority queues, each with lower dispatching priority and a longer fixed time-slice than the former; each queue served FIFO
  Memory management: A list of "eligible" processes is maintained, consisting of those processes which have the highest dispatching priority and can simultaneously exist in core
  Remarks: Concept of the set of "eligibles" ensures efficient resource utilization in a multiprogramming environment

UCSD - University of California at San Diego, CA (CANDE)
  Hardware configuration*: Burroughs 6700; 240K core, 19 disks, 8 tape drives, 3 printers, 1 remote job entry terminal, 1 card punch, 1 card reader, 512 terminals
  Scheduler characteristics**: Basically two priority queues, with a high priority queue of burst-oriented processes and a low priority queue of compute-bound processes; both queues served FIFO
  Memory management: Multiprogramming paged system in which each core-resident process can expand its core holdings up to the maximum size of its currently assigned "subspace"
  Remarks: Simplest of the time-sharing scheduling philosophies; like TSS, time-slices are associated with a process rather than a queue

*Detailed hardware descriptions are available in [ANR73a]. This information is accurate as of August, 1973.
**FIFO - first arrival, first service; RR - round robin; SXFS - shortest execution, first service; LXFS - longest execution, first service.

Four of the five interactive systems (all but AMES-TSS) dispatch jobs to the processor using a scheduling algorithm whose major components are a series of priority queues and associated CPU time-slices. As jobs enter the system, they are assigned to the highest priority queue. This queue has a relatively short time-slice associated with it.
If the job uses its entire time-slice in its first pass through the system, it is relegated to the second priority queue, which has a slightly longer time-slice associated with it, and so on. Queues are served from highest priority to lowest priority. Disciplines within queues vary among FIFO, RR, SXFS, and LXFS as explained in Table 2.1. A generalized version of these scheduling algorithms is presented in Figure 2.2. This representation will be made specific for each system (except AMES-TSS) as it is described in detail.

Figure 2.2. Generalized Time-sharing Scheduling
[Figure: N priority levels, each a queue with its own discipline and time-slice t_1, t_2, ..., t_N. Arrivals enter level 1 (the highest priority); jobs exhausting a time-slice drop to the next level; level N is the lowest priority; departures may occur from any level.]

2.2.1. The Computing Systems

2.2.1.1. TSS - IBM Time-Sharing System [DOH70]

The TSS/360 interactive system has a table-driven scheduler consisting of a set of programs in the resident supervisor used for scheduling, and a table with many rows (levels) of entries. The scheduling philosophy is based on the premise that processes making light demands on the CPU and core resources should receive fast service and those making heavier demands on these resources should receive relatively slower service. The implementation of this philosophy is concentrated almost entirely in the constant monitoring of a process' paging requirements (as opposed to its CPU usage). Programs with small working set sizes are awarded frequent and comparatively long time-slices in the processor. Processes with large working set sizes and poor locality are awarded only short, infrequent time-slices. This strategy tends to minimize the time that any large program can clog memory, thereby providing a potentially significant increase in the level of multiprogramming and faster response time for a larger number of processes. Assignment of core resources is the heart of the TSS scheduler.
The table which drives the scheduler can be thought of as being divided into sets of levels grouped primarily according to the core usage characteristics of a process. The interactive sets of table levels are the Starting Set, the Looping Set, the AWAIT Set, the Holding Interlock Set and the Waiting for Interlock Set.

The Starting Set of table levels is used to handle new inputs from the terminals. This set consists of several successive high priority table levels, each with small execution time limits and increasingly larger core space limits. A process remains under control of the Starting Set of table levels and proceeds through its various queues as long as it continues to exceed its space limits only (up to some maximum). When the process exceeds its time limit at a given level, the space limit of that level is used as the estimate of the current working set size of that process, and the future execution of the process is controlled by the Looping Set of table levels.

The Looping Set table levels perform three significant functions. The first deals with the dynamic estimation of the time and space requirements of a process in accordance with the balanced core time principle. This principle states that the length of the time-slice to be awarded to a process is inversely proportional to its working set size in that time interval. The second function of these table levels is to cause the load generated by long running processes to be distributed so as to allow Starting Set entries to be processed quickly. Finally, the Looping Set optimizes CPU utilization and penalizes badly paging processes by causing processes with minimal paging requirements to be selected for running far more frequently than those with large paging requirements.

Of the three remaining sets, only the Holding Interlock Set of table levels deals with processes that are ready to run.
Processes running from this set are currently holding interlocks on some system resource and have a high priority so that the interlocked resource may be quickly freed. The AWAIT Set and the Waiting for Interlock Set administer processes which are in a wait state for some reason.

As described above, processor time-slices are allocated dependent upon a process' recent core usage behavior. The frequency and duration of the time-slices a process is awarded are determined by values in the table levels of the Starting Set, Looping Set and Holding Interlock Set. These values in turn are determined by the working set size and locality characteristics demonstrated in the process' paging demands.

2.2.1.2. TENEX - PDP-10 Time-Sharing System [BOB72]

The TENEX scheduling philosophy takes a middle ground between two conflicting precepts of process behavior in a time-sharing environment. On the one hand, the more time a process has used, the closer it is to completion. On the other hand, the longer a process has run, the less are the chances that it will complete "soon". Ready jobs are therefore distributed in queues for service such that if two processes are widely separated in accumulated run time (are in different queues), the one with the lesser time will be preferred, and if two processes are closely spaced (are in the same queue), the one with the greater time will be preferred. This type of scheduling can be characterized as shortest-processing-time first among queues and longest-processing-time first within queues.

A second aspect of the TENEX scheduling philosophy is concerned with the complex interplay in the allocation of core and CPU resources. Incorrect handling of the information gathering and decision making procedures involved in determining working sets and core utilization in a multi-process paged system can result in poor efficiency and bad service.
Thus, a "balance set control" module directly responsible for these functions is made an integral part of the scheduler. Figure 2.3 depicts the four distinct scheduler modules. The process controller and balance set control modules will be discussed in detail below. The real-time scheduler is concerned only with those processes which are currently making real-time demands on the system. Its scheduler portion is invoked whenever an external signal or clock indicates that rescheduling may be required. If there are no real-time processes requiring service, then the selection of a process to run falls to one of the other modules. The function of the startup and dismiss routines is fairly common and straightforward. Included in this module are routines to save and restore environments as they go out of and into execution. No important scheduling or other decisions are made by this module. The balance set control module of the TENEX scheduler is responsible for efficient use of core. The logical storage organization includes the core, drum and disks and their associated channels, so that the efficient use of core is closely related to making efficient use of the data channels to the drum and disk.

Figure 2.3. The TENEX Scheduler. [Block diagram of the four scheduler modules: balance set control, real-time scheduler, startup and dismiss interfaces, and process controller.]

Because of this logical memory structure, when a process cannot be run because of a page fault, the process is not considered to be in a wait state. The process is, in fact, still demanding CPU services which cannot be given because core, rather than the CPU, is not available. Three basic functions fall under the jurisdiction of the balance set control module.
These include maintaining the list of processes in the balance set such that the working sets of all these processes can co-exist in core, selecting a process in the balance set for running when the running process must be stopped for a page fault, and, on the occurrence of a rescheduling event, removing and/or adding processes to the balance set in cooperation with the process controller. Dynamically determining how many processes can simultaneously reside in core and what the size of these processes should be is the central function of the balance set control. This involves trying to keep a balance set which maximizes the probability that there will always be at least one process to run. That is, whenever one process experiences a page fault, there should be another process ready to utilize the CPU resource. This suggests that the processes must run an average time, T_av, greater than the average interval over which one page transfer will be completed for one of the page-waiting processes, W_av. The balance set control module iteratively estimates T_av and W_av and attempts to maintain an environment in which T_av > W_av. If the balance set control function described above provides more than one process which is an eligible member of the balance set, then an algorithm is required for selecting one among these processes to run when a page fault occurs. This algorithm is also a part of the balance set control module. Finally, several rescheduling events can occur which require the removal or addition of processes to the balance set. These events include processor time quantum overflow, I/O blocks, or I/O unblocks. Handling these process exchanges in and out of the balance set is a balance set control module task. Processor resources in TENEX are allocated to processes chosen from distinct ready queues, where queue position is determined by previously accumulated processor time. Figure 2.4 is a graphic presentation of this scheduling algorithm.
Figure 2.4. BBN-TENEX Scheduling. [Diagram: five priority-level queues with processing quanta of 64 ms, 256 ms, 1024 ms, 4096 ms and 16384 ms; arrivals enter FIFO at the highest priority level and the last queue is served round-robin.]

The scheduler prefers a process in a smaller numbered queue over one in a higher numbered queue. In this respect, it prefers processes with the smallest amount of accumulated time. But further, within a queue, the scheduler chooses for execution the process with the longest accumulated time in the expectation of completing a process which probably requires only a small additional amount of CPU time. These queues are not extended indefinitely, but terminated with N = 5 distinct queues, for two separate reasons. First, a process that had run a very long time would get no further service if another process began a long computer run, until the second process had run nearly as long as the first. (A long running process could also be completely shut out of service by a set of short running processes which used 100 percent of the CPU.) Second, although the frequency of rescheduling goes down as the queue time becomes large, a point is reached at which the rescheduling overhead is an insignificant fraction of the total time and no gain is achieved by reducing it further. For these reasons, then, a "last queue" is defined. Processes in this queue are scheduled using a round-robin discipline, disregarding all former processing history at this point and cyclically giving each process a certain quantum of processing time in turn. Use of this scheduling algorithm requires the assignment of three parameters:

- the factor by which the processing time allotted on each queue is greater than the last
- the amount of processor time allotted on the first queue
- the number of queues.

The basic principle involved in assigning these parameters is that fewer and longer queues result in less system overhead but produce a poorer approximation to ideal scheduling as represented by a large number of queues.
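Using the parameter values BBN assigns (64 msec on the first queue, a factor of four between queues, five queues), the quantum for each queue follows a simple geometric rule; a sketch:

```python
def queue_quantum_ms(queue_index, first_ms=64, factor=4, n_queues=5):
    """Quantum for queue i (1-based): first_ms * factor**(i-1).
    The last (n_queues-th) queue is the round-robin 'last queue'
    with the largest quantum."""
    if not 1 <= queue_index <= n_queues:
        raise ValueError("no such queue")
    return first_ms * factor ** (queue_index - 1)

assert [queue_quantum_ms(i) for i in range(1, 6)] == [64, 256, 1024, 4096, 16384]
```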
Bolt, Beranek and Newman (BBN) have currently assigned values to these three parameters as follows:

- the i-th queue receives four times the processing allotment of the (i-1)st queue
- queue one allots 64 msec for processing
- there are five distinct queues.

Up to this point, the discussion of the scheduler has been limited to handling jobs on the ready queue. The scheduling algorithm also keeps account of processes waiting for some external condition or event, such as an I/O device to complete or a user to type a character. In this case, the scheduler's goal is to insure that these processes too will receive their fair share of processing time, i.e., about 1/M of the CPU, where M is the number of processes in the system. The scheduler achieves this goal by using the following procedure. During the periods in which a process is in the wait state, the process is "credited" for CPU time not used by reducing the accumulated time values at the rate of 1/M. Reducing this quantity tends to move the process to the higher queues so that it will be preferred over other processes which continue to run. This procedure does not include waits occasioned by disk or drum transfers, as explained in the previous section describing the core allocation algorithm.

2.2.1.3. TSO - IBM Time-Sharing Option

The basic scheduling philosophy of the TSO time-sharing system is to award fast response times to processes requiring only a short amount of CPU service. Processes requiring increasingly longer amounts of processing time experience proportionately longer response times. This philosophy is implemented in a series of queues (usually three or four) through which a process descends during its residence in the system. Each queue has a lower dispatching priority than the former one, and each queue typically allots a longer processing time-slice to its members. Processes are served strictly first-come, first-served within queues.
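Returning to the TENEX wait-state credit for a moment: the reduction of accumulated time at a rate of 1/M of real time can be sketched as follows (the function name, millisecond units and floor at zero are illustrative assumptions):

```python
def credit_waiting(accumulated_ms, wait_ms, n_processes):
    """While a process waits, its accumulated-time figure is reduced
    at a rate of 1/M of real time (M = processes in the system),
    crediting it for CPU time it did not use and nudging it toward
    the preferred queues."""
    credited = accumulated_ms - wait_ms / n_processes
    return max(credited, 0.0)

# After waiting 5 seconds in a 10-process system, 500 ms is credited:
assert credit_waiting(2000.0, 5000.0, 10) == 1500.0
```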
The TSO time-sharing system can be run in a real or virtual memory system, for example OS/MVT* or OS/VS2*, respectively. The basic concept behind core assignment is the same in both types of systems, but the implementation of the assignment is, of course, different. A predetermined number of regions, say four, is set up in memory and these regions form separate virtual processing systems which are assigned to users as they log onto TSO. Users are associated with one of these regions exclusively for the duration of their working session. Each of the virtual systems acts independently of the others and each has an independent, optionally identical scheduling algorithm as described below. Within a region (or virtual processing system) no multiprogramming exists. Each process has use of the entire core and CPU resources assigned to its region until it is swapped out in total and put back on one of the dispatching queues. The UCLA Campus Computing Network (CCN) TSO system is an OS/MVT system with one memory region.

*See Appendix A for definitions.

The TSO scheduler chooses processes for running dependent only on the most recent behavior of the process. That is, only the last cause for removal from execution (I/O request, timer run-out, etc.) is used to determine the next queue position for that process. The dispatching algorithm, illustrated in Figure 2.5, typically defines three queues, Q1, Q2 and Q3, to which a ready process may be assigned. The first queue consists of processes which have just passed from a blocked (or wait) state to a ready state. These processes have the highest dispatching priority and are served first-come, first-served within Q1. The second and third queues consist of processes which experienced a timer run-out during their last time-slice in Q1 or Q2, respectively. In general, an extensive set of parameters exists with which to manipulate the function of dispatching processes for CPU service.
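The last-cause-of-removal rule above can be sketched as a small placement function (the assumption that a timer run-out in Q3 leaves a process in Q3 is mine, not stated in the text):

```python
def next_queue(current_queue, removal_cause):
    """TSO-style placement: a process leaving a blocked/wait state
    re-enters Q1; a timer run-out demotes the process one queue,
    with Q3 assumed to be the lowest."""
    if removal_cause == "wait_ended":
        return 1
    if removal_cause == "timer_runout":
        return min(current_queue + 1, 3)
    raise ValueError("unknown removal cause")

assert next_queue(2, "wait_ended") == 1     # back to the top queue
assert next_queue(1, "timer_runout") == 2   # demoted from Q1 to Q2
assert next_queue(3, "timer_runout") == 3   # assumed to stay in Q3
```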
CCN-TSO in effect controls its queues by setting three of these optional parameters to significant values. The "preempt" option is enabled, and parameters called "min-slice" and "occupancy time" are set for each queue. The occupancy time associated with a queue is the maximum time-slice of execution allowable to a process from that queue. These values are presently set at 2.0 seconds for Q1, 4.0 seconds for Q2, and 16.0 seconds for Q3. The min-slice settings work in conjunction with the preempt option and presently are assigned values of 1.6 seconds for Q1, 2.0 seconds for Q2, and 3.0 seconds for Q3. These latter values override the occupancy time settings in the following way. If a process is queued for service at the same or higher priority level than a process presently holding the CPU, then the process holding the CPU is preempted after its respective min-slice, rather than being allowed to utilize its entire occupancy time quantum of service. Preempted processes return to the queue from which they had just come, until they have been allocated processor time equal to the occupancy time for that queue.

Figure 2.5. CCN-TSO Scheduling. [Diagram: three FIFO priority-level queues with time-slices of 2, 4 and 16 seconds between arrival and departure.]

2.2.1.4. MULTICS - MIT Time-Sharing System [ORG72]

The MULTICS time-sharing scheduler design was based on the philosophy that the higher the load a process places on the system when it is allowed to run, the lower its scheduling priority should be. Thus, processes requiring the smallest amount of processor time share the highest priority queue. Principally because of memory limitations, however, not all equal-priority processes can share the processor simultaneously. The basic time-sharing scheduling philosophy, then, is modified by a multiprogramming scheduling function. This multiprogramming function restricts access to the processor to an appropriate subset of equal-priority processes called the "eligibles".
This subset is chosen small enough so that work that is done for each member is not degraded, for instance, by thrashing. An active process in the MULTICS system cycles through five execution states--running, ready, waiting, blocked and stopped. The execution state not only describes a process' processor contention characteristics, but also suggests how that process is competing for memory resources. Only running and waiting processes are considered eligible to directly compete for pages of core memory at any one time. Eligibility refers to the depth or degree of multiprogramming and is first conferred on a ready process when that process attains highest relative priority among noneligible ready processes and when its core requirements, when added to those of the eligible processes, do not exceed the total available core. Eligibility is withdrawn when a process uses up its time-slice allotment, completes an interaction or otherwise enters a dormant (blocked or stopped) state. A running process may attempt to capture as much core as it needs. It will be restricted in its attempts only by the competing demands of processes that are simultaneously executing on the processor. A waiting process (differentiated from a blocked process by the predictably short period of time it has to wait for a system event, for example, the arrival of a page into core) remains eligible to compete for core and retains its favorable queue position. In general, because a waiting process is not actually executing, attrition can occur in its core holdings due to demands made by executing processes. Since wait periods are expected to be relatively short, however, there are only short periods between the wait and running states of a process and, therefore, minimal, if any, attrition of the waiting process' core holdings occurs. The ready, blocked and stopped processes share the same core competition status in that they are all "losers".
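The eligibility-conferral test described above, highest relative priority first and subject to total available core, can be sketched as follows (the page counts and the strict stop-at-the-first-misfit behavior are illustrative assumptions drawn from the "attains highest relative priority" wording):

```python
def confer_eligibility(ready, total_core_pages):
    """Confer eligibility in priority order (lower number = higher
    priority); stop at the first process whose core demand would
    overflow the available core."""
    eligible, used = [], 0
    for prio, core in sorted(ready):
        if used + core > total_core_pages:
            break
        eligible.append((prio, core))
        used += core
    return eligible

# 100-page core: the 60-page top-priority process fits; the next
# (50 pages) does not, so selection stops there.
out = confer_eligibility([(2, 50), (1, 60), (3, 30)], 100)
assert out == [(1, 60)]
```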
Because these processes are not eligible, they cannot acquire core pages. The executing processes fulfill their core requirements at the expense of these noneligible processes, and thus these latter continue to lose what pages they previously had resident in core. The longer a process is not eligible, the fewer pages it can expect to have in core. As stated earlier, a process receives a dispatching or scheduling priority based on the load it will place on the system. Since in general a command's duration is not known in advance, an adaptive technique is used to dynamically estimate the processor requirements of each process. In the MULTICS scheduler, the assumption is made that every process arriving on the ready list for the first time will execute a short command and, therefore, deserves a high priority position on the ready list. Associated with the position is some fixed time allotment t_1. When a process is picked to compete directly for processor and core resources, i.e., is eligible, the command may run to completion. If the allotted time is exhausted, a timer run-out mechanism will halt execution of the process and it will then be assigned to a lower priority position. Each lower priority position awards the process an increased allotment of time, up to some maximum, until it completes execution. The processing time allotment associated with the r-th priority position is approximately t_r = 2^(r-1) * t_1. Figure 2.6 illustrates a convenient way to conceptualize the MULTICS dispatching of processes. Even though, in fact, only one ready list exists in the MULTICS scheduling scheme, this single list effectively functions as a set of n priority queues. The processing time allotment in queue 1 is one second and approximately doubles in each queue up to queue 4. Processes are served FIFO at each priority level.
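A sketch of the doubling allotment rule, using the one-second starting value and the four-level picture of Figure 2.6 (the cap at 8 seconds is read off the figure, not stated as an exact MULTICS parameter):

```python
def allotment_seconds(r, t1=1.0, max_allotment=8.0):
    """Approximate MULTICS allotment: t_r = 2**(r-1) * t1, i.e. the
    allotment doubles with each lower priority position, capped at
    some maximum (8 s matches the four-queue picture in Figure 2.6)."""
    return min(t1 * 2 ** (r - 1), max_allotment)

assert [allotment_seconds(r) for r in (1, 2, 3, 4)] == [1.0, 2.0, 4.0, 8.0]
```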
Exact implementation of this straightforward algorithm becomes fairly complex in the MULTICS system and the reader is referred to other authors [GRE74, ORG72] for a more detailed discussion. In keeping with the policy of giving good response to interactive users that issue commands of short duration, preemption is permissible in the MULTICS system. A higher priority process can preempt a presently eligible process of lower priority. The preempted process is favorably treated, relatively speaking, in that it is placed at the top of its priority queue with a time allotment equal to whatever time is unused from its last scheduling allotment.

Figure 2.6. MIT-MULTICS Scheduling.* [Diagram: four FIFO priority-level queues with time allotments of 1, 2, 4 and 8 seconds; the lowest level is served round-robin.]

*This illustration is an approximate description of the MULTICS scheduling function. In fact, only one dispatching queue is maintained, and the system has two processors.

2.2.1.5. CANDE - University of California at San Diego (UCSD) Time-Sharing System

The CANDE interactive computing system espouses a straightforward approach which operates basically by distinguishing burst-oriented processes from those that are compute bound. Processes which are estimated to require a "small" amount of CPU time, as determined by the fact that they did not exceed their allotted time-slice during their most recent execution state, are served first-come, first-served from a high priority queue. Processes which incurred a timer run-out during their last run period are served first-come, first-served from a low priority queue. CANDE is a virtual memory system which multiprograms processes into "subspaces" of real core. If a process has the highest dispatching priority and there is adequate memory available for a swap-in, then the process receives its required core storage.
Memory assigned to a process is increased up to the fixed size of the subspace whenever the process exceeds its currently allotted space. There are five events which can cause a process to be swapped out. These include an input wait, an output wait, a process suspension, a time-slice allotment expiration and a core demand in excess of the subspace size allocated to the process during its previous swap into core. The primary goal of the subspace option is to allow a large number of burst-oriented processes to run without freezing memory resources during their dormant periods. Memory is freed by immediately swapping the process to disk when it becomes dormant. Because a large number of tasks are bidding for a limited memory resource, tasks which discontinue their burst-orientation (become compute bound) have an artificial burst rate imposed upon them. This artificial burst rate is called the process' time-slice. CANDE has two priority levels (queues) for selecting ready tasks for execution, or swapping into core, as illustrated in Figure 2.7. The lower priority queue contains processes which exceeded their time-slice during their last swap-in. The higher priority queue contains all other ready processes. These high priority processes are those which are new to the system, which have received input for which they were waiting, which have output at least half of the data excess which originally caused them to be swapped out or which have been awakened from swap-out suspension. Within this high priority or "demand status" queue processes are ordered first-in, first-out, as they are within the lower priority queue. Lower priority queue processes, or "time-sliced" processes, are swapped into available memory only if there are no demand status swap requests which can be satisfied.

Figure 2.7.
UCSD-CANDE Scheduling. [Diagram: two FIFO priority-level queues, each with time-slice f(c,n)*, between arrival and departure.]

*n and c are defined below.

Jobs feed into the first priority queue if their last removal from execution was caused by a wait or blocked state, and they feed into the second priority queue if their last cause of removal from execution was a timer run-out. The time-slice allocated to each process when it is swapped into core is computed on an individual basis and does not depend exclusively on priority level. Before allocating a processor to a swappable process, both its allowable processor time-slice and its allowable elapsed time-slice are checked. If either has been exceeded, a new slice is computed as defined by the formulas given below. The formulas for computing a time-slice are:

    Processor Time-Slice: T = (n * k1 + c + p + 8) * k2 + m * 416667

    Elapsed Time-Slice: E = T * r

where

    n is the slice number. When a process is swapped out due to a demand condition, its slice number is set to zero. Each time a process is swapped because of exceeding its (processor or elapsed) time-slice, its slice number is incremented by one. This number is subject to a maximum value of 7.
    c is the core space used by the process in chunks (1 chunk = approximately 990 words).
    m is the minimum time-slice in seconds (m = 1).
    k1 is 4.
    k2 is 5000.
    p is priority (p = 51).
    r is the ratio of elapsed time to processor time.

Time-slice units are 2.4 msec.

2.2.2. Benchmark Jobs

Three benchmark jobs were distributed on each of the computing systems studied, with some exceptions. The first benchmark job was dominated by arithmetic operations, the second consisted of manipulations of bit strings and the third was input/output bound. Listings of these benchmark jobs as they were stored and used on each computing system are presented in Appendix B.* These jobs were chosen for their distinct claims on the system resources of CPU processing, core use and I/O channel utilization. These particular listings were generated at MIT-MULTICS. Job listings from all other installations are essentially identical.

*These benchmark jobs were generated by members of a research group working under the direction of Dr. P. A. Alsberg, Center for Advanced Computation, University of Illinois, Urbana-Champaign. They were used in this research with Dr. Alsberg's permission.

The "number cruncher" or arithmetic benchmark job was written in standard FORTRAN and generates a 100 x 100 correlation matrix for a 100 x 100 input array called DATA. A main program dimensions all arrays and appropriately initializes arrays and variables. This main program then calls on a subroutine to generate the required correlation matrix. This benchmark places demands on the system resources of core (more than 20 kilobytes of core are required just for array storage) and on CPU processing (the innermost loop in the subroutine is executed .5 * 10^6 times).

The bit string manipulating benchmark job was designed to place its main system resource demand on the CPU alone. This standard PL/I program takes a 100 x 100 input matrix called REALITY whose entries are ones that can be traversed from the top row to the bottom row, traveling only vertically and horizontally between adjacent squares. A second matrix (FOUND) of the same dimensions as REALITY is used as an internal work space. Initially all entries in FOUND are zeroes. When a valid path is discovered from the first row of REALITY to an adjacent square, the corresponding neighboring element in FOUND becomes a one. Thus, the elements in FOUND that are ones represent elements in REALITY which can be reached from the first row. At each iteration, an element in FOUND becomes a one if the corresponding element of REALITY is a one (i.e., it is connected to a valid path from the top row). The process terminates either when no new ones appear in FOUND or when an element in the bottom row of FOUND becomes a one.
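The iterative FOUND-matrix computation can be sketched in a few lines (a small 3 x 3 matrix stands in for the benchmark's 100 x 100 REALITY; the PL/I original stores the matrices as bit strings, which this sketch does not attempt):

```python
def reachable_from_top(reality):
    """Iteratively mark squares connected to the top row through
    horizontally/vertically adjacent ones, the way the benchmark
    builds FOUND; stop when no new ones appear or a bottom-row
    square is marked."""
    rows, cols = len(reality), len(reality[0])
    found = [[0] * cols for _ in range(rows)]
    found[0] = list(reality[0])                  # top row seeds the paths
    changed = True
    while changed:
        changed = False
        for i in range(rows):
            for j in range(cols):
                if reality[i][j] and not found[i][j]:
                    nbrs = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
                    if any(0 <= a < rows and 0 <= b < cols and found[a][b]
                           for a, b in nbrs):
                        found[i][j] = 1
                        changed = True
        if any(found[-1]):                       # bottom row reached
            break
    return found

maze = [[1, 0, 1],
        [1, 0, 0],
        [1, 1, 1]]
# The left column connects top to bottom, so a bottom-row one appears:
assert any(reachable_from_top(maze)[-1])
```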
This program is a bit manipulating benchmark since matrices are stored and referenced as bit strings. The third and simplest of the benchmark jobs was also written in standard PL/I. It was designed to make its main resource demand on the I/O mechanism of the computing system. The program opens a file and writes 1,000 250-word records into it. It proceeds to close the file, reopen it, read the same 1,000 records back and finally close the file once again. Table 2.2 indicates exactly which benchmarks were run at each of the computing centers and explains why certain of the benchmarks were omitted.

Table 2.2. Benchmark Jobs Run at Various Computing Centers

    System        Number Crunching   Bit Manipulating   I/O Bound
                  Benchmark          Benchmark          Benchmark
    AMES-TSS      Yes                Yes                Yes
    BBN-TENEX     Yes                No*                No*
    CCN-TSO       Yes                Yes                Yes
    MIT-MULTICS   Yes                Yes                Yes
    UCSD-CANDE    Yes                No*                No*

    *PL/I is not available on this system.

2.2.3. Load Level

Each of the computing systems under study was arbitrarily said to have ten distinct load levels within which it operated. In general, the load levels are uniformly distributed intervals in which the value (e.g., number of users, load average or utilization fraction) of the two end points and the interval width depend on the load measure for a particular system and its observable load range, respectively. The k-th load level for the i-th system, l_{i,k}, is defined by the interval

    l_{i,k} = [((s_i / 10) * (k - 1)) + 1, (s_i / 10) * k]

where s_i is a measure of load in a saturated system i. For example, UCSD measures load in number of users and its highest observable load level was taken to be 30 users. The fifth load level, therefore, would be defined as

    l_{UCSD,5} = [((30/10) * 4) + 1, (30/10) * 5]

or

    l_{UCSD,5} = [13, 15].

Several exceptions to this load level definition arise owing to the individual characteristics of the systems being studied.
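The interval definition above is mechanical enough to sketch directly (plain tuple arithmetic; the UCSD example from the text serves as a check):

```python
def load_level(s_i, k):
    """Interval for the k-th load level of a system whose saturated
    load measure is s_i: ten uniform-width levels, as defined in the
    text. Returns (low, high) end points."""
    width = s_i / 10
    return (width * (k - 1) + 1, width * k)

# UCSD saturates at 30 users, so level 5 covers 13 through 15 users:
assert load_level(30, 5) == (13, 15)
```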
AMES67 measures increasing load in terms of a decreasing function, in direct contrast to all the other systems under consideration. The AMES67 measure is a utilization fraction ranging from 1.0 for no load to 0.0 for extremely heavy loads. In this case, the load interval is defined as follows:

    l_{AMES67,k} = [.1 * (10 - k), (.1 * (11 - k)) - .001].

Other exceptions to the load level definitions occur in the widths of the most heavily loaded levels (that is, load levels 8, 9 and 10). Since response times on some of the systems studied grow very large with increasing loads (response times rise to approximately one hour on the BBN-TENEX system under heavy loads), it becomes difficult to take a response time measurement within a load level that is too narrowly defined. The load varies more during these longer periods than during the lightly loaded, short response time periods. For this reason, the widths of load levels were sometimes broadened at the high end of the load level spectrum (see levels 6-10 of BBN-TENEX in Table 2.3). Still another adjustment was made in the load level definition for the BBN system running TENEX.
The TENEX load measure is one of "load average", defined as the ratio of the number of runnable jobs (jobs not blocked for I/O or otherwise in a wait state) to running jobs (jobs which are loaded in core and immediate potential candidates for CPU time-slices). The rapidly changing nature of this measure, combined with the relatively long response times for the TENEX system, even under moderate loads, necessitated overlapping load level definitions to obtain any valid response time measurements. For example, l_{BBN,8} = [10.0, 14.0] and l_{BBN,9} = [12.0, 16.0], where the end points of the intervals are load averages. Table 2.3 contains a complete listing of the load level definitions for the five systems under study.

Table 2.3. Load Level Definitions

    LOAD    AMES67-TSS     BBN-TENEX  CCN-TSO    MIT-MULTICS  UCSD-CANDE
    LEVEL   (Utilization   (Load      (Number    (Number      (Number
            Fraction)      Average)   of Users)  of Users)    of Users)
    1       (.900, .999)   ( 0,  2)   ( 1,  1)   ( 1,  7)     ( 1,  3)
    2       (.800, .899)   ( 1,  3)   ( 2,  2)   ( 8, 14)     ( 4,  6)
    3       (.700, .799)   ( 2,  4)   ( 3,  3)   (15, 21)     ( 7,  9)
    4       (.600, .699)   ( 3,  5)   ( 4,  4)   (22, 28)     (10, 12)
    5       (.500, .599)   ( 4,  8)   ( 5,  5)   (29, 35)     (13, 15)
    6       (.400, .499)   ( 6, 10)   ( 6,  6)   (36, 42)     (16, 18)
    7       (.300, .399)   ( 8, 12)   ( 7,  7)   (43, 49)     (19, 21)
    8       (.200, .299)   (10, 14)   ( 8,  8)   (50, 56)     (22, 26)
    9       (.100, .199)   (12, 16)   ( 9,  9)   (57, 63)     (26, 30)
    10      (.000, .099)   (14, >14)  (10, >10)  (64, 70)     (31, >31)

The system load was recorded before and after each response time measurement. A measurement was said to be taken at one of the ten possible load points only if both load recordings fell within the interval defined by that respective load level.

2.2.4. Response Time

The main performance measure to the user of an interactive system is response time. Users are happy if the system reacts within a time span they have learned to expect. If the system does not perform as expected, user discontent rises.
Frustration increases rapidly when expectations of immediate response are thwarted. However, frustration increases much more slowly when the expected turnaround time is such that the user turns attention away from the response time to other activities. This latter expected response time may range from approximately ten minutes to several hours. The response time to a "run" command, given that the required CPU time for the program to be run is less than one minute or so, hovers between two response classes. On the one hand, if the system is lightly loaded, program execution may be completed in a few minutes. In this case, the users would probably devote their attention solely to waiting for the system response. On the other hand, if the system is heavily loaded, full program execution may require as much as an hour, or even more, and users would turn their attention to some other activity while they were waiting. In order to measure and compare response times to run commands on heterogeneous computing systems, a definition of response time is required that will be consistent across all systems, exhibit a meaningful association to time-sharing system performance and also correspond to the users' conception of how long they have waited. J. F. Maranzano [MAR73] has proposed such a definition. Maranzano's definition of interactive response time identifies the interval "from the end of user typing of a command (often called the carriage return) to the first character of output on the terminal" as the critical time span. This response time definition meets the criteria described above in that it is measurable on all systems, the distribution of its values under varying circumstances is a description of system performance and users stop their waiting activity at the first physical sign of output on the terminal.
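Maranzano's interval, from carriage return to first output character, can be approximated with a small sketch (the use of a local subprocess in place of a network terminal session is, of course, an illustrative stand-in):

```python
import os
import subprocess
import time

def response_time_seconds(argv):
    """Rough stand-in for the thesis measurement: seconds from
    command submission (the "carriage return") to the first byte
    of output appearing."""
    start = time.monotonic()
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    first_byte = os.read(proc.stdout.fileno(), 1)  # blocks until output
    elapsed = time.monotonic() - start
    proc.stdout.read()                             # drain remaining output
    proc.wait()
    return elapsed, first_byte

elapsed, first = response_time_seconds(["echo", "done"])
assert first == b"d" and elapsed >= 0.0
```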
This definition will be slightly modified in this study to its following form:

DEFINITION: Interactive response time is the number of seconds which elapse from the end of user typing of a command (carriage return) to the first character output on the terminal indicating the completion of execution of the command.

The first output character is required to be that which signals the completion of command execution because some commands print informative messages at the beginning of their execution. Separate classes of commands are defined by Maranzano to insure that uncontrolled variability of times within each class will be minimized. We are concerned here only with the respective "load and run" command associated with each computing system that directs the system to load the (previously compiled) object version of a particular program and to proceed with its execution. Since our response time comparison is limited to this single command, no further command classifications are required. Two of the systems under study (BBN-TENEX and UCSD-CANDE) trace and record the interactive elapsed time automatically and report it to the user upon completion of a command execution. For the other three systems, the response time was measured by utilizing system clocks in various ways. The exact command sequence used in each system measurement is presented in Table 2.4. The average response time for the execution and printout of the TIME command information was calculated in each case and accounted for in the final determination of the "load and run" response time. ARPA network transmission time, which is presently less than .1 second in either direction, was not isolated in the response time determination (was recorded as part of the individual system response time). All response time measurements were made from a terminal, using commands available to all users of the system. No special hardware or software monitors were used.
Table 2.4. Command Sequence for Systems' Measurement

    SYSTEM        COMMANDS*            COMMENTS
    AMES67-TSS    TIME?                The TIME? command returns the wall
                  CALL PROGRAM         clock time.
                  TIME?
    BBN-TENEX     PROGRAM NAME         Response time to this run command is
                                       returned by the system automatically.
    CCN-TSO       TIME                 The TIME command gives the total
                  GOCOMPILER NAME      connect time.
                  TIME
    MIT-MULTICS   TIME                 TIME is a user written subroutine
                  PROGRAM NAME         that calls and displays the system
                  TIME                 clock time.
    UCSD-CANDE    EXECUTE PROG NAME    The EXECUTE command returns the
                                       response time automatically upon
                                       completion.

    *All the run commands load (if it is not already loaded) and execute the object module of the program.

3. MEASURING TIME-SHARING SYSTEMS

Response times were measured and recorded at the various observable load levels, for each of the appropriate benchmark jobs, on each of the five computing systems. The data was subsequently subjected to curve-fitting analysis in order to formulate statistically significant quadratic, cubic or exponential representations of the response time-load level relationships. Linear and nonlinear least squares regressions were performed. The curve fitting was done using a package program authored by J. A. Middleton titled "Least-Squares Estimation of Non-Linear Parameters--NLIN" [MID68]. User subroutines indicating the function to which the data are to be fit are called by the main program, which then iteratively attempts to determine the required variable coefficients (α and β in the log-normal case). The algorithm used selects an optimized correction vector for the coefficients by interpolating between the vector obtained by the gradient method and that obtained by a Taylor's series expansion truncated after the first derivative. Iteration is applied to this vector according to the least squares method of estimating parameters until one of the several stopping criteria is met.
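NLIN's gradient/Taylor-series iteration is beyond a short sketch, but the kind of curve family being fit can be illustrated with its simplest member: fitting y = a * e^(b*x) by ordinary least squares on log y (a closed-form shortcut for noise-free illustration, not the NLIN algorithm itself):

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on log(y).
    (NLIN's iterative scheme handles general nonlinear models; this
    closed form only illustrates the exponential curve family.)"""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mx, ml = sum(xs) / n, sum(logs) / n
    b = (sum((x - mx) * (l - ml) for x, l in zip(xs, logs))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(ml - b * mx)
    return a, b

# Noise-free data generated from y = 2 * exp(0.5 * x) is recovered:
xs = [1, 2, 3, 4, 5]
ys = [2 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
assert abs(a - 2) < 1e-9 and abs(b - 0.5) < 1e-9
```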
The set of criteria used to choose the curve that best fit the data included comparison of the residual mean square of each of the fits (these are presented in Appendix C), consideration of the possible and most probable shape of the curve for the time-sharing system under consideration, and special handling of "outlying" or obviously exceptional data points. A discussion of the results of the analysis for each of the computing systems under study is presented below. The plots of the curve fits presented for each benchmark on each computing system display the best pair of fits in each case and indicate which of the two fits was finally chosen. Also included in the discussion of the individual system results is a determination of whether or not "saturation" occurs within any of the observable load level intervals. Mathematically speaking, a system is said to be saturated when the probability of zero users waiting for service becomes less than some arbitrarily small number. This definition may be related to a quadratic, cubic or exponential response time curve that is relatively flat and then becomes concave upward by determining the point (or load level) at which the slope of the curve becomes greater than some arbitrarily small number. Alternatively, when the curve fit tends to have linear characteristics (a slow, steady rise), or in the interest of relating saturation to the users' experience with the system, saturation may be defined as the point or load level at which the response time exceeds users' expectations of waiting time. For the types of benchmarks and systems involved in this study, except for BBN-TENEX, two minutes was taken as a reasonable time span within which to expect job completion. Not all systems exhibit definite saturation characteristics within the observable range of the data.
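The two operational definitions of saturation just given can be stated algorithmically. The sketch below applies both tests to an assumed fitted exponential curve; the coefficients and the slope threshold are illustrative assumptions, not values taken from any system in the study:

```python
import math

# Assumed fitted curve r(x) = a * exp(b * x); coefficients are illustrative.
a, b = 8.0, 0.3

def response(x):
    return a * math.exp(b * x)

def slope(x):
    # Derivative of the fitted curve: added seconds of delay per load step.
    return a * b * math.exp(b * x)

SLOPE_LIMIT = 20.0    # the "arbitrarily small number" made concrete (assumption)
EXPECTATION = 120.0   # users' expectation of waiting time, per the text

# First load level (1-10) at which each saturation criterion is met.
sat_by_slope = next((x for x in range(1, 11) if slope(x) > SLOPE_LIMIT), None)
sat_by_wait = next((x for x in range(1, 11) if response(x) > EXPECTATION), None)
print(sat_by_slope, sat_by_wait)
```

For this illustrative curve the slope criterion flags an earlier load level than the two-minute criterion, which is the typical relation between the two definitions for a sharply rising fit.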
A summary table of saturation levels in each of the systems is presented in Table 3.1. The information given there is explained more fully in the discussions of individual systems that follow. The processing times required by each benchmark in each system are presented in Table 3.2.

Table 3.1. Systems' Saturation Level

AMES-TSS
  Saturation criteria: Response time rises above 120 seconds.
  Saturation load level: 10th or above.
  Comments: Only a gradual, steady increase in response times occurs.

BBN-TENEX
  Saturation criteria: Sharp rise in response time curve combined with excessively high response times (about 5 minutes).
  Saturation load level: 3rd.
  Comments: System response times tend to be of the magnitude of batch processing times rather than interactive processing times.

CCN-TSO
  Saturation criteria: Fairly sharp rise in response time curve combined with rise of response time above 120 seconds.
  Saturation load level: above 10th.
  Comments: System is generally lightly loaded and saturation did not occur in the observable range of the data.

MIT-MULTICS
  Saturation criteria: Extremely sharp rise in response time curve combined with rise of response time above 120 seconds.
  Saturation load level: 8th.
  Comments: System response times conform most closely to popular response time expectations.

UCSD-CANDE
  Saturation criteria: Response time rises above 120 seconds.
  Saturation load level: 8th or above.
  Comments: Only a gradual, steady increase in response time occurs.

Table 3.2. Average Benchmark Processing Times*

              NUMBER CRUNCHER   BIT MANIPULATOR   FILE FLOGGER
AMES-TSS      21                16                7
BBN-TENEX     63                NA                NA
CCN-TSO       6                 5                 2
MIT-MULTICS   45                2                 3
UCSD-CANDE    57                NA                NA

*All processing times are given in seconds. NA, not applicable, indicates that the benchmark was not run at the computing center in question.

3.1. Analysis of Individual System Data

3.1.1.
AMES-TSS

When a dispatching algorithm for time-sharing systems assigns processes to one of a set of increasingly lower priority queues depending mainly on the processes' former behavior in using their allotted time-slices (as BBN-TENEX does), the response time curve for that system is generally almost constant for a lightly loaded system and begins to rise rapidly when the load increases beyond some critical point. On the other hand, when some other criterion is the main factor in determining queue position, such as the amount a user is willing to pay or the paging behavior of the process, then the response time curve tends to rise slowly as the load increases, in a nearly linear fashion. AMES-TSS is one such system. As described earlier, core usage characteristics of a process are the main factor in determining queue position at AMES. This implies that a process with a small working set size and good locality will stay in the top priority queues, regardless of how much service it is requiring of the CPU. The response time of a process can therefore increase linearly with increased load, and need not exhibit a sharp rise at some critical saturation point. This phenomenon can be observed in the response time curves for all three of the benchmark jobs run at AMES, shown in Figures 3.1(a) through 3.1(c). Although the exponential, quadratic and cubic curves were chosen as best fits for the arithmetic, bit string manipulator and I/O bound benchmarks, respectively, within the observable load span all three curves rise slowly but steadily, in an almost linear fashion. Because of the linear shapes of the curves, saturation in this system

[Figure 3.1(a). Statistical Results - AMES-TSS. (Plot of response time in seconds versus load level.)]

*See Table 2.3.
for the correspondence between Load Levels 1-10 and the system measure of busyness for each of the five systems studied.

[Figure 3.1(b) (continued). Statistical Results - AMES-TSS. (Plot of response time in seconds versus load level.)]

[Figure 3.1(c) (continued). Statistical Results - AMES-TSS. (Plot of response time in seconds versus load level.)]

must be considered as occurring in the load level in which response time rises above 120 seconds. This rise does not occur within the observable range of the data except for the I/O bound benchmark, in which case the curve barely climbs above two minutes between the 9th and 10th load levels. The relatively long response time for this benchmark takes on significance in view of the fact that the I/O bound job required less execution time by a factor of 1:2 as compared with the bit string manipulator and 1:3 as compared with the arithmetic benchmark job. Since "pi", a measure of core contention, was used as the load measure in this system, the question arises as to whether using number of users as the load measure would yield different results. Number of users is an undesirable measure of load in the AMES system because of the tendency for local users to stay logged in for long periods of time, regardless of whether or not they are doing useful work. The response time data were plotted against number of users, however, and linear curves similar to the ones already displayed resulted as best fits. But, as can be observed from Table 3.3, the residual mean squares (RMS) were larger in every case for these plots as compared to the response time versus pi plots.
The data collected on the AMES-TSS system are complete, even though few valid observations were recorded at the 9th and 10th load levels, in the sense that AMES has adjusted its overall scheduling scheme such that the value for pi very seldom goes below 0.2. A new "Resource Allocation Scheme" attempts to guarantee some level of service to authorized priority users at various times of the day, e.g., group 1 receives top priority between 8 a.m. and 10 a.m., group 2 from 10 a.m. to 12 noon, and so on. The data, therefore, represent observations over all the load levels that AMES-TSS will assume in its present configuration.

Table 3.3. Residual Mean Squares for AMES-TSS Curve Fits

                   Residual Mean Square (RMS)
Benchmark          Using "no. of users"         Using "pi"
                   -vs- response time           -vs- response time
Number Cruncher    1.09 (10^3)                  2.76 (10^2)
Bit Manipulator    2.25 (10^2)                  1.45 (10^2)
File Flogger       1.05 (10^3)                  1.03 (10^3)

3.1.2. BBN-TENEX

Even the novice user of the TENEX system at BBN quickly forms the impression that for a fairly good sized job, even under light loads, response times are slow and tend to increase very rapidly. The exponential curve shown in Figure 3.2, chosen as the best fit to the BBN number crunching benchmark data, readily verifies this impression (as explained in Section 2.2.2, the bit manipulating and the file flogging benchmarks were not run on this system). The data range is the largest of all the systems studied, rising to a measured turnaround time of more than one hour at the tenth load level. The slope of the curve rises relatively rapidly, making a saturation point difficult to define. Only measurements at the lowest load level were consistently under 120 seconds. The BBN system response actually hovers between time-sharing and batch expectations.
The exponential curve fit reflects the success with which the philosophy of the TENEX dispatching algorithm (which predicts approximately exponential response times) is implemented in the total BBN-TENEX system. Of special interest in this system is the fact that the load measure is not the number of users, as it was in the majority of systems, but the quantity defined as "load average" in the earlier description of the TENEX system. With this quantity as the independent variable, the BBN data yield the best regression fit of any set of data. The ratio of regression sum of squares to total sum of squares is a satisfyingly high 0.872 (see Appendix C).

[Figure 3.2. Statistical Results - BBN-TENEX. (Plot of response time in seconds versus load level.)]

The BBN data can be accepted as complete in the sense that the observable range of load levels shows the interactive response time for the arithmetic benchmark rising above an intolerably high one hour. A user searching for a time-sharing system on which to run a job would surely reject the BBN-TENEX option (except under some extenuating circumstances such as free computing) when the load average rose above about 14.0, as it does in the 3rd load interval.

3.1.3. CCN-TSO

The CCN-TSO system is not often heavily loaded, with thirteen users being the maximum load observed during this study. Moreover, the processor is a powerful one, and in the context of TSO's particular dispatching algorithm the CCN system required only 6 seconds of execution time to execute the arithmetic benchmark. This was a performance improvement of more than 3:1 over the next fastest system (AMES-TSS) and of more than 10:1 over the slowest system (BBN-TENEX).
Further, since within the entire CCN computing system the TSO system is guaranteed a portion of CPU service, but not a portion of I/O service, the I/O interactions of a process become the dominating factor in determining response time. This becomes evident upon examination of Figures 3.3(a) through 3.3(c), in which the response time for the I/O benchmark is almost double that for either the arithmetic or bit manipulating benchmark, even though the I/O benchmark requires less than half the processing time of either of the latter two. Both the arithmetic and bit manipulating benchmark sets of data suggest that the CCN-TSO system has not reached saturation within the observable range. Both curves are very slowly rising and stay below 120 seconds even in the 10th load interval.

[Figure 3.3(a). Statistical Results - CCN-TSO. (Plot of response time in seconds versus load level.)]

[Figure 3.3(b) (continued). Statistical Results - CCN-TSO.]

[Figure 3.3(c) (continued). Statistical Results - CCN-TSO.]

CCN personnel estimate that their system will saturate with about twenty users, and the data suggest that this intuition may be valid. These two benchmarks require approximately the same amount of processing time (about five seconds) and their response time curves are similar. The I/O benchmark response time curve is effectively linear, rising steadily as the load increases. An exponential-like quick rise is not observed in this case because it is the I/O service, and not the processor service, that is causing the increased waiting time. This benchmark required only two seconds of processing, so it did not descend through the priority dispatching queues.
Rather, it spent time waiting as a result of increased competition with all other TSO and total CCN system jobs for limited I/O resources. This wait time grows linearly as the load increases, and it has a high degree of variability, as is seen by observing the actual data point values in the 7th through 10th load intervals.

3.1.4. MIT-MULTICS

The MIT-MULTICS data shown in Figures 3.4(a) through 3.4(c) conform most closely to the popular conception of expected response time from a time-sharing system. Considering the arithmetic benchmark plot, the exponential curve chosen as the best fit is almost constant (and below 120 seconds) until approximately the 8th load level. Between the 8th and 9th load levels, the curve shoots up extremely sharply, clearly indicating a saturated system.

[Figure 3.4(a). Statistical Results - MIT-MULTICS. (Plot of response time in seconds versus load level.)]

[Figure 3.4(b) (continued). Statistical Results - MIT-MULTICS.]

[Figure 3.4(c) (continued). Statistical Results - MIT-MULTICS.]

The combination of a fairly fast processor and a scheduling algorithm that relies very heavily on previous time-slice usage and a series of priority dispatching queues work together to achieve this expected behavior. The approximately 45 seconds of required execution time allow the benchmark job to remain in the system long enough to keep using up its formerly allotted time-slice and descend through the priority queues. Position on a low priority queue is of no significance until the probability that there are processes waiting for service becomes greater than some arbitrarily small number. This happens at the 8th load level. The other two benchmarks run at MIT required only about 2 seconds of processing each and so were not caught up in the descending queue phenomenon.
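The descending-queue phenomenon can be made concrete with a small sketch. The doubling quantum schedule below is an assumption for illustration only, not the actual MULTICS (or TENEX) schedule; the point is simply that a job of roughly 45 seconds exhausts slice after slice and sinks to the bottom queue, while a 2-second job does not:

```python
def final_queue_level(cpu_seconds, quanta):
    """Queue level (0 = highest priority) at which a job completes,
    given the time-slice allotted at each level. A job that uses its
    full slice is demoted one level; it remains at the lowest level
    once it reaches it."""
    level = 0
    remaining = cpu_seconds
    while level < len(quanta) - 1 and remaining > quanta[level]:
        remaining -= quanta[level]   # exhausted the slice: demoted
        level += 1
    return level

quanta = [1, 2, 4, 8, 16, 32]        # hypothetical doubling time-slices

print(final_queue_level(45, quanta))  # the ~45-second benchmark sinks deep
print(final_queue_level(2, quanta))   # the ~2-second benchmarks stay high
```

Under this assumed schedule the long benchmark reaches the lowest queue, where its waiting time depends on the backlog of all competing work, while the short benchmarks never leave the top of the priority structure.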
They received excellent response times regardless of the load level.

3.1.5. UCSD-CANDE

The response time-load level curve (Figure 3.5) is linear for UCSD-CANDE as it was for AMES-TSS, but for different reasons. (Recall that only one benchmark was run at UCSD.) UCSD has only two priority queues for its interactive programs: a lower priority queue for processes which exceed their previous time-slice, and a higher priority queue for all other ready jobs. All processes are served FIFO from both queues, so that except for the possible interruption by high priority processes, even jobs requiring long processor service times are served approximately round robin (RR) until completion. Response time therefore grows linearly with load, rather than exponentially. For the arithmetic benchmark, which required 57 seconds of processing time on the average, the response time rises to less than three times the execution time within the observable range of the data.

[Figure 3.5. Statistical Results - UCSD-CANDE. (Plot of response time in seconds versus load level.)]

The curve rises above 120 seconds at approximately the 8th load level, but given the average performance ratio of better than 3:1 of total response time to required execution time, a more heavily loaded system needs to be observed in order to more accurately pinpoint a saturation level, if one exists.

3.2. Comparison of Computing Systems

One of the major goals of this study of the response times of various time-sharing systems on the ARPA network was the comparison of system performance. Each of three benchmarks was run on from three to five different systems, with response time measurements being made at varying load levels. The arithmetic benchmark job was run on all five of the systems under study. The bit string manipulating and I/O bound benchmark jobs were run at AMES, CCN and MIT only.
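The linear growth of response time with load noted above for the approximately round-robin CANDE discipline can be illustrated with a minimal sketch; the quantum and service times are hypothetical, not CANDE parameters:

```python
# n identical jobs sharing one processor, served a fixed quantum at a
# time in round-robin order. The completion time of a tagged job grows
# roughly linearly in n, matching the linear CANDE response curves.
def rr_completion_time(n_jobs, service, quantum):
    remaining = [service] * n_jobs
    clock = 0.0
    while remaining[0] > 0:          # track the tagged job (index 0)
        for i in range(n_jobs):
            if remaining[i] > 0:
                slice_ = min(quantum, remaining[i])
                clock += slice_
                remaining[i] -= slice_
                if i == 0 and remaining[0] <= 0:
                    return clock
    return clock

times = [rr_completion_time(n, service=57.0, quantum=1.0) for n in (1, 2, 4, 8)]
print(times)
```

Doubling the number of competitors roughly doubles the tagged job's completion time, which is the linear behavior the CANDE data exhibit.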
The load levels are equivalent (and hence comparable) in the sense that each ith load level represents the ith (approximately) uniformly distributed load interval over the range of the observable data for a particular system. Reference should be made to Section 2.2.3 for the precise load level definitions on each system.

3.2.1. Arithmetic Benchmark

Comparison plots for the arithmetic benchmark job are presented in Figures 3.6(a) through 3.6(e). Curves shown in these figures are those determined to be the best fit in the individual system analyses. The BBN-TENEX response time curve dwarfs all other systems in comparison, as Figure 3.6(a) illustrates. Reducing the dependent variable scale by a factor of more than five, as was done in Figure 3.6(b), brings the comparison of the other systems into better perspective. Figures 3.6(c) through 3.6(e) show the 95 percent nonlinear confidence intervals for the four curves presented in Figure 3.6(b).

[Figure 3.6(a). Arithmetic Benchmark Comparisons. *See Table 2.3. for the correspondence between Load Levels 1-10 and the system measure of busyness for each of the five systems studied.]

[Figure 3.6(b) (continued). Arithmetic Benchmark Comparisons (Without BBN-TENEX).]

[Figure 3.6(c) (continued). Arithmetic Benchmark Comparisons (With 95% Confidence Intervals).]

[Figure 3.6(d) (continued). Arithmetic Benchmark Comparisons (With 95% Confidence Intervals).]

[Figure 3.6(e) (continued). Arithmetic Benchmark Comparisons (With 95% Confidence Intervals).]

The CCN-TSO, AMES-TSS and MIT-MULTICS systems give very nearly equivalent response times in the 1st through 7th load intervals.
In the 8th interval, MIT becomes saturated and response time in that system rises sharply, while CCN and AMES continue to give comparably good response times throughout the entire observable range. An important consideration in these observations is that AMES and MIT are producing favorable response time data over the entire range of usage in those systems, while the CCN data, though favorable, were collected on only a lightly loaded system. The UCSD-CANDE system, while giving quite acceptable response times, is generally outperformed by all systems except BBN-TENEX. The UCSD system reacts to saturation less radically than does the MIT system, however, and performance is better at UCSD than at MIT in the 9th and 10th load intervals. If a strict ranking were required, from fastest to slowest systems in terms of response time curves for the type of processing inherent in the arithmetic benchmark job, it would be given as CCN-TSO, AMES-TSS, MIT-MULTICS, UCSD-CANDE and BBN-TENEX. Such a ranking, though, must be considered in the context of how significant the difference between any two particular systems really is.

3.2.2. Bit Manipulating Benchmark

The bit string manipulating benchmark was run on the three systems that supported the PL/I programming language: AMES, CCN and MIT. Figures 3.7(a) and 3.7(b) present the comparative response time results for this highly CPU bound benchmark. The MIT-MULTICS system required only two seconds of execution time on the average to complete the task and clearly outperforms the AMES and CCN systems in terms of response times. Even the 95 percent confidence interval is very tight and evidences the MULTICS superiority. As Figure 3.7 indicates, the AMES and CCN curves intersect in the 4th load interval, at which point the advantage switches from CCN to AMES. The AMES 95 percent confidence interval is smaller than that of CCN, however, and indicates that of AMES and CCN, AMES generally gives the faster response time.
This is true in spite of the fact that in the AMES system the benchmark requires more than three times (16 seconds) the execution time of the CCN system (5 seconds). For a completely CPU bound job of only moderate length, requiring no significant amount of core and doing no significant amount of I/O, the ranking of systems from fastest to slowest is MIT-MULTICS, AMES-TSS and CCN-TSO.

3.2.3. I/O Bound Benchmark

The file flogging benchmark was run on the same systems as the bit string manipulating benchmark. Figures 3.8(a) and 3.8(b) demonstrate that MIT-MULTICS again gives the best response time performance, with AMES-TSS clearly second and CCN-TSO third.

[Figure 3.7(a). Bit String Benchmark Comparisons.]

[Figure 3.7(b) (continued). Bit String Benchmark Comparisons (With 95% Confidence Intervals).]

[Figure 3.8(a). I/O Bound Benchmark Comparisons.]

[Figure 3.8(b) (continued). I/O Bound Benchmark Comparisons (With 95% Confidence Intervals).]

The CCN system acknowledges that its I/O resources are those most likely to become bottlenecked. The wide variability in the CCN 95 percent confidence interval is evidence of the processes outside of TSO control that also compete for the I/O resources.

4. MODELING TIME-SHARING SYSTEMS

The system comparison data presented thus far are useful in evaluating the performance of various time-sharing systems in reference to a given set of computing applications (benchmark jobs) which require a given amount of actual processing time. In order to compare and predict turnaround time for a wider class of jobs, however, a system model is desired which accepts the processing time of a job as an independent variable rather than as an implied constant.
The approach used in this investigation is to develop an analytical and/or simulation model to describe the behavior of the various time-sharing systems under study as they process the number crunching benchmark job. These more general models are tuned to approximate as closely as possible the behavior of the already developed statistical models describing the respective systems. The tuned models, depending on the success with which they are able to describe system behavior, may then be used in place of the statistical models to predict job response time for similar job applications but for jobs requiring any amount of processing time. In addition, the effect of network delays (which was not a factor in the statistical models) is introduced into the analytical and/or simulation models to more completely predict job response time.

4.1. An Analytical Model for Time-Sharing Systems

During the late 1960's, analytical modeling of time-sharing systems with various scheduling disciplines resulted in a wide range of useful system models. A thorough survey of such models is presented by L. Kleinrock [KLE72]. Basically, the systems are studied by considering priority disciplines operating in a stochastic queueing environment. The essential elements of such systems include the source from which jobs emanate for service, the input process, the service process, the number of servers and the service discipline. Many variations exist within and among these elements, providing a wide choice of model designs. Some design parameters are strongly recommended for ease of model analysis, such as Markov assumptions for the arrival and service processes. Other design options, such as a particular queue discipline, can be more closely matched with the actual system that is being modeled. Below is a list of the set of design options that completely define the analytical model used to represent the time-sharing systems under study.
Except for the AMES-TSS system, all the systems dispatch processes through a set of priority queues, each of which has its own associated time-slice.

Source: The source was assumed to be an infinite one. The load of a system is equated with the number of job arrivals emanating from this source. This assumption is not a completely accurate one, since the load on a time-sharing system is often limited by the number of terminals with system access capability. Scherr [SCH67] has developed a model based on a finite source.

Input Process: The input process is assumed to be the Poisson process and is described by an interarrival time distribution denoted by A(t). A(t) is defined by the exponential distribution

    A(t) = 1 - e^{-λt},   t ≥ 0, λ > 0
    A(t) = 0,             t < 0.

The mean interarrival time is then 1/λ seconds. The interarrival times form a sequence of independent and identically distributed random variables.

Service Process: The service process is also assumed to be exponential and is defined by

    B(t) = 1 - e^{-μt},   t ≥ 0, μ > 0
    B(t) = 0,             t < 0.

The mean service time is 1/μ seconds. The service times are also independent and identically distributed. In a measurement study by Fuchs and Jackson [FUC70], a significant result showed that for all continuous random variables studied, the gamma distribution was an excellent fit. Because of the close relationship of the gamma and exponential distributions, analytical models studied under the assumption of exponential distributions may not be far from the truth.

Number of Servers: The number of servers is 1. The standard notation used to describe the model thus far is M/M/1, where the first and second parameters indicate the exponential distribution for the input and service process, respectively, the third indicates one server, and the lack of a fourth indicates an infinite source.
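The exponential interarrival assumption can be exercised directly by inverse-transform sampling of A(t): if U is uniform on (0,1), then t = -ln(1-U)/λ is distributed according to A(t). The sketch below (with an arbitrary illustrative rate) checks the sample mean, and for reference evaluates the classical FCFS M/M/1 mean time in system, 1/(μ - λ); the quantum-controlled discipline analyzed in this chapter yields different waiting times:

```python
import math
import random

random.seed(7)
lam = 0.5   # arrival rate, jobs per second (illustrative assumption)

# Inverse-transform sampling of A(t) = 1 - e^(-lam * t).
samples = [-math.log(1.0 - random.random()) / lam for _ in range(100_000)]
mean_interarrival = sum(samples) / len(samples)
print(round(mean_interarrival, 2))   # close to 1/lam = 2.0 seconds

# Classical FCFS M/M/1 mean time in system, shown only for reference.
mu = 1.0
print(1.0 / (mu - lam))
```

The sample mean converges to 1/λ, confirming that the sampled interarrival times carry the intended distribution.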
Service Discipline: The service discipline is quantum controlled with a variable quantum size, FB_N, FIFO, preemptive resume in the Nth queue, and zero swap time. Each of these options is discussed separately.

Quantum Controlled: Each process receives a maximum service time from the service facility equal to the quantum q_i associated with its particular queue. Different quantum sizes may be associated with different queues, but the variability is limited to a linear function of some constant quantum.

FB_N: If a job has not completed processing during its quantum, it returns to the system at the end of the next lower priority queue. There are N such queues. Units at the Nth level are served a quantum q_N at a time, in turn, until completion. That is, an Nth level process will be preempted by a higher level process if one exists, or by another Nth level process if one exists, after it has completed the quantum of service in progress.

FIFO: The service is first-in, first-out within queues.

Swap Time: The time required to swap a process in and out of memory is assumed to be absorbed in the process' required service time. The swap time is thus considered to be zero.

Of the many time-shared models presented in the literature, two meet almost all of the specifications listed above. Wolff [WOL68] analyzes a model identical to the one described except that it is FB_∞ rather than FB_N: jobs are permitted to descend through an infinite number of priority queues before completing processing. Coffman and Kleinrock [COF68] present a model identical to the one described except that it does not provide for variable quantum sizes. A modification of the Coffman-Kleinrock model extends its application to include a limited use of variable quantum sizes. In addition to the arrival and service time definitions already given, the following notation will be used:

    δ = a constant fractional amount of time allocated to a job on each pass through the system
    q_i
= the amount of time allocated to a job on its ith pass through the system, i = 1, 2, .... We require that δ ≤ q_i and q_i = m_i δ, i = 1, 2, ..., with m_i an integer.

    Q_j = the total time allocated to a job on its first j passes: Q_j = Σ_{i=1}^{j} q_i.

Since the derivation depends on the integral property of k, and since the q_i were defined to map into an integral multiple of δ, the model can be adjusted to accommodate variable q_i. For a job requiring t seconds of service in the FB_N system with fixed quanta of length δ, the expected waiting time in the system as derived by Coffman and Kleinrock is

    W_k(t) = (λ/2)[E_k(t²) + γ_k E_{k-1}(t²)] / ([1 - ρ(1 - e^{-μkδ})][1 - ρ(1 - e^{-μ(k-1)δ})])
             + [ρ(1 - e^{-μ(k-1)δ}) / (1 - ρ(1 - e^{-μ(k-1)δ}))](k-1)δ + t,    1 ≤ k ≤ N-1    (1)

    W_k(t) = ρ(1/μ) / ((1 - ρ)[1 - ρ(1 - e^{-μ(N-1)δ})])
             + [ρ(1 - e^{-μ(N-1)δ}) / (1 - ρ(1 - e^{-μ(N-1)δ}))](k-1)δ + t,    k ≥ N    (2)

where k is the smallest integer such that kδ > t, and where we define E_k(t²) as the second moment of the distribution

    F_k(τ) = 0,            τ < 0
    F_k(τ) = 1 - e^{-μτ},  0 ≤ τ < kδ
    F_k(τ) = 1,            τ ≥ kδ

with

    E_k(τ) = (1/μ)[1 - e^{-μkδ}],
    E_k(τ²) = 2/μ² - (e^{-μkδ}/μ²)[(μkδ)² + 2μkδ + 2],

and where

    γ_k = e^{-μkδ} / (1 - e^{-μkδ})  and  ρ = λ/μ,

where ρ is a measure of system utilization. Now, since q_i ≥ δ and q_i = m_i δ for all i, with m_i an integer, the number of δs required to service a job can be partitioned into A_i subsets in such a way that there exists a unique mapping between the A_i's and the q_i's. Let the partitions of the δs be defined by sets A_i = k_i δ, with k_i an integer. If a process requires t seconds of service time, with k the smallest integer such that kδ > t and m the smallest integer such that Q_m > t, then partition the k δs into m subsets, each representing a sum of δs, such that

    k_i δ = A_i = q_i,    1 ≤ i ≤ m-1    (3)
Define N' N= E q,/5- i=l 1 Returning to the Coffman-Kleinrock model, let the values that k assumes in equations (l) and (2), instead of being any integer, be only those integers . for which J P . = I k. for 1 < j < m . (5) J i=l X " " Then can be substituted for k in those equations since I is the m m smallest of the I. integers such that H. s > t. J J As an example of this type of mapping, consider t = .93, 5 = .1, q. - 2 5 and N' = k. Clearly, q. < F> and q. = m. 5 for all i and m. an integer. Further, for k = 10, k is the smallest integer such that k ft = 10* .1 > t = .93 and for m = k, m is the smallest integer such that _ = q + q^ + q + q } = .1 + .2 + .k + .8 > t = .93- 86 Now ^ = l*.l = ■1= 4l J k l = 1, Ag = 2*.l = ,2--qg J k 2 = 2, A3 - 1...1 = •u=, 3 J k 3 = >*, A U = 3*.l = •3 < 1 u > k U = 3 and « 1= 1, i 2 =3, jg =T, 4^=10. This mapping changes the way the system is conceptualized in a greater degree than it changes the way the system actually works. Figure 4.1 illustrates this change for k.=4-. Assuming the job requires at least k service quantums before completion when it arrives, a job passing through the Coffman-Kleinrock system receives k short bursts of service, each time taking its place on the next lower priority queue and waiting for jobs of higher priority to be processed first. A job passes through the modified system in one service burst, after having waited for all jobs queued at that priority level to use their required service quantum of up to k. The restriction on the choice of k divides the first type (Coffman-Kleinrock model) of system into several 'black boxes" each of which represents an equivalent service quantum available in the second system. 87 Figure k.l. 
Comparison of Two Models (Coffman-Kleinrock model: jth through (j+4)th priority levels; modified model: ith priority level)

In order to see how the restriction on the choice of k affects the expected waiting time results, we consider a tagged job arriving at the FB_N system in equilibrium, assuming that its service requirement is t seconds, that k is the smallest integer such that kδ > t, and that m is the smallest integer such that Q_m > t. The system must be divided into two disjoint subsystems to derive the modified system equations. We will first examine the progress of the tagged job for its first ℓ_{m-1} passes through the system, and then consider the tagged job's ℓ_m-th pass through the system separately. We have defined A_i subsets, 1 ≤ i ≤ m, to partition the δ-quanta required to service a job. We now consider the waiting time in queue of the tagged job as it passes through an A_i subset of quanta for any i < m. We will define this waiting time as W_i, where

W_i = W_{ℓ_i}(t) - W_{ℓ_{i-1}}(t),   i < m.   (6)

Assuming that the units in all queues of priority higher than i have been processed, in the modified system the waiting time of the tagged job is affected only by those jobs which are ahead of it in the i-th queue. These jobs will receive their q_i quantum of service under a strictly FIFO discipline, and then the tagged job will receive its q_i quantum of service, completely independent of jobs which have arrived during the waiting interval of the tagged job on queue i. This is not the case in the Coffman-Kleinrock system. Still working under the assumption that the units in all higher priority queues have been processed, and also that k_i > 1, in the Coffman-Kleinrock system a tagged job's total waiting time in the j-th δ-quantum queues, ℓ_{i-1}+1 ≤ j ≤ ℓ_i, is dependent upon new arrivals that occur during the tagged unit's waiting time.
This is so because these new arrivals will start to receive δ quanta of processing time before the tagged job has received its total j δ-quantum service slices. If no new arrivals occurred during the tagged job's waiting time, the tagged job would experience identical waiting times in both systems. The waiting time in the j-th δ-quantum queues in the Coffman-Kleinrock system is greater than that in the i-th queue in the modified system by a factor that depends on the average number of new arrivals to that set of queues. We define E(T_i) to be this extra expected waiting time. The average number of arrivals must be based on W_i + (k_i - 1)δ, since new arrivals can seize δ-quanta of service until the tagged job begins its last δ of processing.

The average arrival rate to the i-th queue, λ_i, is determined by the following consideration. A job arrives for service at the i-th queue only if it requires more than ℓ_{i-1}δ seconds of processing. We recall that B(t) = 1 - e^{-μt} is the service time distribution, where B(t) represents the probability that the service time is less than or equal to some number t. The inverse is formed by solving for t: t = -(1/μ) ln[1 - B(t)]. The inverse form can be used to calculate the probability that t is greater than some particular ℓ_{i-1}δ. If we call this probability p_{ℓ_{i-1}}, then λ_i = p_{ℓ_{i-1}} λ. As an example of this process, we consider ℓ_{i-1}δ = .7 and seek the probability that t > .7. For t ≤ .7 and μ = 2/3, B(t) = .37, so that p_{ℓ_{i-1}} = .63 and the arrival rate to queue i is given by λ_i = .63λ. Having determined the arrival rate, λ_i, and the interval in which these arrivals take place, W_i + (k_i - 1)δ, the average number of arrivals is calculated as the product of these two quantities. The time by which the tagged job will be
delayed is the product of this average number of arrivals and their average service time requirement, Ē_{k_i-1}(t). Only service times strictly less than or exactly equal to (k_i - 1)δ are significant here, since (k_i - 1)δ is the maximum service time a job arriving to this queue will receive before the tagged job completes its service requirements. The expression for the average service time, therefore, is given by Ē_{k_i-1}(t). E(T_i) must be subtracted from the Coffman-Kleinrock response time equations; that is, the term -Σ_{i=1}^{m-1} E(T_i) must be added. These terms may be considered independently for each A_i subset because, even though a job may wait longer to complete service in the j-th δ-quantum queues of the Coffman-Kleinrock system than in the corresponding i-th queue in the modified system, the relative ordering of the jobs does not change from one system to the other. That is, when the job arrives at either the (ℓ_i + 1)st queue or the (i+1)st queue, it sees the same queue configuration in either system.

We now consider the ℓ_m-th, or m-th, pass, where the waiting time is not the same for the two models if q_m > δ. In the fixed quantum system, a job continuously receives small bursts of service up to and including its k-th burst, waiting only for the other jobs in the system to receive their corresponding bursts. But in the variable sized quantum system, a job that is queued for service at the m-th priority level must wait until the jobs ahead of it receive their total quantum of service, up to the maximum allotted at that level. Since an arrival requires service at the ℓ_m-th priority queue (which consists of the ℓ_m - ℓ_{m-1}
δ-service queues) or the m-th priority queue only if it requires in excess of ℓ_{m-1}δ seconds of service, the average arrival rate to the ℓ_m-th queue, λ_{ℓ_m}, is given by λ_{ℓ_m} = p_{ℓ_{m-1}} λ. If W_{ℓ_m} is the time our tagged job must wait for service, then the expected average number of arrivals to the ℓ_m-th queue must be based on W_{ℓ_m} + ℓ_{m-1}δ, since the tagged job receives ℓ_{m-1}δ seconds of service before reaching the ℓ_m-th queue. Therefore, the expected average number of arrivals to the ℓ_m-th queue prior to the tagged job would be

λ_{ℓ_m}[W_{ℓ_m} + ℓ_{m-1}δ].

The average service time distribution for the queue arrivals differs depending on whether the job is serviced in system one, the Coffman-Kleinrock system, or in system two, the modified Coffman-Kleinrock system. In system one, ℓ_m - ℓ_{m-1} queues remain through which the tagged job must pass before completion. Each of the arrivals to the (ℓ_{m-1} + 1)st queue must have remaining quanta of service of which Ē_{ℓ_{m-1}}(t) is the average amount. The expected time to process all jobs before the tagged job in the ℓ_m - ℓ_{m-1} interval is therefore

λ_{ℓ_m}[W_{ℓ_m} + ℓ_{m-1}δ] Ē_{ℓ_{m-1}}(t).

This waiting time is already included in the Coffman-Kleinrock equations. In system two, each arrival to the m-th queue will have remaining a quantum of service of which Ē_{ℓ_m}(t) is the average amount. The expected time to process all jobs before the tagged job is therefore

λ_{ℓ_m}[W_{ℓ_m} + ℓ_{m-1}δ] Ē_{ℓ_m}(t).

The term that must be added to the Coffman-Kleinrock equations (1) and (2), therefore, to make the results valid for the variable quantum size model, is

λ_{ℓ_m}[W_{ℓ_m} + ℓ_{m-1}δ][Ē_{ℓ_m}(t) - Ē_{ℓ_{m-1}}(t)].

If q_m = δ, then the term is zero.

Thus, with the two modifications detailed above, the modified Coffman-Kleinrock model becomes directly applicable to time-sharing systems of the type represented by the general time-sharing model of Figure 2.2.

4.2.
A Simulation Model for Time-Sharing Systems

A GPSS simulation model of a time-sharing system with a scheduling discipline identical to that specified for the analytical model was also developed. A flowchart of this model as it simulates the MIT-MULTICS time-sharing system is presented in Figures 4.2(a)

Figure 4.2(a). Simulation of MIT-MULTICS Time-Sharing Scheduler (Generation of Tagged Jobs): generate job; assign required service time; assign service time remaining; assign first service slice; assign scheduling priority; tag this job; transfer to scheduler.

Figure 4.2(b) (continued). Simulation of MIT-MULTICS Time-Sharing Scheduler (Generation of Jobstream): generate job; assign required service time; if service time equals zero, assign one service unit (SCH7); assign service time remaining; assign first service slice; assign scheduling priority.

Figure 4.2(c) (continued). Simulation of MIT-MULTICS Time-Sharing Scheduler (Scheduling Discipline): rescan the current events chain (SCH3); queue the job for processing; seize the processor and reserve it; collect relevant queue statistics; test whether this is the last service slice (SCH1); give the job its required service slice (SCH2); release the processor for the next job.

Figure 4.2(d) (continued). Simulation of MIT-MULTICS Time-Sharing Scheduler (Job Parameter Updating): test whether the terminate flag is set; assign service time remaining; if this is not the largest allowable allotment, assign an increased time allotment; assign reduced priority and send to scheduler (SCH5, SCH3); otherwise assign the last service slice, set the terminate flag, and send to processor (SCH1, SCH4); terminate job.

Figure 4.2(e) (continued).
Simulation of MIT-MULTICS Time-Sharing Scheduler (Run Time Control): generate timer; stop run.

through 4.2(e), with the chart symbols identical to those of Schriber's in his General Purpose Simulation System/360: Introductory Concepts and Case Studies [SCH71]. Figures 4.2(b) through 4.2(d) illustrate the heart of the simulator as it generates jobs with exponentially distributed interarrival times, assigns service times exponentially distributed about some mean, and services the jobs according to the scheduling discipline described for MIT-MULTICS in section 2.2.1.4. The simulator generates tagged jobs for data collection purposes, and this process is diagrammed in Figure 4.2(a). Figure 4.2(e) shows the control module for the desired running time of the simulator.

4.3. Analysis of Model Predictions

The analytic and simulation models were developed to generalize the predictive capability of the statistical response time models. The conceptualization and definition of the analytical and simulation models were derived from the generalized time-sharing scheduling diagram shown in an earlier chapter in Figure 2.2. As a result of the generalized conceptualization of the models, they can be expected to most closely describe those time-sharing systems which are most similar to the generalization. Since the AMES-TSS system scheduler depends on core usage behavior rather than processor usage, the analytical and simulation models do not apply to that system. They also are not applicable to the UCSD-CANDE system, since the models allow a variable, but fixed, service slice at each priority level, whereas time-slices are dynamically awarded in CANDE's priority queues as a function of parameters generated during past processor usage. The models, therefore, have been particularized to the three remaining time-sharing systems: BBN-TENEX, CCN-TSO and MIT-MULTICS.

4.3.1.
Individual System Results

BBN-TENEX Models--Both the analytic and simulation models were developed for the BBN-TENEX system. The results of these model predictions are plotted in Figure 4.3. The analytic model is valid only for values of system utilization, ρ, less than one, so that since the BBN system saturates under relatively light loads, predictions from the analytic model are possible only for load levels 1-6. The technique used to tune the analytic and simulation models to closely represent the TENEX system was, after setting up the appropriate priority queues and assigning their associated time-slices, to adjust the average service time and average interarrival rate parameters so that the analytical model prediction for the number-crunching benchmark job was as similar to the statistical model prediction as seemed feasible. For TENEX, the best results were obtained for an average service time of twenty seconds. The average interarrival rates were associated with previously defined TENEX load levels as indicated in Table 4.1. As can be observed from Figure 4.3, both the analytic and simulation model plots yield satisfactorily close fits to the statistical model plot. They are also well within the 95 percent confidence interval of the statistical model.

Figure 4.3. Model Comparison - BBN-TENEX (response time in seconds vs. load level)

Table 4.1. Analytical Model Parameters

Load Level | Associated Average Interarrival Rate
           | BBN-TENEX | CCN-TSO | MIT-MULTICS
    1      |    29     |   60    |
    2      |    25     |   60    |
    3      |    24     |   30    |    30
    3.5    |    23     |         |
    4      |           |   20    |
    4.5    |    22     |         |
    5      |           |   20    |    15
    6      |    21     |   15    |    12
    7      |           |   12    |    10
    8      |           |         |     8
    8.5    |           |   10    |
    9      |           |   9.5   |     8
    10     |           |         |    19

CCN-TSO Models--Since the run times needed to obtain comparable response time results from the analytical and simulation models are greater by a factor of approximately ten for the simulation model, and since the analytical model results correspond so closely with those of the statistical model for the CCN-TSO system, only the analytical model was developed in this case.
Model comparison results are presented in Figure 4.4. The average service time for jobs in this system was tuned to seven seconds

Figure 4.4. Model Comparison - CCN-TSO (response time in seconds vs. load level)

and the average interarrival rates are associated with load levels as shown in Table 4.1. The 95 percent confidence interval is a relatively wide one for the TSO statistical model, and the analytical model results fall well within this interval for all load levels.

MIT-MULTICS Models--Because the statistical response time curve for the MIT-MULTICS system was most like that usually associated with time-sharing systems, this system was initially used to develop and validate both the analytical and simulation models. The average service time was tuned to seven seconds, and the average interarrival rate/load level association can again be found in Table 4.1. The models plotted in Figure 4.5 verify that the analytical and simulation models indeed yield very nearly identical results for this well-behaved MULTICS system, and that for all but approximately one load level range (6.5-7.5) the analytical and simulation models fall within the 95 percent confidence interval of the statistical model. This confidence interval is relatively tight, and it is only near system saturation that the two models tend to move unacceptably far away from the statistical model results. This discrepancy is easily explained by the fact that the MIT-MULTICS system deviates from the generalized time-sharing scheduler model in that it has two processors rather than one. The analytical and simulation models, therefore, would approach saturation more quickly than the statistical model, which represents actual two-processor system data.

4.3.2.
Success of Model Generalization

Figure 4.5. Model Comparison - MIT-MULTICS (response time in seconds vs. load level)

The striking success with which the analytical and simulation models were able to describe system behavior for the set of time-sharing systems whose scheduling discipline can be conceptualized by the generalized time-sharing model (Figure 2.2) indicates that these models can be used for more extended predictions of system behavior. Having been validated against the statistical models based on actual system measurements, the analytic and simulation models can now be utilized to predict system behavior for jobs with characteristics similar to the number-crunching benchmark, but with variable processing time requirements. One example of a set of such predictions is shown in Figure 4.6. In this case, the MIT-MULTICS simulation model was used to predict response times for jobs requiring various amounts of processing time, t, as the load level increases. The relative ease with which the analytical and simulation models could be tuned to reproduce the statistical model results for the number-crunching benchmark job indicates that this process could be easily repeated for the other benchmark jobs on the appropriate systems (CCN-TSO and MIT-MULTICS, since only the number-crunching benchmark was run on BBN-TENEX). Thus, the goal of finding a single model capable of describing and predicting response times for time-sharing systems has been accomplished in the case where the time-sharing scheduling discipline depends on quanta fixed at each priority level, but variable across priority levels, and on the past processing history of the job to be serviced.
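The scheduling discipline these models share (a fixed quantum at each priority level, variable across levels, with priority dropping as service accumulates) can be mirrored in a compact discrete-event sketch. The following is a minimal single-server Python analogue, not a transcription of the GPSS program; the quanta and the two illustrative jobs are invented for the example, and arrivals are admitted between service slices (non-preemptively):

```python
def simulate_fb(jobs, quanta):
    """Single-server multilevel feedback queue: level i grants a service slice
    of quanta[i]; an unfinished job drops to level i+1 (capped at the last
    level); service within a level is FIFO. `jobs` is a list of
    (arrival_time, service_time); returns per-job completion times."""
    pending = sorted(jobs)                    # (arrival, service), by arrival
    queues = [[] for _ in quanta]             # FIFO list per priority level
    done = {}
    clock, nxt = 0.0, 0
    while len(done) < len(jobs):
        while nxt < len(pending) and pending[nxt][0] <= clock:
            queues[0].append([pending[nxt][1], nxt]); nxt += 1
        level = next((l for l, q in enumerate(queues) if q), None)
        if level is None:                     # processor idle: jump to next arrival
            clock = pending[nxt][0]; continue
        job = queues[level].pop(0)
        slice_ = min(job[0], quanta[level])
        clock += slice_; job[0] -= slice_
        while nxt < len(pending) and pending[nxt][0] <= clock:
            queues[0].append([pending[nxt][1], nxt]); nxt += 1
        if job[0] > 1e-12:                    # not finished: lower the priority
            queues[min(level + 1, len(quanta) - 1)].append(job)
        else:
            done[job[1]] = clock
    return [done[j] for j in range(len(jobs))]

# a long job arrives first; a short job arriving during its first slice
# overtakes it, as the feedback discipline intends
finish = simulate_fb([(0.0, 1.0), (0.05, 0.1)], quanta=[0.1, 0.2, 0.4, 0.8])
print(finish)   # the short job (index 1) finishes well before the long one
```

Replacing the fixed job list with exponentially distributed interarrival and service times, and tagging selected jobs for measurement, gives the shape of the GPSS experiment described above.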
Although both the analytical and simulation models successfully meet this goal, the analytical model produces its results in approximately one-tenth the time of the simulation model; it may therefore be the more practical model for actual use in cases where response times for load levels beyond the saturation point of the system are not required.

Figure 4.6. Generalized Simulation Model Results (response time in seconds vs. load level, for several values of required processing time t)

4.4. Consideration of Network Queueing Delays

The response time measurements taken on individual systems of the ARPA network for this study did not distinguish the delay due to network transmission and queueing from the delay due to individual system busyness. This dichotomy of delays was considered to be insignificant at the time the measurements were taken, since network traffic was generally light and only a short run command, as opposed to the total benchmark program, was transmitted. Network transmission and queueing delays were estimated at their maximum to be on the order of .1 second in either direction, and as such did not contribute measurably to the individual system response time delays. The question now arises as to the effect of network transmission and queueing delays on comparative system response times, given that network traffic increases by a significant amount in the future. G. D. Cole, in his extensive measurement work on the ARPA network [COL71], develops expressions for the serial transmission delays and the queueing component delays of ARPA network messages. The network delay time as calculated using Cole's expressions can be added to the delay times generated by the individual system response time models to form a composite response time model. The delay caused by physically sending a message on the ARPA network from one node to any other node has two components: the service times at each IMP to store and forward the message, and the actual serial transmission delay.
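The two components combine by simple addition: hops times per-IMP store-and-forward service, plus the propagation delay. The sketch below uses the hop counts and service times tabulated for four of this study's experimental sites (the values are taken from the calculations that follow; the function name is only illustrative):

```python
# hop counts, per-IMP store-and-forward service (msec), and propagation
# delay (msec) for run-command messages originating at the Illinois node
sites = {
    "AMES": (4, 4.0, 20.0),
    "BBN":  (3, 3.4, 10.0),
    "MIT":  (1, 3.4, 10.0),
    "UCSD": (8, 4.0, 20.0),
}

def transmission_msec(hops: int, per_imp_msec: float, prop_msec: float) -> float:
    """Total delay = store-and-forward service at each IMP + serial propagation."""
    return hops * per_imp_msec + prop_msec

for name, params in sites.items():
    print(name, transmission_msec(*params))
# AMES 36.0, BBN 20.2, MIT 13.4, UCSD 52.0
```

Even the largest of these totals is a few hundredths of a second, which is why the component is negligible against response times measured in seconds.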
For this experiment, the run commands were either one- or two-word messages, and their expected store-and-forward service times were 3.4 and 4.0 msec, respectively [COL71, p. 131], at each IMP. The propagation delay is about 10 μsec/mile, resulting in a cross-country delay of approximately 30 msec. Assuming that the University of Illinois node is the one from which all messages originate, and assuming that routing occurs in an environment in which all nodes are connected as shown in Figure 2.1, transmission delays for the run command message can be estimated. Table 4.2 summarizes these calculations. Inspection of the table reveals that even the longest transmission delay, .05 seconds to UCSD, is insignificant when response time measurements are recorded in seconds.

Table 4.2. Transmission Times from Illinois to Experimental Sites

Destination | No. of store & forward transmissions | Expected service time at each IMP (msec) | Total expected IMP service time (msec) | Propagation delay (msec) | Total transmission time (msec)
AMES | 4 | 4.0 | 16.0 | 20.0 | 36.0
BBN  | 3 | 3.4 | 10.2 | 10.0 | 20.2
CCN  | 7 | 3.4 | 22.8 | 20.0 | 42.8
MIT  | 1 | 3.4 |  3.4 | 10.0 | 13.4
UCSD | 8 | 4.0 | 32.0 | 20.0 | 52.0

The queueing component of message delay may be a significant addition to individual response time, however, if the ARPA network becomes congested. Cole's expression for the expected message queueing delay [COL71, p. 134] is

W = (λ_m/2)[(x̄_a)² + (x̄_a + x̄_m)²] / {[1 - λ_m x̄_a][1 - λ_m(x̄_a + x̄_m)]}

with variables defined in the following way:

λ_m - arrival rate of messages into the network.

x̄_a - service time for an ACK, or acknowledgment. Each message is answered by a request for next message, RFNM, which must in turn be answered by an ACK. Therefore, a number of ACKs will be in contention for the service facility along with the messages themselves and, in heavy traffic conditions, will effectively increase each service time by the 3.0 msec that is required to transmit an ACK.
x̄_m - average message service time.

Using the average message service times for the various destinations listed in Table 4.2, and allowing λ_m to increase, the effect of network congestion on comparative response times can now be investigated. Cole defines an alternative system descriptor to λ_m, called T_a, where T_a is the transmission attempt interval, or the time between "attempts" at transmission, since no transmission will occur on a link which is waiting for a RFNM (Request for Next Message) return. Further, if N is the number of active nodes, or "generators" of transmissions, then λ_m = N/T_a. Assuming that 34 nodes are active simultaneously on the ARPA network, the value of T_a for which the expected queueing delay approaches infinity for transmission of a message from a particular node becomes a meaningful basis of comparison between nodes. For example, if messages are being transmitted from the University of Illinois node to one of the five systems investigated in this study, then network queueing delays to each of these systems approach infinity for the values of T_a listed in Table 4.3. From the table, it can be observed that while queueing delays to UCSD from the University of Illinois approach infinity when the network transmission attempt interval is slightly higher than 1 second, transmissions to MIT are not adversely affected until T_a is close to .2 seconds. Not evident from the table information is the fact that network transmission and service speeds on the order of milliseconds cause the queueing delays to be sensitive to changes in transmission attempt intervals on the order of milliseconds. Queueing delays to UCSD, for instance, do not rise above one second until T_a = 1.215 seconds. From that point, congestion quickly increases, so that at T_a = 1.195 seconds saturation occurs. Likewise, at MIT queueing delays rise above one second only at T_a = .218 seconds, and saturation occurs at T_a = .217 seconds.
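The saturation points themselves follow from the denominator of Cole's expression: the delay grows without bound as λ_m(x̄_a + x̄_m) approaches 1, and with λ_m = N/T_a this gives a critical interval T_a* = N(x̄_a + x̄_m). A short check, assuming x̄_a = 3.0 msec, N = 34, and the total IMP service times of Table 4.2, reproduces the tabulated values to within a millisecond or two of rounding:

```python
X_ACK = 3.0      # msec to transmit an ACK (from the text)
N_ACTIVE = 34    # simultaneously active nodes assumed in the text

# total expected IMP service time per run-command message (msec), per site
x_msg = {"AMES-TSS": 16.0, "BBN-TENEX": 10.2, "CCN-TSO": 22.8,
         "MIT-MULTICS": 3.4, "UCSD-CANDE": 32.0}

# T_a at which lambda_m * (x_ack + x_msg) reaches 1, i.e. queueing delay -> infinity
critical = {site: N_ACTIVE * (X_ACK + xm) for site, xm in x_msg.items()}
for site, t_a in critical.items():
    print(site, round(t_a))
# AMES-TSS 646, BBN-TENEX 449, CCN-TSO 877, MIT-MULTICS 218, UCSD-CANDE 1190
```

The closer a site's total message service time, the lower the transmission attempt interval at which queueing delays to it explode, which is exactly the ordering the comparison exploits.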
Table 4.3. Infinite Network Delays from U. of I. Node

System | T_a value at which network queueing delays approach infinity (msec)
AMES-TSS | 645
BBN-TENEX | 448
CCN-TSO | 878
MIT-MULTICS | 217
UCSD-CANDE | 1190

In cases where one system responds faster than another, but where network traffic causes larger queueing delays to the faster system for a given (low) value of T_a, the network queueing delay becomes a significant consideration in system comparison during periods of heavy network usage, and must be included as a part of the predictive response time models.

5. A DYNAMIC RESPONSE TIME MONITOR

The major purpose of this research was to investigate methodologies and models which could be utilized to develop a dynamic response time monitor for ARPA network users. The monitor is to supply on-line, real-time information about the level of busyness, or load level, of each computing node of the network, and also to supply comparative response time data for particular computing applications at each of these nodes. Research results indicate currently feasible features of such a monitor and also suggest additional features that should be implemented.

5.1. Currently Feasible Monitor Features

Evidence is available from the investigation of response time at the five computing nodes included in this study to suggest three immediately implementable monitor features. The first of these is a table of load levels at each node by time of day and day of the week. If ten load levels are defined across the observable load range for all computing nodes, as was described in section 2.2.3, then users could gain a snapshot overview of relative busy times at any one node. This type of information might influence a decision about when to do work on a particular system. An example of a section of such a table has been compiled for the AMES-TSS system, and it is presented in Table 5.1.
The data in the table approximates system behavior during May and June of 1974. For user convenience, the time of day on these tables should be translated to the time zone (EDT, EST, PST, etc.) from which an inquiry is made. Times in the AMES-TSS table correspond to the time framework of a user at the University of Illinois node.

Table 5.1. Load Levels at AMES-TSS

Time        | Sunday | Monday-Friday | Saturday
-8 AM       |  1-2   |      1-2      |   1-2
8-9 AM      |        |      1-2      |
9-10 AM     |        |      1-3      |   1-2
10 AM-2 PM  |        |      5-8      |
2-3 PM      |        |      5-6      |
3-7 PM      |        |      7-9      |
7-8 PM      |        |      3-4      |
8-9 PM      |  1-2   |      3-4      |
9 PM-       |        |      1-2      |

A second feature to be included in a dynamic response time monitor is a descriptive text explaining relevant local factors affecting system response times at each node. In some cases, such explanations are buried in "HELP" files associated with a particular time-sharing system. Also, the Network Information Center of the ARPA network provides a brief explanation of local conditions in its NIC publication No. 18666. These sources of local load information are either incomplete (not available for every node) or out of date, and are not necessarily easily accessible to all potential users of a particular computing node. For example, the NIC "Service Schedule" description for the AMES-TSS system, published in August of 1973, reads as follows:

AMES-67 is available 24 hours per day but severe loading generally restricts access from 0800 to 1700 PST. The weekend schedule varies. Typical load is 30-50 users (including batch). The maximum number of users is regulated dynamically by loading. Network users are not regulated separately. [ANR73b]

This description is accurate, but omits information that may be useful, or at least of interest, to a network user. For instance, the AMES system has developed a "Resource Allocation Scheme" which attempts to guarantee a certain level of service to authorized priority users at various times throughout the day (one group has priority from 8-10 AM, another from 10 AM-noon, and so on).
Because of this, the load measure, P1, rarely goes below .250. When the guaranteed level of service for a particular priority group is being threatened by a heavy load, system access is curtailed for all non-priority users, including the non-priority network user. A further point of interest about the AMES system is that the local user group works a fairly regular 8 AM-5 PM schedule, taking the noon hour for lunch. Thus, the machine is lightly loaded from noon to 1 PM PST.

In addition to the load level tables and the load descriptive text, the dynamic response time monitor must include an inquiry feature by which a network user can obtain actual current comparative response time data for a job to be processed. The inquiry feature would be made up of two interactive modules: the user interface and the predictive mechanism. The user interface would require user input consisting of the set of nodes at which response time is to be calculated and the CPU and I/O processing characteristics of the job to be submitted. The output to this user inquiry would consist of a list of expected response times at each of the indicated nodes, including the current load level at each node. This output data would be generated by the predictive module of the inquiry feature. Prediction, of course, is at the heart of the dynamic response time monitor, and the feasibility of the predictive feature has been verified by this research. For each of the five different time-sharing systems investigated in this study, it was possible to develop a statistical model in all cases, and an analytic and simulation model in most cases, to describe and predict the response time behavior of that system as it processed a limited set of benchmark jobs. The initial indication from the analytic and simulation models is that they can be easily extended to predict response times for more general classes of jobs than the three benchmark applications.
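An outline of how the table feature and the two-module inquiry feature might fit together can be sketched in a few lines. Everything here is hypothetical illustration (the function names, the load-table entries, and the linear response-time stub are invented for the example, not taken from the actual monitor or models):

```python
from typing import Dict, List, Tuple

# illustrative weekday load-level table for one node, keyed by local-hour
# ranges in the table's own time zone; values are (low, high) load levels
LOAD_TABLE: Dict[str, List[Tuple[range, Tuple[int, int]]]] = {
    "AMES-TSS": [(range(0, 9), (1, 2)), (range(9, 10), (1, 3)),
                 (range(10, 14), (5, 8)), (range(14, 15), (5, 6)),
                 (range(15, 19), (7, 9)), (range(19, 21), (3, 4)),
                 (range(21, 24), (1, 2))],
}

def current_load(node: str, hour_local: int, user_utc_offset: int,
                 table_utc_offset: int = -5) -> Tuple[int, int]:
    """User-interface half: translate the inquirer's local hour into the
    table's time zone and look up the expected load-level range."""
    table_hour = (hour_local - user_utc_offset + table_utc_offset) % 24
    for hours, level in LOAD_TABLE[node]:
        if table_hour in hours:
            return level
    raise KeyError(node)

def predicted_response(node: str, load_level: float, cpu_secs: float) -> float:
    """Predictive-module stub: a hypothetical linear model in load level,
    standing in for the statistical/analytic models of chapter 4."""
    return cpu_secs * (1.0 + 0.5 * load_level)

# a user on EDT (UTC-4) inquiring at 11 AM local time sees the 10 AM entry
lo, hi = current_load("AMES-TSS", 11, user_utc_offset=-4)
print((lo, hi))                                           # (5, 8)
print(predicted_response("AMES-TSS", hi, cpu_secs=3.0))   # 15.0
```

A real monitor would replace the stub with the per-system models and report one (response time, load level) pair per queried node.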
Moreover, the systems themselves represented a wide range of time-sharing scheduling implementations, including the unique AMES-TSS table-driven, memory-usage-dominated system. Successful description and prediction of behavior for this wide variety of time-sharing schedulers suggests equal success with other time-sharing systems whose scheduling is any variation on the general time-sharing scheduling algorithm as described in section 2.1.

Results have also been obtained for the network transmission and queueing delays, which add significantly to response time when the network itself becomes congested. Should network usage become such that the network approaches a saturated state, these queueing delays would have to be added to the individual system delay. Although calculations made for this research were done only for very short messages, the same Cole expression can be used when the input or output message length is expected to be greater than that able to be transmitted in one message packet. Thus, an analytic model, able to be used from any network node, is available for prediction of this component of the response time.

5.2. Additional Desirable Monitor Features

Besides the features which have already proven to be immediately implementable components of a dynamic response time monitor, there exist other desirable monitor features which would make utilization of network resources easier for the user. Chief among these is comparative cost information. Some preliminary work done by Peter Alsberg at the University of Illinois Center for Advanced Computation illustrates the difficulties encountered in collecting charging-algorithm data for individual systems on the ARPA network. Some systems have free accounts for network users, and some heavily subsidized systems use charging algorithms that do not reflect their actual expenses.
Further, information is needed on network routing expenses, since if charging were done on a node-by-node basis, some systems which offer a cost advantage as individual entities might lose that advantage due to extensive job routing requirements. Given both comparative response time data and comparative cost data, the dynamic monitor could be extended to appear to the user as a dealer in network services. The monitor would be enabled to indicate the fastest response time possible at the highest cost a user is willing to pay. Thus, the monitor can provide complete time-vs-cost data while not usurping the user's power to finally decide where to run a job.

6. CONCLUSIONS

This research has shown that it is feasible to develop a response time monitor for use in a network computing system that is capable of providing comparative response time information for users with various computing applications to process. System response behavior was measured and modeled using statistical techniques as well as analytical and simulation techniques. The effect of network traffic on response times was also considered.

Analysis of measurements on individual time-sharing systems revealed that it is, in fact, possible to describe and predict response time for these systems using linear and/or nonlinear regression techniques. The need for more uniform measures of "response time" and system "busyness" was particularly evident in this phase of the investigation. While response time could be satisfactorily defined in a uniform, consistent, easily measurable way, a uniform measure of load level or busyness of a system was more elusive. A more satisfactory solution to the busyness dilemma would have been possible if all systems could have been observed with busyness ranging from no users to system saturation. Although the lower bound was observable on all systems, some of the nodes under investigation did not approach saturation during the measurement phase of the research.
Having decided on a definition of load level that was uniform and consistent across all systems, though perhaps not intuitively pleasing, it was possible to compare the response times of the time-sharing systems as they processed given benchmark jobs. The systems could be ranked from fastest to slowest response time for a relatively long (approximately 45 seconds of processing time) CPU-bound job, for a short (approximately 3 seconds of processing time) CPU-bound job, and for an I/O-bound job.

This comparative capability was expanded from these three specific benchmark jobs to a more general class of jobs through the development of a single analytical and a single simulation model. The models were developed to describe and predict the response time behavior of the time-sharing systems involved in the study and were found to be valid system representations for three of the five systems investigated.

The effects of increased network traffic were also studied, and an expression was found to predict this component of response time if and when it becomes significant, that is, when it adds delay on the order of seconds to the response time of any individual system. Currently on the ARPA network, traffic is light, and delays due to network congestion were not significant in the response time measurements.

The successful results of the various areas of investigation described above led to the postulation of the feasibility of a dynamic response time monitor that users could query to obtain current, on-line comparative response time data for a particular computing application run on one of a set of network time-sharing facilities. The contents and structure of such a monitor were discussed.

6.1. Implications for Future Network Development

User-oriented network research requires a commitment to the investigation and development of tools that go beyond mere reliability goals.
If, indeed, the ultimate aim of a computing network is resource sharing, then the human component as well as the technical components of networking must be fully investigated to achieve this goal. This research, a first step toward assisting the user in participating in the vast store of resources available on a network, suggests that a firm commitment on the part of node managers must be made (or required) to maintain and improve such assistance.

The most pressing commitment on the part of node managers, needed to make the implementation of the dynamic response time monitor discussed in section 5 more effective, is the investigation of and agreement upon uniform response time and load measures. Two of the five systems studied (BBN-TENEX and UCSD-CANDE) already automatically generate a consistent response time measure, as defined in section 2.2.4, when a job is run. This information is easily obtainable using a system clock and could very likely be provided by other network systems with a minimum of effort. An acceptable load measure may be more difficult, but not impossible, to implement on all network systems. The BBN-TENEX "load average" measure, which is the ratio of jobs on the ready queue to jobs on the run queue, has proved to yield the least variation when statistical analysis of response time data is performed. It is a highly dynamic measure and a meaningful one in terms of system loading and the users' conception of system busyness. The "load average" measure is, therefore, a prime candidate for a uniform measure of system load on all network systems.

A second commitment required of node managers is to the development and maintenance of descriptive and predictive response time models for their respective nodes. This research has illustrated that such models are possible to generate and can be effectively used.
But a considerable amount of work is involved in fine tuning these models so that they are accurate for various classes of input jobs (CPU bound, I/O bound, etc.) and for variations within and among these classes. Even given that the initial system models may be developed by an outside group, cooperation in model updating from those persons most intimately involved with the system, at least at times of system configuration modifications, is essential to accurate response time prediction.

6.2. Suggested Further Research

There are, of course, many other areas of investigation not directly related to dynamic response time monitors, but aimed directly at assisting users of computer networks, that need to be explored. Some of these areas are comparative job cost; "bidding" scheduling disciplines; a basic, uniform subset of time-sharing system commands available on any network system; and the "black box" approach to scheduling, in which the user views the network as a single powerful system. If we agree that "people use computers," then we have to agree to serve the needs of the computing community.

Direct extensions of this research require the cooperation of all HOST facilities to gather the data required to make the monitor universal to the entire network. Even the extensive measurements collected for the particular five systems investigated in detail are incomplete, in that they do not conclusively guarantee response time predictive capability for all classes of computing applications. A consistent, uniform response time measure and load level measure must be adopted by all network HOSTs. Facilities should be provided for forcing the systems into saturation so that system behavior can be observed under all loading conditions and so that comparisons of systems can be made more conceptually satisfying.
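The uniform load measure advocated in section 6.1, the TENEX-style "load average" defined as the ratio of jobs on the ready queue to jobs on the run queue, can be sketched as follows. This is a modern Python illustration, not the TENEX implementation; the treatment of an empty run queue is an assumption of this sketch, since the source defines only the ratio itself.

```python
def load_average(ready_queue_len, run_queue_len):
    """TENEX-style 'load average': ratio of jobs on the ready queue
    to jobs on the run queue.

    Assumption (not from the thesis): when the run queue is empty,
    the ready-queue length itself is reported, so an idle system
    reports 0 rather than dividing by zero."""
    if run_queue_len == 0:
        return float(ready_queue_len)
    return ready_queue_len / run_queue_len

# Examples: six ready jobs competing for three running slots give a
# load average of 2.0; an idle system gives 0.0.
busy = load_average(6, 3)
idle = load_average(0, 0)
```

Any system able to sample its scheduler queue lengths could report such a value, which is why the measure is proposed as a uniform one across network HOSTs.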
Fine tuning of the basic models developed in this research must be done for various kinds of computing applications, and the models, as well as the tables and descriptions of system loading characteristics, must be continually updated so that they credibly correspond to users' actual experience with a system. A further extension of this research is the investigation of comparative system costs, so that users can balance their response time desires against their budget constraints. A final suggestion for future research, which may be of particular significance in determining the viability of the whole computer networking concept, is to determine the degree to which users at various sites are motivated to exploit the resources at other network nodes, given that the advantages of such activities are made readily apparent to them.

LIST OF REFERENCES

[ABR74] Abrams, M. D., "A New Approach to Performance Evaluation of Computer Networks," Proc. 1974 Symposium COMPUTER NETWORKS: Trends and Applications, pp. 15-20.

[ANR73a] ARPA Network Resources Notebook, NIC 6740, Network Information Center, Stanford Research Institute, Menlo Park, California.

[ANR73b] ARPA Network Resources Notebook, NIC 18666, Network Information Center, Stanford Research Institute, Menlo Park, California.

[BOB72] Bobrow, D. G., et al., "TENEX, a Paged Time Sharing System for the PDP-10," Comm. ACM, Vol. 15, No. 3, March 1972, pp. 135-143.

[COF68] Coffman, E. G., L. Kleinrock, "Feedback Queueing Models for Time-Shared Systems," Journal of the ACM, Vol. 15, No. 4, October 1968, pp. 549-576.

[COL71] Cole, G. D., "Computer Network Measurements: Techniques and Experiments," UCLA-ENG-7165, University of California, October 1971.

[DEN68] Denning, P. J., "The Working Set Model for Program Behavior," Comm. ACM, Vol. 11, No. 5, May 1968, pp. 323-333.

[DOH70] Doherty, W. J., "Scheduling TSS/360 for Responsiveness," Proc. 1970 Fall Joint Computer Conf., Vol. 37, pp.
97-111.

[FAR72] Farber, D., "Data Ring Oriented Computer Networks," Computer Networks, ed. R. Rustin, Prentice-Hall, 1972, pp. 79-93.

[FUC70] Fuchs, E., P. E. Jackson, "Estimates of Random Variables for Certain Computer Communications Traffic Models," Comm. ACM, Vol. 13, No. 12, December 1970, pp. 752-757.

[GRE74] Greenberg, B. S., personal notes, to be published as Multics Program Logic Manual, Order No. AN73, Multics Multiprogramming and Scheduling.

[HER72] Herzog, B., "MERIT Computer Network," Computer Networks, ed. R. Rustin, Prentice-Hall, 1972, pp. 45-48.

[JAC69] Jackson, P. E., C. D. Stubbs, "A Study of Multi-Access Computer Communications," Proc. 1969 Spring Joint Computer Conference, Vol. 34, pp. 491-504.

[KLE70] Kleinrock, L., "Analytic and Simulation Methods in Computer Network Design," Proc. 1970 Spring Joint Computer Conference, Vol. 36, pp. 569-579.

[KLE72] Kleinrock, L., "Survey of Analytical Methods in Queueing Networks," Computer Networks, ed. R. Rustin, Prentice-Hall, 1972, pp. 185-205.

[KNI66] Knight, K. E., "Changes in Computer Performance," Datamation, Vol. 12, No. 9, September 1966, pp. 40-44.

[KNI68] Knight, K. E., "Evolving Computer Performance 1963-1967," Datamation, Vol. 14, No. 6, January 1968, pp. 31-35.

[MAM74] Mamrak, S., "Performance Evaluation in Computer Networks: A Survey," January 1974, submitted for publication.

[MAR73] Maranzano, J. G., "Proposal for a Definition of Response Time," Computer Measurement and Evaluation, selected papers from the SHARE Project, Vol. II, December 1973, pp. 484-496.

[MCQ73] McQuillan, J. M., "Throughput in the ARPA Network -- Analysis and Measurement," Report No. 2491, Bolt, Beranek and Newman, Inc., January 1973.

[MID68] Middleton, J. A., "Least Squares Estimation of Non-Linear Parameters - NLIN," 360D-13.2.003, International Business Machines Corporation, 1968.

[ORG72] Organick, E.
I., The Multics System: An Examination of Its Structure, MIT Press, Cambridge, Massachusetts, 1972.

[ROB70] Roberts, L. G., B. D. Wessler, "Computer Network Development to Achieve Resource Sharing," Proc. 1970 Spring Joint Computer Conference, Vol. 36, pp. 543-549.

[SAL73] Salz, F., Simulation Analysis of a Network Computer, Master of Science Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, June 1973.

[SCH67] Scherr, A. L., An Analysis of Time-Shared Computer Systems, MIT Press, Cambridge, Massachusetts, 1967.

[SCH71] Schriber, T. J., General Purpose Simulation System/360: Introductory Concepts and Case Studies, Ulrich's Books, Inc., Ann Arbor, Michigan, c. 1971.

[TOT65] Totschek, R. A., "An Empirical Investigation into the Behavior of the SDC Time-Sharing System," System Development Corporation, Report SP-2191, AD622003, Santa Monica, CA, 1965.

[WAR73] Ware, G. O., et al., "A Simulation Study of an Information Dissemination Center Network," The University of Georgia, Technical Report UGA/OCA 73-1.

[WHI72] Whitney, V. K., "Comparison of Network Topology Optimization Algorithms," Proc. First International Conference on Computer Communications, Washington, D.C., October 1972, pp. 332-337.

[WOL68] Wolff, R. W., "Time Sharing with Priorities," Operations Research Center, University of California at Berkeley, ORC 68-13, June 1968.

APPENDIX A

Definitions and Abbreviations

AMES-TSS: Time Sharing System created by IBM and run on an IBM 360/67 at the NASA Ames Research Center, Moffett Field, California. This interactive system is characterized by a table-driven process scheduler, in which the frequency and duration of processor time slices awarded to processes are determined by the processes' paging behavior.

BBN-TENEX: A time-sharing system run on a PDP-10 machine at Bolt, Beranek and Newman, Incorporated, in Cambridge, Massachusetts.
The scheduler is characterized by five priority queues and a "balance set" control module, which regulates running processes so as to minimize the probability of an idle CPU due to too-frequent page faults.

CANDE: See UCSD-CANDE below.

CCN-TSO: Time Sharing Option created by IBM and run on an IBM 360/91 at the Campus Computing Network on the University of California campus in Los Angeles. The scheduler is distinguished by its binding of processes to one of a fixed number of virtual machines, within which no multiprogramming occurs.

FIFO: A scheduling discipline in which processes are served in first-in, first-out order.

MIT-MULTICS: A time-sharing system run on a Honeywell 645 at the Massachusetts Institute of Technology in Cambridge. This scheduler is characterized by its concept of a set of "eligibles," which consists of those processes having the highest dispatching priority that can simultaneously exist in core.

MULTICS: See MIT-MULTICS above.

OS/MVT: An IBM operating system in which a multiprogramming environment exists.

OS/VS2: An IBM operating system characterized by its Virtual Storage memory allocation scheme.

Packet Switching: A method for sending transmissions through a communications network in which messages are broken down into smaller "packets" of information to be transmitted separately and reassembled by the receiver.

RR: A scheduling discipline in which processes are scheduled Round Robin; that is, each receives a specified amount of service and is then returned to the end of the service queue if it has not completed execution in the specified time.

Store-and-Forward Network: A computer network in which messages to be transmitted are stored at each node along the transmission path until they are safely received by the next node in the path.

TENEX: See BBN-TENEX above.
Thrashing: A state occurring in paged memory systems in which too many different working sets occupy main memory and each displaces the others' pages in an attempt to have its own pages present.

TSO: See CCN-TSO above.

TSS: See AMES-TSS above.

UCLA: University of California at Los Angeles.

UCSD-CANDE: A time-sharing system run on a Burroughs 6700 machine at the University of California at San Diego. The scheduler is characterized by two priority queues, with a high-priority queue serving burst-oriented processes and a low-priority queue serving compute-bound processes.

APPENDIX B

Benchmark Jobs

B.1. MIT-MULTICS Number Cruncher

      REAL CRL(100,100), DATA(100,100), SUM(100), SD(100), OBS, TSUM, TSUMS
      INTEGER I, J, K, L, M
      DATA L/100/, M/100/
      DO 10 I=1,M
      DO 10 J=1,L
   10 DATA(I,J)=1./(3*I-3+J)
      CALL CORREL(CRL, DATA, SUM, SD, L, M)
      STOP
      END
      SUBROUTINE CORREL(CRL, DATA, SUM, SD, L, M)
      INTEGER L, M, I, J, K
      REAL CRL(M,M), DATA(L,M), SUM(M), SD(M), OBS, TSUM, TSUMS
      OBS=M
      DO 100 I=1,L
      TSUM=0.
      TSUMS=0.
      DO 20 J=1,M
      TSUM=TSUM+DATA(J,I)
   20 TSUMS=TSUMS+DATA(J,I)**2
      SUM(I)=TSUM
      SD(I)=SQRT(TSUMS-TSUM*TSUM/OBS)
  100 CRL(I,I)=1.
      LM1=L-1
      DO 150 I=1,LM1
      IP1=I+1
      DO 150 J=IP1,L
      TSUM=0.
      DO 125 K=1,M
  125 TSUM=TSUM+DATA(K,I)*DATA(K,J)
      CRL(I,J)=(TSUM-SUM(I)*SUM(J)/OBS)/(SD(I)*SD(J))
  150 CRL(J,I)=CRL(I,J)
      RETURN
      END

B.2.
MIT-MULTICS Bit String Manipulator

CONN100: PROC;
DCL (SYSIN, SYSPRINT) FILE;
DCL (FOUND, GOAL, REALITY, LAST) BIT(10201) ALIGNED;
DCL (I, J, ITERATIONS) FIXED BIN;
DCL SEED FIXED BIN(17);
DCL MULTIPLIER FIXED BIN;
MULTIPLIER=457;
SEED=99;
DO I=1 TO 10201 BY 17;
   IF SEED=0 THEN SEED=MULTIPLIER;
   SEED=MOD(SEED*MULTIPLIER,131072);
   SUBSTR(REALITY,I,17)=BIT(SEED);
END;
SUBSTR(REALITY,1,101)=(100)"0"B;
DO I=102 TO 10201 BY 101;
   SUBSTR(REALITY,I,1)="0"B;
END;
GOAL, FOUND, LAST="0"B;
SUBSTR(GOAL,10102,100)=(100)"1"B;
SUBSTR(FOUND,103,100)=SUBSTR(REALITY,103,100);
ITERATIONS=1;
DO WHILE ((FOUND ^= LAST) & ((FOUND&GOAL)="0"B));
   LAST=FOUND;
   ITERATIONS=ITERATIONS+1;
   SUBSTR(FOUND,102) = SUBSTR(REALITY,102) &
      (FOUND | SUBSTR(FOUND,101) | SUBSTR(FOUND,102) |
       SUBSTR(FOUND,103) | SUBSTR(FOUND,203));
END;
END CONN100;

B.3. MIT-MULTICS I/O Bound

FILFLG: PROC;
DECLARE I FIXED BIN(31);
DECLARE (NUMBERRECS INIT(1000), RECLENGTH INIT(250)) FIXED BIN(15),
   FILEIN FILE RECORD,
   FILEOT FILE RECORD,
   1 RECORD ALIGNED,
      2 WORTHLESSTEXT CHAR(250) INIT((250)"X");
OPEN FILE(FILEOT) TITLE("VFILE <- TSTJKM") OUTPUT;
DO I=1 TO NUMBERRECS;
   WRITE FILE(FILEOT) FROM(RECORD);
END;
CLOSE FILE(FILEOT);
OPEN FILE(FILEIN) TITLE("VFILE <- TSTJKM") INPUT;
DO I=1 TO NUMBERRECS;
   READ FILE(FILEIN) INTO(RECORD);
END;
CLOSE FILE(FILEIN);
END FILFLG;

APPENDIX C

Relevant Statistical Data

Comparison of residual mean squares (RMS) for the individual system data curve fits was one of the criteria used to determine a "best fit" to the response time data. Table C.1 contains the RMS for the quadratic, cubic and exponential curve fits for each of the benchmark jobs run. The other criteria used were possibility of fit (does the regression curve indicate a negative response time for some range of the data?) and probability of fit (does the regression curve indicate a higher response time for a lower load level than for a higher one?).
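The RMS comparison used in Table C.1 can be sketched as follows. This is a modern Python illustration using NumPy, not the thesis's actual procedure: the thesis used the IBM NLIN program [MID68] for its nonlinear fits, whereas the exponential fit here is approximated by a log-linear fit, and the synthetic data and function names are assumptions of this sketch.

```python
import numpy as np

def rms_of_fit(load, resp, kind):
    """Residual mean square of one candidate response-time curve.
    'load' is the load-level measure, 'resp' the observed response
    times.  The exponential fit (resp ~ a * exp(b * load)) is a
    log-linear approximation, not the NLIN nonlinear fit."""
    load, resp = np.asarray(load, float), np.asarray(resp, float)
    if kind == "quadratic":
        pred = np.polyval(np.polyfit(load, resp, 2), load)
    elif kind == "cubic":
        pred = np.polyval(np.polyfit(load, resp, 3), load)
    elif kind == "exponential":
        b, log_a = np.polyfit(load, np.log(resp), 1)
        pred = np.exp(log_a) * np.exp(b * load)
    else:
        raise ValueError(kind)
    resid = resp - pred
    # Divide by residual degrees of freedom (n minus fitted parameters).
    dof = len(resp) - {"quadratic": 3, "cubic": 4, "exponential": 2}[kind]
    return float(resid @ resid / dof)

# Synthetic exponential-looking data: the exponential candidate
# should yield the smallest residual mean square.
x = np.linspace(0.1, 5.0, 40)
y = 2.0 * np.exp(0.6 * x)
best = min(("quadratic", "cubic", "exponential"),
           key=lambda k: rms_of_fit(x, y, k))
```

As in the thesis, the lowest RMS alone does not pick the winner; a candidate would still be rejected if it predicted negative response times or response times that decrease with increasing load.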
The final choices for the best fit curve are listed in Table C.2. The ratio of regression sum of squares to total sum of squares given in the table is a measure of how well the regression curve explains the total variation in the data; a ratio of 1.0 would indicate a perfectly fit curve.

Table C.1. Residual Mean Square (RMS) Statistics

Location  Benchmark      RMS Quadratic  RMS Cubic    RMS Exponential
AMES      No. Cruncher   923.54         848.92       788.29
          Bit Manipul.   145.66         150.19       145.95
          I/O Bound      985.42         1029.69      952.04
BBN       No. Cruncher   1.49(10^5)     1.47(10^5)   1.35(10^5)
CCN       No. Cruncher   1217.02        1882.58      1079.43
          Bit Manipul.   839.89         1591.57      835.72
          I/O Bound      6811.62        6725.53      541.88
MIT       No. Cruncher   7.49(10^4)     7.49(10^4)   6.62(10^4)
          Bit Manipul.   11.72          12.35        12.19
          I/O Bound      25.68          23.19        26.43
UCSD      No. Cruncher   1461.07        1542.04      1301.9

Table C.2. Individual System Best Curve Fit Data

Location  Benchmark      RSS/TSS*  Type of Curve for Best Fit
AMES      No. Cruncher   .44       Exponential
          Bit Manipul.   .59       Quadratic
          I/O Bound      .67       Cubic
BBN       No. Cruncher   .87       Exponential
CCN       No. Cruncher   .59       Exponential
          Bit Manipul.   .64       Exponential
          I/O Bound      .53       Exponential
MIT       No. Cruncher   .37       Exponential
          Bit Manipul.   .44       Exponential
          I/O Bound      .59       Exponential
UCSD      No. Cruncher   .81       Exponential

*Regression Sum of Squares / Total Sum of Squares

VITA

Sandra Ann Mamrak was born in Cleveland, Ohio. She received the B.S. degree from Notre Dame College of Ohio in 1967 and subsequently taught in the Cleveland secondary school system for three years. From 1971 to 1975, Ms. Mamrak was employed as a research assistant by the Department of Computer Science and later the Computing Services Office at the University of Illinois, where she was a participant in groups investigating problems in performance evaluation in single and network computer systems. She received the M.S. degree in 1973 and her Ph.D. degree in 1975 from the University of Illinois at Urbana-Champaign.

BIBLIOGRAPHIC DATA SHEET

1. Report No.
UIUCDCS-R-75-722
3. Recipient's Accession No.
4. Title and Subtitle: Comparative Response Times of Time-Sharing Systems on the ARPA Network
5. Report Date: May 1975
7. Author(s): Sandra Ann Mamrak
8. Performing Organization Report No.: UIUCDCS-R-75-722
9. Performing Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
10. Project/Task/Work Unit No.
11. Contract/Grant No.
12. Sponsoring Organization Name and Address: Computing Services Office, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
13. Type of Report & Period Covered: Ph.D. Dissertation
14.
15. Supplementary Notes: Also sponsored by the Advanced Research Projects Agency under contract DAHC04-72-C-0001
16. Abstracts: If, indeed, the ultimate aim of a computing network is resource sharing, then the human component as well as the technical component of networking must be fully investigated to achieve this goal. This research is a first step toward assisting the user in participating in the vast store of resources available on a network. Analytical, simulation and statistical performance evaluation tools are employed to investigate the feasibility of a dynamic response time monitor that is capable of providing comparative response time information for users wishing to process various computing applications at some network computing node. The research clearly reveals that sufficient system data is currently obtainable, at least for the five diverse ARPA network systems studied in detail, to describe and predict response time for network time-sharing systems as it depends on some measure of system busyness or load level.
17. Key Words and Document Analysis. 17a. Descriptors: Response time monitor, computer networks, time-sharing systems, analytic modeling, simulation, ARPA network
17b. Identifiers/Open-Ended Terms
17c. COSATI Field/Group
18. Availability Statement: Release Unlimited
19.
Security Class (This Report): UNCLASSIFIED
20. Security Class (This Page): UNCLASSIFIED
21. No. of Pages
22. Price
FORM NTIS-35 (10-70)  USCOMM-DC 40327