ACKNOWLEDGMENT

The author wishes to express his sincere thanks to the ILLIAC III group (especially the hardware personnel) for providing the equipment on which this system software could be tested. Thanks also go to Professor George Friedman for giving his time to proofread and comment on this paper.

PREFACE

A multiprogramming (semi-) virtual memory operating system has been written for the PDP 8/e mini-computer located in the ILLIAC III room of the University of Illinois. This computer is currently used to control the various picture processing peripherals in the room and to route information to the general purpose IBM 360/75 computer for processing. The following paper describes the usage and design principles of the operating system in coping with this environment.

4. SYSTEM ROUTINES AND TABLES
   4.1 Virtual Memory Routines
       4.1.1 LOAD
       4.1.2 STORE
       4.1.3 GOTO
       4.1.4 CALLS
       4.1.5 CORE MAP
   4.2 Task Management Routines and Control Blocks
       4.2.1 TASK CONTROL BLOCK (TCB)
       4.2.2 DEVICE TABLE ENTRY (DTE)
       4.2.3 EVENT CONTROL BLOCK (ECB)
       4.2.4 QUEUE ELEMENT (QE)
       4.2.5 SERVICE ELEMENT (SE)
       4.2.6 POST Routine
       4.2.7 WAIT Routine
       4.2.8 QUEUE Routine
       4.2.9 NEXTQ Routine
   4.3 Communication Table
   4.4 UTILITIES
       4.4.1 READC Routine
       4.4.2 LPRINT Routine
       4.4.3 WRITEA Routine
       4.4.4 WRITEC Routine
5. EXISTING SYSTEM AND PROBLEM TASKS
   5.1 Keyboard Monitor Task
   5.2 Scanner Task
   5.3 ODT (Octal Debug Technique) Task
   5.4 Text Editor Task
   5.5 Microscope Stage Task
   5.6 Linc-tape Task
   5.7 Display Task
6. IBM 360 INTERFACE
7. HOW TO GET INTO A SPECIFIC FIELD
8. HOW TO USE OS/8 SUPPORT
9. CODING NEW TASKS TO THE SYSTEM
10. REMARKS
BIBLIOGRAPHY
APPENDIX

1. DESIGN GUIDELINES

Since the operating system was designed with the particular picture processing environment in mind, a number of design guidelines were observed to ensure convenience in use and reasonably high system performance:

(a) The system must provide easy methods for on-line interactive communication with the IBM 360 (or any other computer of the same class). A large amount of computation power is needed to process pictures, and the processing is not likely to be done entirely on the mini-computer PDP 8/e.
Therefore, information must be transmitted to more powerful computers for a major part of the processing. Also, since pictures are digitized and displayed at the ILLIAC III site only, the communication must be real time and interactive, so as to satisfy the needs of many such algorithms.

(b) The user must be able to supervise and, if necessary, interrupt the processing of his program on the IBM 360. Most picture processing programs are still experimental. They are far from "bug free." Allowing the user to interrupt his program can save unnecessary computer time and make debugging easier. Ways should also be provided to communicate with the IBM 360 machine room operator for specific run time instructions.

(c) Facilities should be provided to output a majority of the printouts of the IBM 360 runs at the ILLIAC III site. This is the only way a user can determine the outcome of the experiment on a particular algorithm so as to decide on the next phase of the experiment.

(d) Device independent I/O should be provided for programs to use (if they want to). The I/O devices should also be assignable at run time to provide maximum flexibility.

(e) The system should provide ways to do part of the picture processing on site. Most picture processing algorithms require a lot of storage. Methods should be provided to swap the contents of primary memory and secondary memory.

(f) Some form of on-line debug and error (I/O or program) recovery should be provided.

(g) The system should be easy to maintain.

(h) The system performance should be reasonably high, i.e., at least an order of magnitude higher than a mono-programmed system (e.g., OS/8).

2. HARDWARE CONFIGURATION

2.1 PDP 8/e

The PDP 8/e is a 12 bit word minicomputer. (I.e., all instructions and data are 12 bits long.) It has an accumulator and several status registers (to record hardware states). It does not have any index register or base register. All arithmetic is done with this accumulator.
The accumulator (which is itself 12 bits long) is accompanied by a one bit register, the "link," which acts as an additional most significant bit to form a 13 bit entity and so enables multiple precision arithmetic. There are separate commands to manipulate the "link" and the accumulator.

The PDP 8 instructions are also 12 bits long. The most significant 3 bits are used as the operation code, so there is a maximum of 8 instructions. Only 6 of these are memory reference instructions. For these 6 instructions, the next 2 bits are address modification bits known as the "indirect addressing bit" and the "current page bit" respectively. The remaining 7 least significant bits are known as the displacement, which can therefore take 128 distinct values (0-127). In other words, a memory reference instruction can address only 128 words of a page directly.

The PDP 8 memory is segmented into 128 word pages: page 0 covers locations 0-127, page 1 covers locations 128-255, and so on. The "current page bit" determines which page the displacement is referencing. If the "current page bit" is 0, page 0 is referenced. If the "current page bit" is 1, the page in which the memory reference instruction resides is referenced. So every instruction can reference its own page or page zero directly. Page zero can therefore be used to store commonly used variables.

If a PDP 8 instruction wants to reference an address not in page zero or its own page, indirect addressing must be used (indicated by 1 for the "indirect addressing bit"). In this case, the contents of the computed address are used not as data but as a 12 bit effective address. Thus, using indirect addressing, an instruction can effectively access locations 0-4095. This 4K of core is known as a "field."

Most PDP 8's have more than 1 field. In this case, 2 instructions are provided to switch between fields. A "CHANGE DATA FIELD n" instruction specifies that all later data references are to be from a particular field.
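
The addressing rules above can be illustrated with a small sketch. It is written in Python purely for illustration; the function names and the dictionary standing in for one 4K field are hypothetical and do not come from the system described here.

```python
def decode_memory_reference(instruction: int, instruction_address: int) -> dict:
    """Split a 12-bit memory reference instruction into the fields described above."""
    assert 0 <= instruction < 0o10000 and 0 <= instruction_address < 0o10000
    opcode = (instruction >> 9) & 0o7        # 3 most significant bits
    indirect = (instruction >> 8) & 1        # "indirect addressing bit"
    current_page = (instruction >> 7) & 1    # "current page bit"
    displacement = instruction & 0o177       # 7 least significant bits (0-127)
    # Base of the referenced page: page 0, or the 128-word page that
    # holds the instruction itself.
    page_base = (instruction_address & 0o7600) if current_page else 0
    return {
        "opcode": opcode,
        "indirect": indirect,
        "current_page": current_page,
        "displacement": displacement,
        "direct_address": page_base | displacement,
    }


def effective_address(instruction: int, instruction_address: int, memory: dict) -> int:
    """Return the address finally referenced; `memory` models one field (hypothetical)."""
    fields = decode_memory_reference(instruction, instruction_address)
    address = fields["direct_address"]
    if fields["indirect"]:
        # The word at the computed address is itself a 12-bit address.
        address = memory[address] & 0o7777
    return address
```

For example, an instruction word 1377 (octal) stored at location 0400 has the current page bit set and displacement 177, so it refers directly to location 0577 in its own page.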
The "CHANGE INSTRUCTION FIELD n" instruction is provided to transfer control to a program in another field. The particular PDP 8/e at the ILLIAC III site has 2 fields. The PDP 8/e instructions do not facilitate re-entrant or recursive programming (due to the lack of a base register).

The remaining 2 non-memory reference instructions are known as IOT and OPR. The OPR instruction performs various shifts, tests, etc. with the accumulator and the link. The IOT instruction transmits information to the peripherals. There are basically 2 kinds of transmission, depending on the nature of the device. Most devices have a word oriented transmission. One IOT is provided to test the device ready flag. A second IOT is provided to transmit information to or from the device, and a third IOT is provided to clear the device ready flag so that it can be set again when the device has finished transmitting the word. Some devices use a block oriented transmission with a cycle-stealing data-break controller. In this case, IOT's are provided to specify the buffer address, the buffer field, the word count, etc., and the controller performs the transmission while the CPU can be executing another program. An IOT is provided to test for the completion of the transmission.

The PDP 8 has rather primitive interrupt hardware. If an interrupting device is ready and the interrupt status bit is enabled, the computer is interrupted by storing the address of the next instruction in location 0 and proceeding to execute the instruction at location 1 (which presumably is the entry point of the interrupt handler). There is no indication as to the source of the interrupt. The interrupt handler must test the ready flag of each of the devices in sequence to find out which one is interrupting.

2.2 Peripherals

A variety of peripherals are connected to the PDP 8/e (see Figure 1):

(a) There are 2 full-duplex teletypes. These are word oriented devices and will generate an interrupt when either the keyboard or the typer is ready.
(b) An interval timer with variable time base and time interval is present. The timer can interrupt the CPU when the time interval has elapsed.

[FIGURE 1. OPERABLE HARDWARE. Block diagram of the DEC PDP-8/e subsystem (two teletypes, Linc-tape controller and drives, line printer, handwheels, interval timer, 8K words of memory, high speed paper tape reader), the IBM 360/75 subsystem (2701 parallel data adapter), the scanner-monitor-video (S-M-V) subsystem, and the video subsystem. Paths carrying data and/or control are distinguished from control-only paths.]

(c) Two rotating knobs known as hand-wheels are located at the picture display monitor. A program can sense the angle of rotation of these wheels and use the information to control the positioning of pictures. The hand-wheels can interrupt the PDP 8 when being rotated.

(d) Motors are provided to move slides on a microscope stage under program control. The program specifies the distance to be moved in each direction and the slide is positioned as directed. Upon completion, the stage controller can interrupt the PDP 8 when needed.

(e) A line printer is provided to print out information at 80 characters a second. Due to the particular design of the printer, it must be able to get 3 characters within 25 milliseconds after the ready flag is raised, or it will generate a carriage return and ignore the rest of the line. The line printer does not generate an interrupt when ready.

(f) A high speed paper tape reader which inputs 300 characters a second is also connected to the PDP 8. It also does not generate an interrupt when ready.

(g) A Linc-tape controller is connected, which controls 2 Linc-tape drives.
These Linc-tapes are preformatted tapes which contain a large number of records known as blocks. The design of the controller is such that the PDP 8 can request the controller to start an I/O operation on any specified block, and the controller will search for that block automatically. So it appears to the CPU to be a random access device, though a lot of time is used in rewinding and winding the tape to the desired block. The PDP 8 must read or write within 122 microseconds of the ready flag being raised, or a program timing error flag will be raised. The Linc-tape controller does not interrupt the PDP 8 when ready for transmission.

(h) Two Fabritek core memory boxes, each having 16K 72 bit words, are connected to an Exchange net which is connected to an Exchange net interface. This interface is a block-transfer cycle-stealing data break device. The PDP 8 can be working on another program during the transfer. The interface does not interrupt the PDP 8 upon completion.

(i) A scanner-monitor is connected to the system to digitize and to display pictures. Upon receiving an activate signal, the scanner-monitor obtains its parameters from one of the Fabritek core boxes and starts the scanning process. The digitized information is transmitted to the Fabritek core boxes directly by the scanner. The scanner does not interrupt the PDP 8 upon completion.

(j) An IBM 360 interface is connected to the system to provide direct communication with the general purpose computer. This is also a block-transfer cycle-stealing data break device. It does not interrupt the PDP 8 upon completion.

(k) A large format scanner which makes use of information from a TV camera and converts it into digitized grey levels is being completed. There is no definite information as to the nature of this device.

(l) The pattern articulating unit of the ILLIAC III system is also being completed. A very limited amount of information is available at this time.
It is conceivable that the above 2 devices will be added to the system in the near future.

3. DESIGN DECISIONS

The design guidelines of the system require it to be quite versatile and fully interruptible (by the user) for a large number of functions. It is conceivable that the only effective system is one that performs these functions (e.g., communication with the IBM 360, printing output, etc.) concurrently. With the limited capability of the hardware (e.g., no protection feature, no support for re-entrant programming), the easiest (and most storage saving) implementation is a multi-programming system. These concurrent functions can then be implemented by independent "processes" or "TASKS." The bookkeeping of TASK switching can be left to the operating system.

In order to provide enough storage for this large number of "tasks," some swapping of the contents of primary memory and secondary memory (Fabritek core boxes) must be done. Because the PDP 8/e does not have address remapping (translation) hardware, a true Virtual Memory System cannot be implemented. However, the problem "task" can explicitly specify to the system when a new page is needed. This is known as a semi-virtual memory system.

3.1 Semi-Virtual Memory System

The first thing to decide is the size of the "page" or "segment" that is to be used as the basic unit of swapping. This "segment" should be small, so that little unneeded code or data is swapped along with it. However, the segmenting of primary memory must also be transparent to the programs or data being swapped. That is, for most parts of the programs being swapped, the actual physical locations they occupy should be insignificant. Without address remapping hardware, the only way this can be true is to use a memory "field" as a segment.
It is insignificant which "field" a PDP 8 program is loaded in, provided that the program is self-contained in one field and does not contain any "CHANGE DATA FIELD n" or "CHANGE INSTRUCTION FIELD n" instructions. The change of "field" must now be done via calls to operating system routines. Since the particular PDP 8/e has 2 "fields," this scheme leaves us with 2 large "segments" (or pages) with a maximum size of 4096 words.

Not all 4096 words in a "field" can be allotted to a "segment," since some storage must be left for the system to reside in. It would also be more efficient to leave a small amount of storage in which commonly used variables can reside in core all the time. Judging from the size of the system and its tables, this resident (or non-swapping) region of core must be at least 1K words in each "field." The question now arises as to whether the lowermost 1K or the uppermost 1K of a "field" should be resident. Since page zero (i.e., locations 0-127) is heavily used in most problem programs to store temporary variables, it is best to leave it in the segment (or swap area). So the system should reside in high core locations, i.e., locations 6000-7777 (octal) inclusive. Also, locations 0-2 inclusive must be reserved for the system for interrupt handling. That leaves the segment (or swap area) as locations 3-5777 (octal) inclusive, a total of 3K - 3 words. Locations 16120-16337 (octal) (field 1) inclusive are the resident portion of core that is usable by any problem task.

Just as the primary memory has to be separated into "segments," the secondary memory (Fabritek core boxes) also has to be segmented. The 3K - 3 PDP 8 words (3069 words of 12 bits) amount to 511.5 Fabritek memory words of 72 bits. For convenience, the secondary memory segment is 512 Fabritek core words, with 0.5 word not accessible in every segment. There are a total of 32 segments in each 16K word core box. Unfortunately, the hardware addresses of the two Fabritek core boxes may not be contiguous.
So the segment numbers to access the two core boxes need not be contiguous either. But this hardware dependency can be taken care of easily in any problem task, especially since most tasks use only a small fraction of a Fabritek core box.

Because these Fabritek core boxes need not be used solely for swapping programs (e.g., they are used by the scanner to digitize pictures), and because both of them may not be working perfectly at the same time, it is convenient to be able to assign the core boxes at each system loading time. A virtual memory origin (displacement) is included in the system which allows the user to specify which octant (1/8) of the core box is to be used as segment 0. Also included in the system table are the number of the segment in field 0 and that of the segment in field 1, both accessible by any problem task.

Since the PDP 8 hardware includes data break devices which can be transmitting data while the CPU is executing another task, it is necessary to prevent the field which is used for transmission from being swapped out or disturbed in any way. Two entries in the system table provide this feature. Either of the 2 fields can be "locked" (or prevented from being disturbed) by a proper entry in the table. Only one of the 2 fields can be "locked" at a time, since the system requires at least one field to function properly. If both fields need to be "locked," or in other words if no other task may be swapped in, it is only necessary for the problem task to disable the interrupt.

The algorithm of the swapping program is as follows. It first checks to see if either of the fields is "locked." If so, it swaps the other field. If both fields are locked, the algorithm ignores the lock bit and swaps out field 1 anyway. If neither of the fields is locked, the algorithm keeps the field that the last problem task resided in, and swaps the other field.
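
The field selection rule just described can be sketched as follows. This is only an illustrative model in Python; the function name, the tuple of lock flags, and the `last_task_field` argument are hypothetical stand-ins for the system table entries.

```python
from typing import Tuple


def choose_field_to_swap(locked: Tuple[bool, bool], last_task_field: int) -> int:
    """Return the field (0 or 1) whose contents are to be swapped out.

    locked[n] is True when field n is "locked"; last_task_field is the
    field in which the last problem task resided.
    """
    if locked[0] and locked[1]:
        return 1                    # both locked: lock ignored, field 1 goes
    if locked[0]:
        return 1                    # field 0 is protected, swap field 1
    if locked[1]:
        return 0                    # field 1 is protected, swap field 0
    return 1 - last_task_field      # keep the last task's field, swap the other
```

With only two fields, keeping the field of the last problem task amounts to always evicting the less recently used one.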
It is believed that this scheme, which swaps out the less recently used field, can reduce the number of page faults.

Although the transmission between the PDP 8 and Fabritek core is done by a data break device, so that the CPU could be doing something else in the meantime, this feature is not used, for simplicity of implementation. The convention is that the program (system or problem) that initiates the transfer stays non-interruptible until the transfer is completed. Some experimentation on the particular PDP 8/e gives a one-way transfer time of 14 milliseconds for each segment. So a complete segment swap requires 28 to 30 milliseconds. This is longer than the 25 milliseconds within which the line printer must be able to receive 3 characters. Thus, in between checking completion flags for the core transfer, the system must also transmit characters to the line printer. This is the only function that is concurrent with the core transfer.

It is also possible that an I/O error occurs during the transfer. In this case, the transfer is restarted. However, the I/O error can be permanent, in which case the user should be notified. Since no system or problem task is active during a segment swap, the only way to signal the error is via the display lights. In this case, the contents of the console switch register are loaded into the MQ register, which can be displayed on the console lights of the PDP 8/e. Probably the first thing a user notices when such a permanent I/O error occurs is that the system fails to respond to any input. A further look at the display lights can confirm this assumption. The system can be told to "ignore" this I/O error if all the keys in the switch register are set to zero. However, due to the unknown nature of the I/O error, unpredictable results may occur.

3.2 Task Management

In a multi-programming environment, the system runs a number of asynchronous "processes" or "tasks" simultaneously.
Each of these "tasks" can look after a simple function of the system, so that as a whole all these functions seem to be performed simultaneously. At any particular time, each of these tasks can be in one of 3 states. It can be actively executing, or in other words, the CPU can be allocated to this task. It can be in the ready state, waiting for the CPU to be allocated to it. It can also be in the wait state, in which it cannot be run until some "event" (e.g., completion of an I/O transfer) has completed. In this last case, the "task" is said to be non-dispatchable (in contrast to the former two cases, in which the task is dispatchable).

At any time, it is likely that there is more than one dispatchable "task." To decide which of these "tasks" should be run, a priority is given to each of the "tasks." In a simple multi-programming system (such as the one being used), the highest priority dispatchable task is run until either it blocks itself to wait for an "event," or another higher priority task becomes dispatchable. This approach is considerably simpler (and more storage saving) to implement than a time sharing system, which also has to worry about execution time quanta.

In order that the system can switch from one dispatchable task to another effectively, some task status information must be saved. This set of information is saved in contiguous locations grouped into a "TASK CONTROL BLOCK" or "TCB." The system has a "TASK TABLE" which contains all these TCB's. The TCB has to contain the address of the next instruction to be executed when the task becomes active again. This consists of 2 PDP 8 words, namely the segment number and the address within the segment. The TCB must also contain the last value of the accumulator. The TCB must also contain information on the dispatchability of the task. In this case, a wait count is used. This is a positive number which represents the minimum number of "events" that must be completed before the task is dispatchable.
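
As a rough illustration of the TCB contents and of task selection, the sketch below models a task table in Python. The class and field names are hypothetical, and two points are taken as given for the sketch: a wait count of zero marks a dispatchable task, and priority is the task's position in the task table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskControlBlock:
    segment: int          # segment number of the next instruction
    address: int          # address of the next instruction within the segment
    accumulator: int = 0  # last value of the accumulator
    wait_count: int = 0   # events still outstanding; 0 means dispatchable


def select_task(task_table: List[TaskControlBlock]) -> Optional[int]:
    """Return the index of the highest priority dispatchable task, if any.

    Priority is modelled as position in the table: earlier entries win.
    """
    for index, tcb in enumerate(task_table):
        if tcb.wait_count == 0:
            return index
    return None  # every task is waiting for some event
```

A sequential scan of this kind needs no stored priority number, which is consistent with the emphasis on saving storage.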
A dispatchable task has a wait count of zero.

Besides these, a task has various other kinds of status. For example, it might be useful to save the "link," the status register of the PDP 8, and the priority number of the task. However, for the purpose of saving storage, none of these three is saved. It is found that programs very seldom use the "link" or the status register, so it is not justifiable to allocate valuable system storage space for them. It is possible to use the "link" in a problem task, provided that the interrupt is disabled beforehand. A straight priority system is used for the same reason. In selecting a task to run, the system goes through the task table sequentially, so the task that is earliest in the task table has the highest priority. As a consequence, it is generally not possible to change task priority dynamically.

Though the tasks in the system are asynchronous, it is sometimes necessary for them to synchronize with each other. There are generally 2 kinds of task synchronization. A task may want to make sure that an event has occurred before proceeding. For example, a picture processing task may want to make sure that the monitor-scanner has finished digitizing the picture before working on it. On the other hand, several tasks may be contending to use a non-reentrant resource. For example, several tasks may want to access information on tape, but only one of them can use it at a time.

Two system routines have been implemented to take care of synchronization with events. They both use 2 PDP 8 words of storage as an "EVENT CONTROL BLOCK" or "ECB" to keep track of the event. The most significant 2 bits of the first word of the ECB are known as the "wait bit" and the "post bit," respectively. If a task wants to ensure that an event has occurred before proceeding (that is, to wait for the event), it can call the system routine WAIT, passing it the address of the ECB designated for that event.
If the "post bit" of the ECB is zero, the calling task is marked non-dispatchable after the "wait bit" has been set in the ECB and the address of the TCB has been placed in the second word. The task becomes dispatchable again when another task calls the system routine POST, passing it the address of the same ECB and a 2 word completion code. The post bit is then set, the wait bit is cleared, and the 2 word completion code is placed in the ECB for the awakened task to look at. A task can wait for a number of events at the same time by passing a wait count and the addresses of all the ECB's to the system routine WAIT. This provides maximum flexibility in programming.

The method of handling contention is quite different from that of most systems. Instead of allowing the contending tasks to execute the code of the non-reentrant section (or the critical section) directly, an additional task known as the critical section handler is used. This critical section handler is the only task that can touch the critical section. All tasks that want to use the critical section must request service from the handler. This request is made via a system routine QUEUE, which is given the address of the parameter list to be passed to the handler. A 6 word QUEUE ELEMENT or QE must also be given to the system to keep track of the request. The last 2 words of the QE comprise the completion ECB, which is posted when the service is completed. (A STACK is actually used for simplicity in implementation.*) The requesting task can proceed to execute asynchronously with the handler, or WAIT for this completion ECB. Each critical section handler has a 6 word SERVICE ELEMENT or SE. The last 2 words comprise the request ECB. This ECB is POSTed with the address of the parameter list as the completion code whenever a request comes. The handler can WAIT for this ECB when a new request is needed.
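
The WAIT and POST protocol described above can be sketched as a small Python model. It is not the system's implementation: the class names are hypothetical, the ECB is modelled with separate fields instead of packed 12 bit words, and the handling of a wait count over several ECBs (one decrement per POST) is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Task:
    name: str
    wait_count: int = 0  # 0 means dispatchable


@dataclass
class EventControlBlock:
    wait_bit: bool = False
    post_bit: bool = False
    waiter: Optional[Task] = None             # stands in for the TCB address
    completion_code: Tuple[int, int] = (0, 0)  # the 2 word completion code


def wait(task: Task, ecbs: List[EventControlBlock], count: int = 1) -> None:
    """Mark `task` non-dispatchable until `count` of the events have been posted."""
    pending = [ecb for ecb in ecbs if not ecb.post_bit]
    already_posted = len(ecbs) - len(pending)
    task.wait_count = max(0, count - already_posted)
    if task.wait_count:
        for ecb in pending:
            ecb.wait_bit = True
            ecb.waiter = task


def post(ecb: EventControlBlock, completion_code: Tuple[int, int]) -> None:
    """Record completion of the event and wake the waiting task, if there is one."""
    ecb.post_bit = True
    ecb.completion_code = completion_code
    if ecb.wait_bit:
        ecb.wait_bit = False
        if ecb.waiter.wait_count > 0:
            ecb.waiter.wait_count -= 1
        ecb.waiter = None
```

In this model a task that WAITs on an ECB that has already been POSTed simply continues, which matches the rule that only a zero post bit blocks the caller.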
When finished servicing the request, the handler calls the system routine NEXTQ, which POSTs the completion ECB for the request and checks for the next request element in the request queue. If another one is present, the request ECB is POSTed with the new parameter list. Otherwise, the request ECB is cleared.

Although this contention approach results in a slightly larger number of tasks in the system, it has considerably greater flexibility. The handler can perform a variety of functions in between requests (e.g., open and close files). The critical section also need not use only subroutine returns; it can use coroutine returns too. It can even be written to make use of the physical characteristics of the devices. For example, a tape handler can process requests that require minimum tape motion first.

*A STACK implementation saves one swapping and simplifies coding. Since actual contentions rarely happen, this method is preferred to the more sophisticated QUEUE in light of its simplicity.

3.3 Peripheral Management

There are basically 2 kinds of peripherals: the interrupt generating ones and the non-interrupt generating ones. They are handled differently in the system.

For interrupt generating peripherals, when an interrupt occurs, it is necessary to check the ready flags of each of the devices sequentially in order to find out which one is causing the interrupt. Instead of placing this flag checking code in the system directly, a device table approach is used to provide maximum flexibility and adaptability of the system to other hardware configurations. The system contains a DEVICE TABLE in which each interrupt generating device is represented by a DEVICE ENTRY. A DEVICE ENTRY is 4 words long: the first two consist of the IOT instructions for "SKIP IF READY FLAG" and "CLEAR READY FLAG" respectively. The last 2 words are the device ready ECB, which can be WAITed for via a system routine call.
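
A table driven interrupt handler of this kind can be sketched in Python as below. The names are hypothetical: the two callables stand in for the two IOT instructions of a DEVICE ENTRY, and a boolean stands in for the device ready ECB. The sketch also assumes that the handler scans the whole table on every interrupt; the text does not say whether it stops at the first ready device.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DeviceEntry:
    name: str
    skip_if_ready: Callable[[], bool]  # models the "SKIP IF READY FLAG" IOT
    clear_ready: Callable[[], None]    # models the "CLEAR READY FLAG" IOT
    ready_posted: bool = False         # models the 2 word device ready ECB


def handle_interrupt(device_table: List[DeviceEntry]) -> List[str]:
    """Poll every entry in table order and post the ECB of each ready device."""
    serviced = []
    for entry in device_table:
        if entry.skip_if_ready():
            entry.clear_ready()        # let the device raise its flag again
            entry.ready_posted = True  # a task WAITing on the ECB would wake
            serviced.append(entry.name)
    return serviced
```

Because the handler itself contains no device specific code, adding a device to such a scheme only means adding one entry to the table.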
Since the interrupt handler checks the devices sequentially, it is most efficient to place frequently interrupting devices in the earlier entries of the table.

For non-interrupt generating devices, the approach is to check their ready flags once every short time interval. A system task which makes use of the interval timer performs this function. The task sets the timer to interrupt every 10 milliseconds (and thus make the task dispatchable). When the timer expires, this system task checks each of these devices and POSTs their device ready ECBs accordingly.

The timer task also provides some crude elapsed time service to other problem tasks. It has a timer SERVICE ELEMENT which can be QUEUEd by any task. At the conclusion of 1 second of elapsed time, this timer QUEUE is cleared by repeatedly calling NEXTQ. Requesting tasks can check the completion ECB of their QE to find out this elapsed time. (Thus the error in elapsed time is on the order of 1 second.) The timer task also keeps a 1 second counter (2 words long, using multiple precision arithmetic) which contains the elapsed time in seconds since the system was started. This counter is accessible to any problem task.

Of all the non-interrupt generating devices, the line printer is the only one that is handled differently. This is because it has the additional requirement that it must receive service every 25 milliseconds. This makes it necessary for the line printer ready flag to be checked by both the timer and the virtual memory swap routines. The printer is taken care of by a subroutine which directly receives REQUESTS for printing and services them. No device ready ECB is POSTed. Some flags are used in this routine to record the status of the SERVICE (e.g., printer start and stop flag).

To help system updating and maintenance (without reassembling all problem tasks), all non-interrupt generating device ready ECBs and various other items of system information (e.g., second counter, timer SE, etc.)
are placed together in a COMMUNICATION TABLE of the system. Any problem task can refer to this table and get the system services needed.

3.4 Device Independent (Assignable) I/O

It is often useful to provide device independent I/O in a system, either for the sake of convenience in use or to allow a device to be placed off-line for service. The particular system we have has 2 teletypes (at different locations in the room). It is conceivable that the user finds it convenient to use a program at one of the teletypes some of the time and at the other one at other times. Therefore, it is necessary to accept keyboard input from either teletype and route output to either teletype, as desired.

When input can come from either teletype, the question immediately arises of what will happen when keys are struck on both keyboards at the same time. The logical solution is to put the 2 characters in a QUEUE and give them to the program one at a time, as if they came from the same source in that order. Thus the user can enter part of a command at one teletype, walk over to the other one, and finish the command without affecting the program (task). Thus any task that requires input from the keyboard has an SE. When a key is struck, QUEUE is called with the character entered in place of the address of the parameter list.

A number of tasks may be requesting keyboard input at the same time. In order to decide which task to talk to first, 2 keyboard system tasks are used, one for each teletype. These tasks have 2 modes. In the command mode, simple on-line debug commands are accepted. One of the commands allows the user to enter the address of the SE to talk to. The keyboard then switches to routing mode, in which all the keys are routed to that particular SE. The user may switch back to command mode at any time and choose to talk to another SE.

The typing of output on a teletype has a similar problem. Several tasks may want to output to the teletype at the same time.
In order to prevent contention, 2 system tasks are used to handle teletype output, one for each typer. Each of these typer tasks has an SE which can be QUEUEd by tasks requesting output. It was decided to treat the typers as character oriented devices. That is, a task requests service from the typer task independently for each character. So if two tasks want to type at the same time, their characters may intermix. The reason for this decision is that teletypes are full-duplex devices. The task that receives keyboard input must route it out to the typer again to echo it back to the user. Therefore, it is necessary for the typer to type the character as soon as possible, without waiting for a line to finish. (The keyboard task itself cannot do the echoing, since otherwise the user has no way of knowing that the task he talks to has accepted the character. The typer tasks also cannot lock onto one task until a line is finished, since otherwise the user may leave the typer in the middle of a line and thus lock the system for a long time. It is up to the user to prevent this intermixing of characters by limiting the number of dispatchable tasks.)

The line printer, on the other hand, is a line oriented device. The output task collects a line (maximum 127 characters, terminated by a "return" character) before requesting service via the line printer SE. The address of the line buffer is passed by the QUEUE routine. The line printer routine reserves a buffer in the resident portion of the system. The line is copied into it so as to minimize swapping.

These input and output routines for each task are standardized via a system supplied assembly source file. This file is to be assembled together with the problem tasks. In this assembly several routines are provided. The READC routine reads from the keyboard via an SE. WRITE1, WRITE2, and LPRINT output characters to typer 1, typer 2, and the line printer respectively. WRITEA outputs to all three devices at the same time.
WRITEC provides assignable I/O. A word in the assembly enables the user to route output to any of the above 4 routines. This word is modifiable (assignable) via command from either keyboard.

So as a whole, all outputs are done via QUEUE whereas all inputs are done via SE. This is a very flexible method, as the output from one task is directly ("plug to plug") compatible with the input to another task. Thus it is possible to send commands to tasks from the IBM 360 or from a file instead of from the keyboards alone.

4. SYSTEM ROUTINES AND TABLES

The system can be separated into 3 parts from the user's point of view. Direct task synchronization and virtual memory management functions are taken care of by a set of system routines callable by any problem task. These routines are non-interruptible, and thus interrupt must be disabled before calling. They reside permanently in core and are completely transparent to the "field" where the call originates.

Peripherals and timer services, on the other hand, are handled by a number of system tasks. These tasks either pick up parameters from fixed core locations or accept input via SE's to perform the desired I/O on the peripherals. Most of the parameters needed to use these system tasks are grouped together in the COMMUNICATION TABLE.

The third part of the system consists of a set of assembly source subroutines to be assembled together with problem tasks. They contain address constants, subroutines to use different features of the system conveniently, and loader code to bootstrap the problem task into core.

The system uses a lot of tables and control blocks to provide the various services. The formats for these tables are fixed so that problem tasks are unaffected when a new release of the system comes up. The following section will describe these routines and tables.

4.1 Virtual Memory Routines

4.1.1 LOAD

To load the content of a virtual address (segment + location) into the accumulator.
If the segment is not already in core, it will be brought into core. LOAD can also be used to access locations not in the swap area. In this case, any even segment number refers to field 0 and any odd segment number refers to field 1. (Approx. 100 instr. or 0.3 ms. without swap)

USAGE:
    6002            /disable interrupt before call
    JMS I (LOAD     /call LOAD routine
    SEGMN;LOC       /virtual memory address
                    /interrupt remains disabled upon return
    6001            /enable interrupt again

4.1.2 STORE

To store the contents of the accumulator in the virtual address. The accumulator is cleared at return. Conventions for addresses in a non-swap area are the same as for LOAD. (Approx. 100 instr. or 0.3 ms. without swap)

USAGE:
    6002            /disable interrupt before call
    JMS I (STORE    /call STORE routine
    SEGMN;LOC       /virtual memory address
                    /interrupt remains disabled; acc. cleared
    6001            /enable interrupt again

4.1.3 GOTO

To transfer control to the instruction at the virtual address. No swapping will be done if the address is in a non-swap area. (Approx. 68 instr. or 0.2 ms. without swap)

USAGE:
    6002            /disable interrupt before call
    JMS I (GOTO     /call GOTO routine
    SEGMN;LOC       /virtual memory address
                    /no return; be sure to enable
                    /interrupt at new location.

4.1.4 CALLS

To transfer control, with return, to a subroutine at a virtual address. The return address (segment + location) is stored in the first 2 words of the subroutine. Control is transferred to the instruction at the third word. No swapping will be done if the address is in a non-swap area. (Approx. 72 instr. or 0.22 ms. without swap.)
USAGE:
    6002            /disable interrupt before call
    JMS I (CALLS    /call CALLS routine
    SEGMN;LOC       /virtual address of subroutine
    6001            /enable interrupt again

The subroutine at LOC may be as follows:

    RETN,   6002            /disable interrupt before call
            JMS I (GOTO     /call GOTO to return to caller
    LOC,    0;0             /reserve 2 words to store return address
            6001            /enable interrupt at entry
                            /body of subroutine here
            JMP RETN        /prepare to return to caller

4.1.5 CORE MAP

The layout of the system is as follows:

    Octal Loc        FIELD 0                   FIELD 1
    0 to 2           Interrupt handling        Free for general use
    3 to 5777        Problem task swap area    Problem task swap area    (SWAP AREA)
    6000 to 7577     System Routine Entry      System Routine Entry
                     Task Table                Communication Table
                     Device Table              Free for general use
                     Virtual Memory            System tasks
                     Task management
    7600 to 7777     OS/8                      OS/8

The LOAD, STORE, GOTO, and CALLS routines provide all the services for problem tasks to transfer to another segment. (The accumulator is unaffected by GOTO and CALLS.) Experimentation shows that approximately 100 instructions are executed by each call to these routines if no swapping is required. This amounts to around 0.3 milliseconds per call, which is not bad for GOTO and CALLS. However, it is rather slow for LOAD and STORE if a large chunk of storage is needed, since these routines work on 1 word at a time. The reason why this was implemented despite the obvious inefficiency is that it is a lot simpler to implement and to use this way. No buffering is needed (which is not true if a block transfer routine is implemented instead). If it is found that a substantial amount of information must be copied from another segment to the current one or vice versa to be worked on, the following more efficient algorithm can be used:

STEP 1: Disable interrupt; initialize word counter.
STEP 2: Call LOAD (or STORE) to transfer 1 word.
STEP 3: Check to see if the desired segment is in the other field or not. It should be, unless the other field is "lock"ed.
The interrupt is still disabled at this point. If the desired segment is not in the other field (tough luck), go to STEP 2.
STEP 4: Transfer the entire chunk of core directly using the "CHANGE DATA FIELD n" instruction, keeping interrupt disabled.
STEP 5: Be sure to enable interrupt again when done.

This algorithm makes use of LOAD to get the desired segment into core. Then it seizes control and transfers the entire chunk directly. It is found experimentally that the transfer time per page (128 words) is around 4 to 6 milliseconds, an interval that is tolerable in most cases. One thing to note is that it is necessary to ensure the interrupt has a chance to be enabled at least once every 20-25 milliseconds, or the line printer will be extremely unhappy. Even more efficient transfer (for large chunks of core) can be done by coding core transfer IOT's directly. This possibility exists, but it is still doubtful whether the need would ever arise.

4.2 Task Management Routines and Control Blocks

4.2.1 TASK CONTROL BLOCK (TCB)

TCB's are 4 word blocks for storing task status information. They are grouped together (contiguously, one TCB after another) to form a TASK TABLE. Each system or problem task must have a TCB. The tasks with TCB's near the beginning of the table have higher priority than tasks with TCB's at the bottom of the table. A task may be created by entering its TCB via program control or via keyboard commands. A task is removed when its TCB entry is removed from the table. The first few entries of the TASK TABLE are taken up by system tasks. The first usable TCB starts at symbolic address UTCB in field 0.

TASK TABLE:
    ADDRESS             ENTRY
    UTCB                TCB 1
    UTCB+4              TCB 2
    UTCB+10 (octal)     TCB 3

The maximum number of TCB's varies depending on the release of the system, but there are at least 16 TCB's available for problem tasks.
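The table layout and the priority rule above can be modelled in a few lines. The following Python sketch is only an illustration, not the system's dispatcher: the value given to UTCB is hypothetical, and the rule that a task is runnable when its wait count (the fourth TCB word) is zero is an assumption made for the example.

```python
TCB_WORDS = 4  # each TCB is a 4-word block (from the text)


def tcb_address(utcb, n):
    """Return the address of the n-th usable TCB; n = 1 is the entry at UTCB."""
    if n < 1:
        raise ValueError("TCB numbers start at 1")
    return utcb + TCB_WORDS * (n - 1)


def first_dispatchable(wait_counts):
    """Return the 0-based table index of the first TCB whose wait count is zero.

    Assumption: a zero wait count means the task is runnable. The table is
    searched from the top, so earlier (higher priority) entries win.
    Returns None if every task is waiting.
    """
    for index, count in enumerate(wait_counts):
        if count == 0:
            return index
    return None


UTCB = 0o6100  # hypothetical address, for illustration only

print(oct(tcb_address(UTCB, 3)))      # third TCB sits at UTCB+10 (octal)
print(first_dispatchable([1, 0, 0]))  # the second entry outranks the third
```

With these inputs the sketch prints 0o6110 and 1, which agrees with the UTCB+10 (octal) entry in the TASK TABLE.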
TCB:
    ADDRESS     ENTRY
    +0          SEGMENT (the segment where the task resides)
    +1          ACC (contents of the accumulator)
    +2          LOC (address of first instruction)
    +3          WAIT-COUNT (usually zero)

4.2.2 DEVICE TABLE ENTRY (DTE)

DTE's are 4 word blocks for storing IOT information on interrupt generating devices. Since the hardware configuration for the system is usually fixed, the following information is given for system maintenance. The DEVICE TABLE resides in field 0 starting at symbolic location DIVTBL.

DEVICE TABLE:
    ADDRESS             ENTRY
    DIVTBL              DTE of device 1
    DIVTBL+4            DTE of device 2
    DIVTBL+10 (octal)   DTE of device 3

All of the existing interrupt generating devices have been included. There is room for 2 more entries for future expansion.

DTE:
    ADDRESS     ENTRY
    +0          SKIP IF DEVICE READY IOT instruction
    +1          CLEAR READY FLAG IOT instruction
    +2 to +3    Device ready ECB

4.2.3 EVENT CONTROL BLOCK (ECB)

ECB's are used to synchronize tasks with events. They can be operated on by the system routines WAIT and POST. Storage for an ECB is allocated by the problem task and resides in the problem task area.

ECB:
    ADDRESS     ENTRY
    +0          bit 0         WAIT bit
                bit 1         POST bit (event has occurred)
                bit 2 to 11   Completion code if posted,
    +1          bit 0 to 11   A(TCB)+3 if waiting

An ECB cannot start at location 0. It can reside in a non-swap area. This usually saves swapping time. However, if this is chosen, then it must reside entirely in the non-swap area; i.e., it is illegal to have 1 word of an ECB in a swap area and another word in a non-swap area.

4.2.4 QUEUE ELEMENT (QE)

A QE is used to request service from a critical section handler. It is allocated by the requesting task. This block of 6 words will be used by the system as long as the request is still in the queue. (A STACK is actually used. Two tasks can seize control of the handler!) A QE can reside in a non-swap area, but the same restriction as for an ECB also holds for a QE.
QE:
    ADDRESS     ENTRY
    +0 to +1    A (next QE in stack)
    +2 to +3    A (parameter list)
    +4 to +5    Service complete ECB

The first 4 words will be automatically maintained by the system. The user need only be concerned about the completion ECB.

4.2.5 SERVICE ELEMENT (SE)

An SE is used by a critical section handler to obtain requests from other tasks. It consists of 6 words which can reside in a non-swap area. The same restriction as for a QE applies to an SE. The first 4 words of an SE are maintained by the system. The user need only be concerned with the REQUEST ECB. This, when POSTed, has a completion code equal to the address of the parameter list passed from the requesting task.

SE:
    ADDRESS     ENTRY
    +0 to +1    A (top QE of request stack)
    +2 to +3    A (QE currently under service)
    +4 to +5    REQUEST ECB

4.2.6 POST Routine

To signal the completion of an event. If the wait bit of the ECB is 1, the wait count of the waiting process will be reduced by 1 (unless it is zero already). In any case the post bit is set and the completion code is placed in the ECB. Interrupt is enabled upon return from POST. The contents of the accumulator are unaffected. (Approx. 200 instr. or 0.6 ms. without page fault)

USAGE:
    6002            /disable interrupt before call
    JMS I (POST     /call to POST routine
    SEGMN;LOC       /A (ECB) to be posted
    CC1;CC2         /completion codes
                    /interrupt is enabled here

4.2.7 WAIT Routine

To wait for the completion of a number of events. WAIT accepts a variable number of ECB's as its parameters. The ECB list is terminated by an address with a zero in the location portion. (That is why an ECB cannot start at location 0.) The wait count (non-zero) is used to specify the minimum number of events that have to occur before the task can be awakened again. (Hopefully this is not larger than the number of ECB's in the list, or the task deadlocks itself.) Normally the post bit of the ECB should be cleared before the WAIT.
If the post bit is 1 for any of the ECB's in the list, the event is assumed to have occurred, and the wait count is reduced by 1. That particular ECB remains unchanged. The contents of the accumulator are unaffected by this call. (Approx. 154+74N instr. without swap. 2 ECB = 0.85 ms.)

USAGE:
    6002            /disable interrupt before call
    JMS I (WAIT     /call wait routine
    WCNT            /wait count
    SECB1;LECB1     /A (ECB 1)
    SECB2;LECB2     /A (ECB 2)
                    /as many as 7 ECB's may be included
    ANYTHING;0      /end of parameter list
                    /interrupt is enabled before return
                    /be sure to clear the "wait" bits of
                    /all the ECB's at this point

4.2.8 QUEUE Routine

To request service from a critical section handler. A QE must be supplied to the system to keep track of the request. QUEUE will maintain this QE, including the automatic clearing of the completion ECB. The requesting task thus can proceed or wait for this ECB. The restriction on this QE is that it must not be already in use by the system. Otherwise unpredictable results may occur. The contents of the accumulator are unaffected by the call.

USAGE:
    6002            /disable interrupt before call
    JMS I (QUEUE    /call queue routine
    SESEG;SELOC     /A (SE)
    PARMSG;PARMLC   /A (parameter list)
    QESEG;QELOC     /A (QE)
                    /interrupt enabled at return
                    /process asynchronously here
                    /wait for completion ECB here to be sure it is serviced

4.2.9 NEXTQ Routine

To signal the completion of service for the present request and to obtain the next request from the request stack. An SE must be supplied to the system to keep track of requests. NEXTQ will maintain this SE, including the clearing and posting of the request ECB. The accumulator is unaffected by this call. The value of the accumulator is used as the completion code for the service ECB.

USAGE:
    6002            /disable interrupt before call
    JMS I (NEXTQ    /call nextq routine
    SESEG;SELOC     /A (SE)

4.3 Communication Table

The communication table contains system variables and parameters, which problem tasks may refer to and may change. It resides in field 1.
Pointers are also used to enable tasks to access field 0 variables.

    ADDRESS             ENTRY
    WTSE1  (6 words)    Teletype 1 typer SE (character oriented)
    WTSE2  (6 words)    Teletype 2 typer SE
    LPSE   (6 words)    Line printer SE; A (line buffer used)
    CLKSE  (6 words)    1 second clock SE
    EPTR   (2 words)    High speed paper tape reader ready ECB
    ETAPE  (2 words)    Tape error ECB; POSTed whenever the Linctape error bit is on. (Program timing error is one of the causes for it.)
    ESMV   (2 words)    Scanner monitor ready ECB
    ERIBMS (2 words)    IBM 360 read select ECB
    ERIBMD (2 words)    IBM 360 read complete ECB
    EWIBMS (2 words)    IBM 360 write select ECB
    EWIBMD (2 words)    IBM 360 write complete ECB
    TIME1  (2 words)    Elapsed time in seconds (multiple precision arith.)
    HWLX                Hand wheel X reading
    HWLY                Hand wheel Y reading
    AFILD0              Pointer to a word in field 0 which contains the segment number of the segment that is currently in field 0
    AFILD1              Pointer to a word in field 0 which contains the segment number of the segment that is currently in field 1
    ALCKF0              Pointer to "field 0 lock flag" in field 0. If this flag is non-zero, the field 0 segment is locked in core.
    ALCKF1              Pointer to "field 1 lock flag" in field 0
    ATCBPT              Pointer to a word in field 0 which contains the address of the current TCB
    AORGVY              Pointer to the virtual memory origin word in field 0. The user can choose which octant of the core box is segment 0 by entry to this word.

Directly following the communication table is a block of 220 (octal) words which are free to be used by any problem task to store variables and control blocks.

4.4 UTILITIES

The UTILITIES comprise a set of assembly source subroutines to be assembled together with problem tasks. They also contain definitions for symbolic addresses, so that the symbols referred to in this paper may be used in coding problem tasks. A bootstrap loader is included to load problem tasks into virtual memory locations. During assembly, the literal SEGMT must be specified. This is the segment number (> 2) in which the problem task is going to reside.
All the code for the problem task must be assembled in field 1 for the loader to function properly. If a problem task requires more than 1 segment of storage, then it is necessary to assemble it a segment at a time. The routines in UTILITIES take up 2 pages of core, the last of which is used for the line printer buffer. These routines are neither reentrant nor recursive.

4.4.1 READC Routine

To get one character from a keyboard type device. The character will be returned in the accumulator as well as stored in the symbolic location CHAR. The SE used in getting the character starts at the second word (RDSER) of the page into which the UTILITIES are assembled.

USAGE:
    JMS I (READC
                    /echo character here using write routines
    TAD I CHAR      /pick up the character again if needed

4.4.2 LPRINT Routine

To output one character to the line printer by first collecting it in the buffer. A RETURN character terminates a line and initiates the printing process. This routine obtains the character from the accumulator. At return the accumulator is cleared.

USAGE:
    JMS I (LPRINT   /print character in accumulator

4.4.3 WRITEA Routine

To output the character in the accumulator to both teletype typers and the line printer simultaneously. The accumulator is cleared at return. This is used mainly for important messages that require attention.

USAGE:
    JMS I (WRITEA   /output character from acc

4.4.4 WRITEC Routine

To output the character in the accumulator to an assignable device. The word WRITAC at the first location of the page into which UTILITIES is assembled contains the address of the output routine to be used by WRITEC. In addition to LPRINT and WRITEA, WRITE1 and WRITE2 are also available, which output to teletype 1 and teletype 2 respectively. The accumulator is cleared upon return.

USAGE:
    JMS I (WRITEC   /output character to console

5. EXISTING SYSTEM AND PROBLEM TASKS

The present release of the system has a number of tasks running under it.
Most of these tasks perform system functions and are transparent to the user. They have been discussed in previous sections. This section is concerned with tasks that are directly usable by the user. It must be pointed out that most of these are crude implementations, mainly for testing (and illustrating) the different features of the system. More sophisticated implementations of these tasks will be done in later releases.

5.1 Keyboard Monitor Task

The keyboard monitor task enables the user to enter commands and data through the teletype keyboards. There are 2 keyboard tasks, one for each teletype. Though the commands for these 2 tasks are almost the same, they are not identical. One of the teletypes is assigned to be the operator console. It recognizes 2 additional commands. The operator may "hold" all tasks (keep them from executing). The operator may also terminate the system by transferring control to the OS/8 system.

Because the operator must be able to "hold" all tasks and examine them no matter what disastrous bug has happened, the operator keyboard is active all the time (except when an I/O error occurs during swapping). That is, if the keyboard is routing characters to problem tasks, it keeps on accepting new keystrikes, no matter whether the problem task has had a chance to examine them. (The problem task may have a bug and deadlock itself.) After receiving the keystrike, if it is found that the problem task is not ready for a new character, the character is thrown away. So some characters may be lost. Thus, it is not reliable to read paper tapes at the operator console. Paper tape reading is thus disabled. (The problem task can enable this feature if needed.)

The regular (other) keyboard task does not have the "hold" command. Thus, it can afford to wait until the problem task has had a chance to examine the previous character before looking at the new keystrike. Reading of paper tape is therefore permitted at the regular teletype.
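The difference between the two keyboard policies can be made concrete with a small model. The Python sketch below is a simplification under stated assumptions: each keystrike is paired with a flag saying whether the problem task is ready at that moment, and the function names are invented for the illustration. It does not reproduce the actual keyboard tasks.

```python
def operator_console(events):
    """events: sequence of (char, task_ready) pairs, one per keystrike.

    The console never waits. A character that arrives while the problem
    task is not ready is thrown away.
    """
    delivered, lost = [], []
    for char, task_ready in events:
        (delivered if task_ready else lost).append(char)
    return delivered, lost


def regular_keyboard(events):
    """Same input, but the keyboard task waits for the problem task.

    Assumption: waiting always succeeds eventually, so every character is
    delivered in order and none is lost.
    """
    delivered = [char for char, _ready in events]
    return delivered, []


tape = [("L", True), ("O", False), ("A", True), ("D", False)]
print(operator_console(tape))   # some characters are lost
print(regular_keyboard(tape))   # all characters arrive
```

Under this model the operator console delivers only "L" and "A" from the sample input, which is why reading paper tape there is unreliable, while the regular keyboard delivers all four characters.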
Another distinction is that the operator console task resides in the system area, and so it is in core all the time. This again is due to its importance in debugging.

Both keyboard monitor tasks operate in 2 modes. The "\" command switches them to the command mode. In this mode, the tasks accept simple debug commands. The "nQ" command switches them to routing mode. In this mode, keystrikes are routed to problem tasks. The following is a repertoire of their commands:

    nnnn/       To display the content of a virtual address. Octal base is used. nnnn may be any number of digits. The last 4 digits are used as the location, the next 4 digits are used as the segment number. All extra leading digits are ignored. Zeroes are packed on the left if fewer than 8 digits are entered.
    vvvv=       To replace the contents of the location displayed by the previous nnnn/ command with the value vvvv. Only the last 4 digits are used as the value. Octal base is used.
    ssssQ       To switch to routing mode by talking to the SE at location (segment + location) ssss (octal). If ssss is zero, the address of the SE for the previous Q command is used. Q also releases all tasks.
    \           To switch to command mode. "\" is not routed to the problem task. A \ character will be typed on the left margin when the monitor is in command mode.
    H           To "HOLD" all tasks. This command only works at the operator console.
    CONTROL-C   To terminate the system by transferring control to 7600 (octal), i.e., the OS/8 system. This command only works at the operator console, and only after the "H" command has been issued.

5.2 Scanner Task

The scanner task assembles the scanner parameters and activates the scanner to either digitize or display a picture continuously on the display monitor. This task does not check the validity of the parameters. If the scanner gets an illegal parameter, it will abandon the scan and the beeper will sound.
The scanner task, after finding that the scanner completion flag has not been raised within 2 seconds, will re-activate the scanner to give the user a chance to change the parameters. Other than the physical beeper, there is no way the software can detect the abandoning of the scan. This is due mainly to hardware restrictions (there is no IOT to detect this) rather than software implementation.

To allow problem tasks to synchronize with the scan, the scanner task calls a subroutine before assembling the parameters to activate the scanner for another scan. The entry point of this subroutine is alterable by problem tasks. Thus any problem task that wants to use the scanner can assemble a subroutine and give its entry point to the scanner task. When the system is first started, a dummy subroutine is used which takes the readings of the handwheels as the starting coordinates for the scan. The problem task may of course supply subroutines to fix the parameters in any other way.

    ADDRESS     ENTRY
    CORNO       Device number for the Fabritek core box to be used by the scanner, in bits 4 to 8 (inclusive). All other bits should be zero. The scanner starts using the core box from location 0.
    SCDPY       If zero, then digitize the picture and put it in the core box. If non-zero, display the picture from the core box.
    MAG         The last 3 bits specify the magnification of the picture on the display, with the power equal to 2 ** MAG.
    ROTAT       The least significant bit specifies whether to digitize the picture along the X axis or along the Y axis. An entry of 1 means to scan along the Y axis (i.e., same orientation as the film).
    XSIZE       The least significant 3 bits specify the step size between digitized points. One out of 2 ** XSIZE points is digitized on the X axis (vertical on the film).
    YSIZE       Y step size for digitizing films.
    XBEGIN      X coordinate to start the digitizing process. The coordinate runs from 0 to 7777 (octal).
XBEGIN must be greater than or equal to 20 (octal) and must be a multiple of the number of points used in the X step size. It must also be such that enough room is left on the right for the scanner to complete its scan.
    YBEGIN          The corresponding Y begin point.
    NUMX            The total number of digitized points for the picture on the X axis. This must be odd to use interlace scan.
    NUMY            The number of digitized points on the Y axis. This also must be odd to use interlace scan.
    +41 (octal)     Address (segment + location, 2 words) of the parameter fixing subroutine. It can be used for synchronization purposes.
    +170 (octal)    If bit 4 is 1, interlace scan is used; normal scan otherwise.

The scanner task uses 3 pages of storage.

5.3 ODT (Octal Debug Technique) Task

A 1 field version of the DEC ODT program is included in the system. The modifications mainly lie in replacing the ODT teletype I/O IOT's with calls to the standard system READC and WRITEC routines.
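Because ODT's output now goes through WRITEC, it follows whatever device is currently assigned (section 4.4.4). The Python sketch below models that assignable routing. It is an illustration only: the class, its lists standing in for the devices, and the choice of WRITE1 as the initial assignment are assumptions, and the 127 character line limit of the printer buffer is not modelled.

```python
class Utilities:
    """Toy model of the output routines in UTILITIES."""

    def __init__(self):
        self.typer1, self.typer2, self.printer = [], [], []
        self._line = []              # line printer buffer
        self.writac = self.write1    # assumed initial assignment

    def write1(self, ch):
        self.typer1.append(ch)       # character oriented: typed at once

    def write2(self, ch):
        self.typer2.append(ch)

    def lprint(self, ch):
        # Line oriented: collect characters until a RETURN arrives.
        if ch == "\r":
            self.printer.append("".join(self._line))
            self._line = []
        else:
            self._line.append(ch)

    def writea(self, ch):
        # Output to both typers and the line printer.
        self.write1(ch)
        self.write2(ch)
        self.lprint(ch)

    def writec(self, ch):
        # Route through whichever routine the assignable word names.
        self.writac(ch)


u = Utilities()
for ch in "OK":
    u.writec(ch)          # goes to typer 1 under the assumed default
u.writac = u.lprint       # reassign, as a keyboard command would
for ch in "HI\r":
    u.writec(ch)          # now collected and printed as one line
print(u.typer1, u.printer)
```

A task coded against WRITEC, such as the modified ODT, needs no change when its output is moved from a typer to the line printer; only the assignable word changes.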