Report No. 460
CAC Document No. 6

OSL/2
AN OPERATING SYSTEM LANGUAGE

by

Peter Allyn Alsberg

June 10, 1971

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center under Contract No. USAF 30(602)-4144 and submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, June 1971.

ABSTRACT

OSL/2, an Operating System Language, was designed for the coding of supervisory systems. OSL/2 is an ALGOL-based language that may be readily extended to a general purpose language with the addition of a few data types (e.g., floating point numbers). The language replaces an interrupt concept with the P and V operators on semaphores described by Dijkstra. Concepts of multiple processes and system layers allow OSL/2 to be used as a job control language for an OSL/2 coded supervisory system. The system layering concepts provide for easy system extension and hierarchical structuring. A powerful method of specifying access techniques is used to implement queues, tables, etc.
and treat them as primitive data items in the language. The language is presented in narrative and reference formats. Hardware implications are discussed and an OSL/2 compiler which runs on the Burroughs 5500 is briefly described. The implications of the language on system measurement, simulation, and modification are also discussed.

TABLE OF CONTENTS

Chapter
1. OSL/2 Concepts, Philosophies, and Goals
2. An Introduction to the OSL/2 Syntax and Semantics
   2.1. The Primitive Data Types
        2.1.1. Integer
        2.1.2. Boolean
        2.1.3. String
        2.1.4. Patterns
        2.1.5. Pointers
        2.1.6. Conditional and Case Expressions
   2.2. Composite Data Types
   2.3. Statements
        2.3.1. Blocks, Compound Statements, Case Statements and Conditional Statements Are Recursively Defined
        2.3.2. Iterative Statements
        2.3.3. Assignment Statement
        2.3.4. Pattern Matching Statement
        2.3.5. Procedure Statements
   2.4. Procedures and Functions
   2.5. Control Concepts
        2.5.1. Escape Statements
        2.5.2. Semaphores
        2.5.3. Processes
        2.5.4. Traps
        2.5.5. Primitives
   2.6. Compile Time Facilities
3. System Measurement, Modification, and Debugging in an OSL/2 Environment
4. The Hardware Requirements of OSL/2
5. Conclusions

Appendix
A. A Reference to OSL/2 Syntax and Semantics
B. An OSL/2 Compiler

LIST OF REFERENCES
VITA

LIST OF FIGURES

Figure
1. Variable stack while in block 1
2. Variable stack after entering the procedure "addl"
3. Two stack layout if "append(addl(r,m))" is written instead of "addl(r,m)"
4. Final addressing scheme

Chapter 1

OSL/2 Concepts, Philosophies, and Goals

For many years supervisory and monitor systems have been coded in machine level languages by a rather privileged set of system coders. Machine level languages have been used to improve system efficiency.
The systems programmer has been allowed the privilege of unprotected access to all portions of a system on the assumption that he is a capable coder not subject to the errors of the normal applications coder.

Unfortunately, assembly language coded systems have shown themselves to be grossly inefficient. For example, consider the differences in system overhead and reliability between two similar systems like ASP [1, 2, 3, 4] and HASP [5] on the IBM System/360. HASP commonly experiences two to four times less overhead than ASP. Yet, in the opinion of this author, a factor of at least 2 if not 4 improvement is still available in HASP if the queuing and event structure were reorganized. Even a reasonably small system like IBSYS [6] for the IBM 7094 has significantly less throughput capability than PORTHOS [7] when a university job mix is being processed. Analyses of the GECOS systems for the GE machines, EXEC systems for Univac machines, etc., would yield similar examples of widely varying efficiencies in different assembly language coded systems for the same machine. By comparison the Burroughs MCP [8] (Master Control Program) for the B5500 experiences relatively low system overhead even though the B5500 MCP is coded in the high level language ESPOL [9], an ALGOL language derivative. The important point to be made here is that assembly language has not automatically brought efficiencies. Good system design is clearly the most important factor and does lead to efficiencies.

An examination of existing operating systems also shows that, with the notable exception of the THE Multiprogramming System [10], errors are plentiful. The author's personal experience has been that even in very old systems, about five to ten per cent of the errors that were discovered late in the life of a system were undiscovered key punch errors.
Of the remaining errors, about one third to one half are addressing errors, most of which could be caught by a simple array bounds or index check in a high level language. After encountering literally hundreds of examples of trivial errors which should not have been made by systems programmers, the author has come to the conclusion that systems programmers are as prone to make mistakes as applications coders. They are in need of tools which help them avoid inadvertent errors by restricting the scope of their actions to the particular task at hand - for example, an automatic index bounds check. Because of the drastic and widely propagated effects of a system error, systems programming requires at least as much protection as that available to applications programming.

The cost of building operating and support systems is now so prohibitive that new machines are required to be downward compatible with old machines on which working assembly language system software already exists. Yet new hardware architectures continue to be developed with a wide variety of addressing features that cannot be used because they conflict with old techniques. This is especially unfortunate when one realizes that the vast majority of system software need not be machine dependent. Most system software is concerned with table and queue manipulation, i.e., with the specification of scheduling algorithms. Machine dependence has been forced as a by-product of the use of machine dependent languages. The only parts of a system that are truly machine dependent are the interrupt handlers of the supervisor, the code emitter portions of the higher level language compilers, and the assembly language itself.

In 1968 Wells and Alsberg designed OSL - an Operating System Language [11]. OSL executed on one processor and was connected to any number of other processors such as arithmetic units, data channels, and peripheral storage devices.
First in first out queues and pointers were provided as primitive data types in the language. Interrupts were handled by automatically invoked procedures. It was possible to dynamically change the procedures to be invoked for each interrupt. OSL was a block structured ALGOL 60 derivative which interfaced with hardware through eight primitive procedures. For example, the "route" primitive connected an interrupt with the procedure to process the interrupt. Each of these primitives was designed to be implemented on most machines with an extremely small number of instructions (usually, only about a dozen were required). In this manner one could guarantee an easy implementation of the hardware interface on most machines.

A task scheduler for a multiprogramming system was coded in OSL. Since multiprocessing and multiprogramming concepts were embedded in OSL, only a few dozen lines of code were required. This code was then examined with the intention of running the same code on an IBM 360/75 and an IBM 7094. Due to the limited time available to the authors, they were unable to build OSL compilers for both machines. However, by the time the project was abandoned, they had verified that the identical OSL code could be made to run on both machines and that the primitives involved could be implemented in just a few instructions on both machines.

OSL was designed with machine independence as a major goal. Thus it was completely divorced from any particular addressing scheme. As a result a multiprogramming system coded in OSL would very likely run poorly on an infinite memory descriptor machine like a B5500 or a virtual memory paged machine like a 360/67 or a Sigma 7. Similarly, a failure to specially treat other machine dependent features such as bulk core, head per track disks, hypertape drives, etc. would sometimes lead to large inefficiencies.
However, OSL did assure the transportability of an entire operating system among virtually all medium and large scale computers.

Subsequent work by the author led to the design of OSL/2. OSL had been designed to provide a common, machine independent systems language for the largest possible number of machines. OSL/2 used another approach. OSL/2 would be machine ignorant in its design. That is, OSL/2 would be designed to easily specify well structured operating system concepts and algorithms independent of existing machines. Once the language was designed, the hardware necessary to implement the language concepts would be examined. Finally, the appropriate compromises would be made in the language to make it practical.

Several goals were set at the beginning of the design effort.

1) Since the heart of an operating system is its queues and tables, the language must allow the specification and manipulation of arbitrarily structured queues and tables as if they were primitive data items in the language.

2) The language should provide facilities for the control and synchronization of multiple processes.

3) The specification of hierarchically structured or layered systems must be straightforward.

4) It must be easy to insert and delete code to measure the performance of a system coded in OSL/2.

5) OSL/2 should be both an operating system language and an adequate job control language. The task of an operating system is to sequence or control one or more processes and to allocate resources to these processes. Similarly a job control language must specify the sequencing or control of one or more processes and specify the resources required by each process. Since these tasks are so similar, one language should handle both tasks.

These five goals were met by the process control concepts and the generalized data access techniques.

A process in OSL/2 is a program or procedure that may execute concurrently and asynchronously with any other process.
For example, a separately running compiler is a process. It is possible to split the compiler into a scanner and a parser which each run simultaneously. The scanner might process the input stream to find all special words and identifiers and enter these in tables for the parser to use. The parser communicates with the scanner to get the next identifier, special word, or other symbol the scanner has processed. In this case the compiler, a process which includes the parser, would initiate the scanning procedure as a second process. The result is one program with two concurrent processes.

The process management facilities of OSL/2 allow one process to create and, optionally, monitor new processes. If a process initiates a second process which the first process will monitor, then the first process becomes the operating system for the second process. The first process may define new primitives for the second process to use or may even redefine existing primitives for the second process. For example, a statistical system might create a new primitive that performs correlation and redefine the storage allocation primitive for those processes that will be monitored by it. Each time a process creates a second process which it monitors, a new level or layer is added to the system. Communication between layers is through the primitive procedures. A primitive action may be frequently redefined in new layers. It is also possible to redefine a primitive within one layer. This will be described in detail later.

A process may spin off one or more processes as asynchronously running procedures. (For example, consider the previous compiler example or the ILLIAC IV operating system [12, 13], which has about a dozen of these processes, one for each operating system task.) When these processes must cooperate with each other, they synchronize their operations through the use of P and V operators on semaphores.
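The scanner and parser example can be sketched in a present-day language. The fragment below is Python, not OSL/2, and is offered only as an illustration under stated assumptions: Python's threading.Semaphore stands in for an OSL/2 semaphore, with acquire playing the role of P and release the role of V. The names compile_text, put, scanner, parser, items, and mutex are hypothetical and do not appear in the thesis.

```python
import threading


def compile_text(text):
    """Run a scanner and a parser as two cooperating threads."""
    tokens = []                       # table shared by both processes
    items = threading.Semaphore(0)    # counts entries ready for the parser
    mutex = threading.Semaphore(1)    # lock-out on the shared table
    parsed = []

    def put(token):
        mutex.acquire()               # P(mutex)
        tokens.append(token)
        mutex.release()               # V(mutex)
        items.release()               # V(items): one more entry is ready

    def scanner():
        for word in text.split():
            put(word)
        put(None)                     # end-of-input marker (an assumption)

    def parser():
        while True:
            items.acquire()           # P(items): wait for an entry
            mutex.acquire()           # P(mutex)
            token = tokens.pop(0)
            mutex.release()           # V(mutex)
            if token is None:
                return
            parsed.append(token)

    second = threading.Thread(target=scanner)
    second.start()                    # the compiler initiates the scanner
    parser()                          # the parser runs in the initiating process
    second.join()
    return parsed
```

The counting semaphore items lets the parser wait without polling, while mutex gives the lock-out on the shared table. A real OSL/2 program would express the same cooperation with its own process and semaphore constructs.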
The P and V operators were defined and analyzed by Dijkstra [14] and Habermann [15]. P and V operators, which will be discussed later, provide a powerful process synchronization facility. P and V operators are truly primitive in the sense that they are simple to implement in hardware or software and they provide a base from which more complex process communication techniques can be constructed.

The process statements provide system layering and process synchronization facilities. Furthermore, the ability to initiate processes and declare data (e.g., files) at any level allows OSL/2 to serve as an extremely powerful job control language.

Through the use of the generalized data access facility the programmer may define new data types such as FIFO queues, priority queues, and special tables. Each time such a data type is accessed, for example in an expression or on the right hand side of an assignment statement, a fetch operation is performed. Each time an attempt is made to store a new value in a data item of this type, for example on the left hand side of an assignment statement, a store operation is performed. The semantics of fetch and store operations, and of special functions related to the new data type, are defined in the access definition.

The coder can define fetch and store operations for queues or tables that perform very complex manipulations of these data types. Thereafter, a complex process such as inserting a new element into a queue or into a table may be written as a simple assignment statement. If, at a later time, changes are to be made to the queuing algorithms, these changes are made only in the definition and the rest of the code is unaltered.

If desired, a coder can add instrumentation in the access definition. Since all manipulations of an access type are defined in the access definition, the definition provides a single spot at which every fetch, store and special function call can be recorded.
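The idea of routing every fetch and store through one definition can be approximated in Python with a property. This is a rough analogy only, not OSL/2 semantics: the class names FifoCell and System, the attribute ready, and the counters fetches and stores are hypothetical, and the queue is held in a collections.deque for brevity.

```python
from collections import deque


class FifoCell:
    """Holds the queue state and the fetch/store procedures in one place."""

    def __init__(self):
        self._entries = deque()
        self.fetches = 0              # instrumentation: number of fetches
        self.stores = 0               # instrumentation: number of stores

    def fetch(self):
        self.fetches += 1
        return self._entries.popleft()   # the oldest entry is removed

    def store(self, value):
        self.stores += 1
        self._entries.append(value)      # a new entry is made


class System:
    """Exposes the queue as if it were an ordinary variable."""

    def __init__(self):
        self._ready = FifoCell()

    @property
    def ready(self):                  # reading system.ready is a fetch
        return self._ready.fetch()

    @ready.setter
    def ready(self, value):           # assigning to system.ready is a store
        self._ready.store(value)


system = System()
system.ready = 7                      # store
system.ready = 9                      # store
first = system.ready                  # fetch of the oldest entry
```

Because the counters live inside FifoCell, a change to the queuing discipline or to the measurements touches only that class; the code that reads and assigns system.ready is unaltered.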
Thus the data access facility allows a programmer to define arbitrarily complex data structures that may then be accessed as primitive data items in the language. Furthermore, since this feature is used to implement the queues and tables that define the state of the system, the access definitions provide a central place where measurement of the system can be readily performed.

It should be stressed that OSL/2 is not an extensible language in the general sense that GPL [16, 17] and ALGOL 68 [18, 19] are. OSL/2 does allow the coder to specify a class of new data types and the semantic interpretation of fetches and stores on that data type. Unlike most extensible languages, OSL/2 does not allow binary operators to be created or redefined. This restriction makes the language easier to implement yet still provides those data extension features both necessary and desirable for systems programming.

The author's continual frustration with the lack of good symbol manipulation facilities in existing algorithmic languages led him to adopt the pattern matching concepts of SNOBOL. Many other "nice" features like patterns and case expressions have been added to the language for the convenience of the coder. Most of these facilities are frills that are not essential to the thesis. They are included primarily to document the fact that such formerly diverse constructs as SNOBOL patterns, bit manipulation, and asynchronous processes can all exist harmoniously within a single, block structured, algorithmic language without the specification of scores of new primitives and exception conditions. Most of these facilities were added to basic ALGOL 60 by extending the primitive data base to include structures, strings, patterns and pointers or by making logical extensions to existing statements, such as the iteration statement.

Unfortunately, systems programmers tend to be very poor documenters of their work.
Unless they are made to write program logic manuals and flowcharts before they write the code, they will most likely never provide these flowcharts or logic manuals. Therefore, many of the features in the language are included to aid in the structuring and documentation of a piece of code. For example, the lack of a GO TO statement not only forces clean program structure but also allows a compiler to perform better global optimizations. Since labels must appear before and after a labelled statement and must be used by the ESCAPE statement, a minimal level of documentation is automatically forced on the coder.

The main features of the language which are essential to the operating system codes are the generalized data access facility, which enhances system measurement and queue manipulation, and the process facility, which allows asynchronous processes to be controlled, synchronized, and layered. The other features of OSL/2 are not essential but desirable. These extra features make the description of algorithmic processes more pleasant for the coder and frequently make a piece of code more readable. Furthermore, these features not only help the systems programmer structure and compress very complex codes, they also help him clarify his thoughts and thus properly design and structure these arbitrarily complex systems.

Chapter 2

An Introduction to the OSL/2 Syntax and Semantics

The heart of any operating system is its resource scheduling and allocation facilities. Therefore, the highest design priority is placed on the ability of the language to properly specify scheduling algorithms. Secondary to this consideration but still of great importance is the ease with which the programmer can use the language. Questions concerning paging, file organization, linking, and addressing are not considered in the language.
This chapter is broken into two main sections - a description of the data types and expressions allowed in the language followed by a description of the statements and process control facilities.

The primitive data types allowed in the language are integers, Booleans, strings, patterns, and pointers. Integers are used for arithmetic and indexing. Booleans provide logical manipulation capabilities and are used extensively when controlling program execution. Strings of various character sizes and lengths are used with SNOBOL-like patterns to scan text. Pointers are heavily used to build and process list data or data structured as an arbitrary network.

These five primitive data types are used to form the three composite data types: arrays, structures, and generalized access data. OSL/2 arrays are identical to ALGOL arrays. Structures allow the specification of tree structured data similar to PL/1 structures. Generalized access data are built up from the primitive data types and the other composite data types. Special purpose procedures are written to describe how this data will be manipulated. The combination of the data specifications and the manipulation procedures allows one to declare and manipulate complex queues and tables and to perform complex I/O and data retrievals as if they were primitive constructs in the language.

An OSL/2 program is composed of one or more nested or sequential statement blocks. These blocks are identical to ALGOL 60 blocks. Within the blocks, compound, iterative, conditional, and assignment statements are allowed. In addition, procedure calls and case statements are provided. The case statement allows one to index into a list of statements and execute only one of them depending upon which index value was used.

OSL/2 does not have a GO TO statement. The GO TO is replaced by the escape statement. An escape statement allows one to leave the range of any statement he is in.
For example, he could leave an iteration loop, or the block that contained the loop, or the block that contained that block, etc. However, one cannot enter another statement via an escape statement. The only way to enter a statement is to arrange for it to be the next statement in the program sequence. In such an environment, a programmer depends heavily upon "infinite" loops, terminated by executing an escape statement, to replace many GO TO statements.

In OSL/2, a process is a separately running program or asynchronously running procedure attached to another process. Control primitives are available to initiate, stop, and restart processes. Furthermore, it is possible to declare new primitives for processes or to reroute calls on standard primitives to alternative procedures declared by a supervisory process. These asynchronously running processes may synchronize their actions using P and V operators on semaphores. These provide a very convenient lock-out capability.

This brief, global introduction to the language facilities provides the perspective for the detailed description that follows.

2.1. The Primitive Data Types

There are five primitive OSL/2 data types: integers, Booleans, strings, patterns, and pointers.

2.1.1. Integer

The only arithmetic allowed in OSL/2 is performed on the positive and negative integers. In an actual implementation of OSL/2 to solve more than just scheduling problems, it may be desirable to include real numbers or even complex numbers as primitive data. Since these are generally machine dependent and extremely rarely used, if ever, in operating system work, they are not included in the OSL/2 description.

The operations that may be performed on integers include addition, subtraction, multiplication, a variety of divisions, and exponentiation. When performing division on integers, one may produce a rounded or truncated answer. He may further truncate up or down in value.
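The division variants just described can be illustrated with a short Python sketch. It is not OSL/2 and makes several assumptions that the text does not settle: the function names are hypothetical, "truncated" is taken to mean truncation toward zero, and rounding is taken to the nearest integer with ties going away from zero.

```python
def div_trunc(a, b):
    """Quotient truncated toward zero (assumed meaning of 'truncated')."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q


def div_down(a, b):
    """Quotient truncated down in value (toward minus infinity)."""
    return a // b


def div_up(a, b):
    """Quotient truncated up in value (toward plus infinity)."""
    return -((-a) // b)


def div_round(a, b):
    """Quotient rounded to nearest; ties go away from zero (an assumption)."""
    q, r = divmod(abs(a), abs(b))
    if 2 * r >= abs(b):
        q += 1
    return q if (a >= 0) == (b > 0) else -q
```

For -7 divided by 2 the four functions give -3, -4, -3, and -4 respectively, which shows why a single division operator cannot serve every purpose.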
Division operators are therefore provided for each of these cases: rounding, truncation, and truncation up or down in value.

2.1.2. Boolean

Booleans are single valued data that take on the value true or false. They may not be considered to be multivalued or bit strings as they are in most languages. String data and string operators must be used to perform logical operations on bit strings. The logical functions and, or, not, exclusive or, equivalence, and implication are allowed on Booleans.

In the evaluation of Boolean expressions two additional special purpose operations are allowed: andif and orif. The use of these operations is best illustrated with an example. Assume that operation X is to be performed if the Boolean variables A, B, and C are all true. An OSL/2 program to do this would be written

    if A andif B andif C then X fi

A conditional OSL/2 expression or statement begins with an if and is terminated by a fi. Thus if ... fi is actually a bracket pair. In this and all following examples the special OSL/2 words are underlined for illustrative purposes only. In the language itself, words like if, and, etc. are reserved words and may not be used for purposes other than indicating specific OSL/2 constructs.

In the above example, one need not test the variables B or C if A is false. The andif causes the evaluation of B and C to be skipped if the expression is already determined to be false. The operator orif causes a similar termination of an expression or subexpression if the value is already determined to be true.

These operators are provided for efficiency and to avoid unpleasant side effects. For example, it may happen, particularly in operating systems codes, that the quantity B may not exist if A is false. Thus the testing of B might cause an error. The way around this situation in conventional languages has been to write nested conditional statements, e.g.,

    if A then if B then X fi fi

In OSL/2 this is written as

    if A andif B then X fi

2.1.3.
String

String data has a character size and a length. The character size is measured in bits and the length in characters. Strings may be of a fixed length or may be variable within some maximum length.

A string may be bit-wise anded (sand), ored (sor), exclusive-ored (sxor), or complemented (snot). A string may also be tested for a pattern match or for equality or inequality with some other string. For example, the following Boolean expression is true if the string S contains an "A", "B", or "C".

    S { "A" | "B" | "C" }

The pattern is contained within the braces. The following statement tests if the string A is not equal to the string B and, if not, concatenates it to the string X.

    if A ^= B then X ← X & A fi

String inequality is judged by the bit representations of each character. No collating sequences are assumed by OSL/2. However, one may specify standard character sets such as ASCII and EBCDIC when writing string constants. For example, the string constant E"XYZ" is a three character string. Each character is eight bits long and contains the EBCDIC representation of respectively "X", "Y", and "Z".

Substrings may be specified by following a string expression with a field. Examples:

    S.[0:10]
    (S ← S&B).[n:k+2]

The first example specifies the first ten characters of S. The second example specifies k+2 characters beginning at character n in the new value of S. Characters are numbered left to right beginning at zero.

2.1.4. Patterns

Patterns are used in pattern matching statements. Pattern matching statements are also Boolean expressions. The OSL/2 pattern is based on the SNOBOL 4 pattern. All patterns are enclosed within braces, and patterns may be nested within patterns to any depth. Patterns may be assigned to pattern variables and saved for later use or passed to procedures. Patterns may also specify that intermediate results in a pattern match should be assigned to some string variable.

The basic form of a pattern expression is

    { A₁ | A₂ | ... | Aₙ }

where each Aᵢ is an alternative.
Patterns are matched from left to right. The result of the match is true if one alternative matches and false if no alternative matches. The evaluation of a pattern expression stops at the first match. Each alternative has the general form

    S ← P₁ & P₂ & P₃ & ... & Pₙ

where S is an optional string variable and P₁ ... Pₙ are pattern primaries. In order to match this alternative, a string must contain the pattern P₁ followed immediately by the pattern P₂, followed immediately by the pattern P₃, etc. If a successful match is made, a copy of the part of the string which matches the concatenated patterns is placed in S.

Since a pattern variable may have as its value an embedded assignment to a string variable, one must be sure that the string variable exists so that the embedded string assignment is always legal. Therefore pattern variables or string variables may not appear in a pattern expression on the right hand side of a pattern assignment if the pattern variable on the left is global to them. Furthermore, a formal parameter pattern variable may not be assigned a pattern which contains pattern or string variables. In the example

    Patt₁ ← {"A" | Str₁ ← "B" & Patt₂ | {Str₂ | Patt₃}}

the following must be true:

1) Patt₂, Patt₃, Str₁, and Str₂ must be declared global to Patt₁ or they must be declared in the same block as Patt₁.

2) Patt₁ is not a formal parameter.

Pattern primaries are string constants, string variables, pattern variables, pattern functions, and pattern expressions. Refer to pattern matching statements for more pattern examples.

2.1.5. Pointers

Pointers specify or "point to" any OSL/2 data. If P is a pointer and X some data, then P.X is the X pointed to by P. This is of special use when list processing. The OSL/2 coder will normally
Pointers may point to node, own, or normal data although they are primarily intended for use with node data (node data may "be explicitly allocated and deallocated by the programmer) . Pointers may point to other pointers. For example, "P1.P2.X" is the X pointed to by the P2 which is pointed to by PI. Pointers may have a null value or may be assigned values via allocate statements and pointer functions. P <- ptr (N) creates a pointer to the variable N and assigns it to P. 2.1.6. Conditional and Case Expressions In the previous five data types, the following primaries are allowed in all expressions. if Boolean expression then expression else expression fi and case integer expression of expression^ , expression , expression esac n In the above constructs, expression, is an integer, Boolean, string, pattern, or pointer expression. In the first construct, the if primary, the Boolean expression is evaluated, if it is true, then the value of the primary is expression If it is false, the value of the primary is expression . Examples : if x=0 then a else b fi n ■*■ if n=0 then h else n+1 fi 18 The case primary has as its value expression, where i is the value of the integer expression. If the integer expression is less than zero or greater than n then the default expression is the value of the primary. 2.2. Composite Data Types There are three composite data types, arrays, structures, and generalized access data. OSL/2 allows any of the primitive or composite data types to "be subscripted. The subscript list follows the data identifier and is enclosed in "brackets. Examples : A[0] B[10, 23*2+7] C23[if X_is_even then X else 2*X fi] Data structures, similar to the structures allowed in PL/1, are used in OSL/2. An example of a list element declaration is given below. node structure list_element ( structure data ( integer age, weight, height; Boolean sex_is_male) ; pointer forward, backward) Note that structures may be nested within structures. 
Also note the use of an underscore to make identifiers more readable. The underscore is a null character. Thus the identifiers "a_b_c" and "abc" are the same identifier.

The data within a structure must always be completely addressed by all identifiers in the hierarchy above the desired data. The list of identifiers is in hierarchical order and is separated by ".". For example, the pointer forward is addressed as

    list_element.forward

and the integer weight is addressed as

    list_element.data.weight

"list_element.weight", "weight", "data.weight", and "forward" would not be legal syntax in any construct. If desired, for example when passing data to a procedure, one may specify a whole structure or substructure by addressing down to the level desired. For example, "list_element" or "list_element.data".

The generalized data access facilities allow the programmer to specify the action that is to be taken when data is fetched from the variable (the variable appears on the right hand side of an assignment statement) or stored into it (the variable appears on the left hand side of an assignment statement). In addition, numerous side actions may be specified. This facility is of greatest use when manipulating tables, queues, and stacks which have complex data structures and access algorithms. In access declarations and definitions, the key words access, stack, table, and queue may all be used interchangeably.

For a simple example consider a queue consisting of integers which are accessed in a first in first out manner such that storing into the queue causes a new entry to be made and fetching causes the oldest entry to be deleted. Two constructs are required: the definition of the data type and a declaration of an identifier which is this type. First the definition:

     1.  queue fifo;
     2.  begin
     3.    integer (number_of_entries, oldest_entry, newest_entry) ← 0,
     4.      entry_list[0:99];
     5.    integer fifo procedure fetch;
     6.    begin
     7.
fetch <- entry_list[oldest_entry <- (*+1) mod 100];
     8.     number_of_entries <- *-1
     9.   end fetch;
    10.   fifo procedure store (X); integer X;
    11.   begin
    12.     entry_list[newest_entry <- (*+1) mod 100] <- X;
    13.     number_of_entries <- *+1
    14.   end store;
    15. end

A line by line description of this definition will provide insight into many OSL/2 constructs.

1) On this line the new access type "fifo" is declared. Within the scope of this definition, the identifier "fifo" will be used to declare "fifo" queues.

2) All the information concerning an access type is local to the access variable. This prevents tampering with access variables through any means other than the access procedures declared in the access definition. Thus the definition block is a true block enclosed in a begin and end (lines 2 and 15) that restricts the use of internal access variables.

The access procedures declared within the definition are pseudo local to the type fifo. That is, it is possible to write the OSL/2 code "length(X)" in one place and have it represent a call to one of many procedures called length, depending on the declaration of "X". "length(X)" might refer to a standard procedure call with one actual parameter. If "X" is an access variable and an access procedure, "length", is declared for that access type, then the appropriate access procedure will be invoked. If several access types are available, each with its own procedure named "length", the particular procedure "length" declared to be part of the access type of the access variable will be invoked.

3) Three integers are declared and are initialized to zero.

4) An integer array indexed from zero to 99 is declared.

5) This line declares the fetch procedure. The fetch procedure will automatically be invoked whenever a fetch is made on a variable declared to be fifo. The result will be an integer.

6) The fifo procedure fetch has a compound statement as a body. Thus, it is enclosed in a begin end pair.
7) The return value of the function "fetch" is set here. The value returned is in the array "entry_list". The index of "entry_list" is an embedded integer assignment statement. The index will be the value of "oldest_entry" after the assignment is performed. Notice the use of the asterisk in the right hand side of the assignment statement. This is a shorthand for the variable used on the left hand side of the assignment statement. Only one such asterisk is allowed and it must be the first integer quantity (primary) encountered on the right hand side. Thus

    A <- * + 1

is the same as

    A <- A + 1

but

    A <- 1 + *

is not allowed. Note that this expression increments "oldest_entry" and uses the mod operator to map its values cyclically over the range zero to 99.

8) This line is unnecessary now but will be used later. It maintains a count of the number of entries currently in the fifo queue. The use of the asterisk convention is illustrated here to decrement the number of entries in the queue.

9) This end closes the fetch procedure compound statement. Note that comments consisting of letters and digits are allowed following an end.

10) The store procedure is declared on this line. A store procedure is automatically invoked when an assignment is made to an access variable. Store procedures have one parameter, which is the type of the data which may be stored into the access variable. The type of data fetched from and the type of data stored into an access variable need not be the same type. The identifiers "fetch" and "store" indicate that the fifo procedures with those names have special restrictions as to their automatic invocation and number of parameters. In this case the data stored into the fifo access variable is integer data.

11) This begin starts the compound statement which is the body of the procedure "store".

12) In this case, an entry will be made in "entry_list". Otherwise, this is similar to line 7.
Note that, as in line 7, an embedded assignment is used to compute the subscript of "entry_list". This position in the array receives the value to be stored.

13) In a manner similar to line 8, the "number_of_entries" variable is incremented.

14) As in line 9, this line terminates the procedure store and also contains a comment.

15) This line terminates the fifo definition.

In order to use fifo variables, one must declare them. For example:

    fifo queue X, Y, Z

declares three fifo variables X, Y, and Z. Some examples of use follow:

    X <- 0              puts a zero on the tail of X.
    Y <- Y              removes the head of Y and puts it at the tail of Y.
    Y <- Y <- Y <- 1    puts 3 ones into Y.
    Y <- Z <- X         removes the head of X and puts it on the tail of Z and Y.
    Y <- Y <- Y <- Y    is equivalent to Y <- (Y <- (Y <- Y)).

Note that subscripts and all other explicitly stated operations in an assignment statement are evaluated from left to right. However, assignments are made right to left. Thus the invocations of access store procedures, which occur at assignment time, are right to left.

Now that the basic access data type has been demonstrated, assume that the coder wishes to add a facility to his fifo queues to enable him to find the length of the queue. The following access procedure declaration may then be added to the definition between lines 4 and 5, 9 and 10, or 14 and 15.

    integer fifo procedure QSIZE;
    QSIZE <- number_of_entries;

With this addition, the coder can write

    QSIZE(X)

and know the number of entries in X. Note that the parameter "X" ("X" is a fifo queue variable) does not appear in the declaration of "QSIZE". All access procedures are called with one more parameter than is declared. By convention, this additional parameter is the first parameter and indicates which access variable is to be operated upon.

Some additional features of the access facility are demonstrated by this example:

     1. stack simple (S); integer S <- 0;
     2. begin
     3.
integer number <- 0, default_value <- S;
     4.   pointer head <- null;
     5.   node structure entry(
     6.     integer item;
     7.     pointer next <- head);
     8.   integer simple procedure fetch;
     9.   begin
    10.     pointer temp;
    11.     if head = null
    12.     then leave fetch (default_value)
    13.     else
    14.       fetch <- head.entry.item;
    15.       temp <- head;
    16.       head <- head.entry.next;
    17.       release (temp.entry)
    18.     fi
    19.   end fetch;
    20.   simple procedure store(X); integer X;
    21.   begin
    22.     head <- allocate(entry);
    23.     head.entry.item <- X
    24.   end store
    25. end;
    26. simple stack X, Y(A*B);
    27. X <- 1;
    28. Y <- X <- A+B;
    29. X <- X;
    30. X <- X <- X;

Note that in line 1 the access type "simple" has a formal parameter which is used on lines 3 and 12 to return a default value if the simple stack is empty. Note also that on line 5 a node structure is declared to hold the entries in the stack. Node structures, like PL/1 controlled storage, may be allocated and deallocated by explicit programmer command (cf. the allocate and release statements on lines 17 and 22). On line 7 of this definition, the substructure pointer "next" will be initialized to the value of "head" each time the structure entry is allocated.

Lines 12 and 14 illustrate the two ways to return values for a function. On line 12, the escape statement "leave fetch (default_value)" sets the function value and exits. On line 14, the function value is set but computation continues.

When specifying a node variable, if a pointer to the variable is not given, the last allocated instance of that variable is assumed. Thus on lines 14, 16 and 23, the phrase "head" need not appear before the phrase "entry".

On line 26, the simple stacks "X" and "Y" with respective defaults of zero and A*B are declared. In the definition of simple stacks on line 1, the parameter "S" was given a value of zero which was to be used if the parameter was not supplied at declaration time.
Since the parameter was omitted in the declaration of "X", zero was assumed.

One might add the following access procedure and internal procedure between lines 19 and 20.

    19.1  integer procedure glorp(n,p); value n,p; integer n, pointer p;
    19.2  begin
    19.3    if p = null then leave glorp (default_value)
    19.4    else if n=0 then leave glorp (p.entry.item)
    19.5         else leave glorp (glorp (n-1, p.entry.next)) fi
    19.6    fi
    19.7  end glorp;
    19.8  integer simple procedure scan(n); value n;
    19.9    integer n <- 0;
    19.10   scan <- glorp(n,head)

This declares the procedure "scan" which may be used to non-destructively access any item in the stack. Note that procedure "scan" has an optional parameter "n" which is assumed to be zero, the top of the stack, if it is not supplied by the coder (lines 19.8 and 19.9). The recursive procedure "glorp" is local to the "simple" stack definition block and may only be used by the procedures within the definition block. The procedure glorp does the actual search through the stack for procedure "scan". "scan(X)" and "scan(X,0)" will both return a value equal to the top element on the stack X. "scan(X,n)" will return element "n" or the default value (zero in this case) of the stack if the stack does not contain an nth entry.

Access definitions and procedures may redefine the interpretation of subscripts. One may follow the formal parameters of an access definition with array parameters. The parameters that appear in the array part need not be arithmetic; they are handled the same as any formal parameter.

Example

    access formatted_file (A,B,C) [X1:X2, X3, X4:X5:X6];
    integer A,B,C,(X1,X2,X3,X4,X5,X6) <- 0;

The array parameters in the access definition define the interpretation and format of the array part of an access declaration. The array parameters may be declared and defaulted in the same manner that other formal parameters are. The array part parameters are then input in the same fashion as in any other declaration.
Example

    formatted_file access (F1,F2)(30,30,150), F3(30,150,150)[0:10],
        (F4,F5,F6)(30,30,150)[0:2, 13, 10:2:20], FA(30,150,150)

Subscripted access variables may appear in OSL/2 code. In each case the appropriate fetch, store, or special access function must have a subscript specification.

Example

    definition:
        integer A[0:N,0:N];
        access transpose;
        begin
          integer transpose procedure fetch [B,C];
            integer B,C;
            fetch <- A[C,B];
          transpose procedure store (D)[B,C];
            integer B,C,D;
            A[C,B] <- D;
        end

    declaration:
        transpose X

    use:
        X[N,M] <- A[N,M]

The above access example produces a variable X which, when used as an array, produces the transpose of the matrix A.

The array facilities in access declarations and definitions allow one to interpret an array subscript as a particular queue or table entry or to imply virtually any other interpretation. Since the number of subscripts in the declaration and the format of the definition need not agree and can be defaulted, a maximum amount of freedom is allowed. For example, one could declare a special array that took advantage of a subscript declaration with this form: "A:B:C", where an array would be created with indices from "A" to "B" in steps of "C".

A fifo fetch procedure with the following heading could be written.

    integer fifo procedure fetch [N];
    integer N <- -1;

One could use the array subscript to specify a destructive or non-destructive fetch from the queue. If N were -1 (no subscript), then a destructive fetch of the oldest element would proceed. Otherwise, the Nth element would be non-destructively fetched.

In general the access procedure heading

    Pr(A1,A2,A3,...,An)[N1,N2,...,Nm];

would be called as

    Pr(Vr[N1,N2,...,Nm],A1,...,An)

where defaults may arbitrarily trim the actual number of N's or A's that appear and Vr is a particular access variable name. The actual parameters, both N's and A's, may be integers, booleans, strings, etc.
In particular, the subscripts (N's) need not be arithmetic and need not be interpreted as normal subscripts.

Access variables can be declared to make complicated table entries, perform formatted input/output, and monitor variables. Perhaps one of the greatest advantages of the access mechanism is the ease with which the coder can use it to measure a system. If one uses the access mechanism to perform all stack, queue, and input/output operations, a centralized place is provided to insert instrumentation. One merely adds to the access definition that code which is desired. It is then easy to keep statistics on virtually every access variable. Furthermore, it is easy to delete any measurement code once it has served its purpose.

2.3. Statements

2.3.1. Blocks, Compound Statements and Conditional Statements Are Recursively Defined.

That is, a single statement may contain one or more statements embedded within it. Each of these embedded statements may in turn contain further embedded statements, etc. This happens in one of two ways. First, assignment statements may be embedded in expressions which are themselves parts of statements. Second, many statements may be grouped together within a block, compound statement, case statement, or conditional statement.

If one places several statements end to end, separates them with semicolons, and then puts a begin in front and an end at the rear, he has a single compound statement. This compound statement is then treated as one statement, not the several of which it is composed. This is especially important when parsing case statements and procedure declarations.
The compound statement has the following basic form:

    begin S; S; S; S; S; S;...S; S; S end

If one takes the statement list previously described, precedes it with a list of declarations separated by semicolons, and then wraps it in a begin end pair, he has a single block, which is also a single statement. The basic form of a block is

    begin D; D; D; S; S; S;...S; S end

A case statement and a conditional statement take the same forms as case and conditional expressions (cf. 2.1.6.) except that they select from a choice of statements rather than expressions. Conditional and case statements have the following forms:

    if Boolean expression then S; S; ...; S else S; ...; S fi

    case integer expression of S_0; ...; S_(n-1); S_n esac

2.3.2. Iterative Statements

Iterative statements take many forms. They are used to control loops. One may step through a loop until some condition is satisfied, incrementing some index, or choosing items from a list. The forms of the iterative statement are generally self-explanatory. In the following forms several abbreviations are used. V is a variable, S a statement, I an integer, B a Boolean, Str a string, Ptr a pointer, Pat a pattern, and E an expression. For example, B.E. is a Boolean expression and I.V. an integer variable.

Iterative Statement Forms:

    1) while B.E. do S
    2) do S until B.E.
    3) for I.E. times do S
    4) for V <- E_0, E_1, ..., E_n do S

In form 1, the Boolean expression is evaluated and tested. If it is true, the statement following the do is executed and the whole process begins again with a reevaluation of the Boolean expression. If it is false, the statement is skipped and control continues after the iterative statement.

Form 2 is similar to form 1 except that the statement is executed first. Then the Boolean expression is evaluated. In this case, a false result continues the loop and a true result terminates the loop.
Form 3 simply specifies that a statement is to be unconditionally executed a fixed number of times.

Examples:

    while i < j do
      begin i <- i+1; some_function (i,j) end

    do A[i <- *+1] <- 0 until i = n

    for n times do factorial <- * * (i <- *+1)

Form 4 actually has many variations. The variable V may be integer, Boolean, string, pointer, or pattern. The types of the expressions E_0, ..., E_n are the same type as the variable V. In the basic form 4, the variable V is first assigned the value E_0. Then statement S is executed. Then V is assigned the value E_1. Then statement S is executed again. This continues through all of the E's.

If the variable in form 4 is an integer, some interesting variations are allowed in the E's. In this case the form of the E's may be

    4.1) I.E.
    4.2) I.E._1 to I.E._2
    4.3) I.E._1 to I.E._2 by I.E._3
    4.4) I.E._1 step I.E._2 until B.E.
    4.5) I.E. while B.E.

Form 4.1 is identical with a standard form 4 iteration.

Form 4.2 specifies that the variable will be stepped from I.E._1 to I.E._2 by the increment +1. If I.E._1 > I.E._2, no statement will be executed and the next E in the form 4 iteration expression list will be executed.

Form 4.3 is the same as 4.2 except that an increment is specified. In this case I.E._3 may be positive or negative, and the test for the end of this iteration expression is then made respectively on V > I.E._2 or V < I.E._2.

Form 4.4 specifies that V is to be set to I.E._1. The Boolean expression is then tested. If it is true, this iteration expression is finished. If it is false, then the statement S is executed; V <- V+I.E._2; and the Boolean expression is tested again.

Form 4.5 specifies that V <- I.E. If the Boolean expression is false, no statement is executed and the iteration expression is finished. If the Boolean expression is true, the statement S is executed; V <- I.E.; and a new test is made.

In the evaluation of the integer expressions above, bad side effects can present themselves. For example, the loop end, I.E._2
, in form 4.2 may be a complex expression that takes significant time to compute each time through the loop. Oftentimes, systems coders desire these effects. However, it is necessary to carefully evaluate each expression and assign it to a separate variable outside the loop if extra computations are to be avoided. Therefore, in each of the forms 4.1 to 4.5 one may write value (I.E.) instead of I.E. If this is written, the I.E. is evaluated only once, at the beginning of each iteration expression, rather than once for each iteration.

Examples:

    1. for S[I] <- "IF", "NOT", "MAYBE" do S2 <- *&S[I]

    2. for x <- 1, 2, 3, n step 1 until sqrt (x) < 100*n, 15, 16,
           i to value (j*f(n)) by 11, x+1 while x[()]; [ value , . . . , , ; ] [ name , . . . , ; ] [ ; . . . ; ]

Where the square brackets are metalinguistic and indicate optional syntax. The positions of the name and value parts may be interchanged.

Example:

    procedure add_3(A,B,C,N); value N; integer A,B,C,N;
    begin
      A <- *+N;
      B <- *+N;
      C <- *+N
    end

In this example, the formal parameters "A", "B", and "C" are called by reference and "N" is called by value. The statement "add_3(X,Y,Z,L+M+Q)" is equivalent to

    begin
      integer N_internal;
      N_internal <- L+M+Q;
      X <- *+N_internal;
      Y <- *+N_internal;
      Z <- *+N_internal
    end

If the formal parameter "N" was called by name, the statement "add_3(X,Y,Z,L+M+Q)" would be equivalent to

    begin
      X <- *+L+M+Q;
      Y <- *+L+M+Q;
      Z <- *+L+M+Q
    end

When a parameter is called by name, the entire actual parameter, with parentheses around it if necessary, replaces the formal parameter inside the procedure body. When a parameter is called by value, the actual parameter expression is calculated once upon entry to the procedure. This calculated value is stored in a variable local to the procedure and all references to the formal parameter are made from and to this local variable.
Call by value allows one to increase efficiency by eliminating many evaluations of the identical expression and allows one to avoid undesirable side effects. For example, if the actual parameter "N" was "L <- L+M+Q" in the above example and if "N" was called by name, the equivalent statement to "add_3(X,Y,Z,L <- L+M+Q)" would be

    begin
      X <- *+(L <- L+M+Q);
      Y <- *+(L <- L+M+Q);
      Z <- *+(L <- L+M+Q)
    end

Note that the value of "L" is continually changed. This may be a bad side effect if unexpected.

Call by value may be used to keep an external variable from being changed by a procedure call.

Example:

    procedure Add_N(A,N); value N; integer N,A;
    while N <- N-1 >= 0 do A <- A+1

While this may be the hard way to add "N" to "A", it does illustrate how one would protect the value of "X" in the statement "Add_N(X,X)". If "N" is called by value, this will double "X" (assuming "X" is positive). If "N" is called by name, this statement would be equivalent to "while X <- X-1 >= 0 do X <- X+1", which is an infinite loop.

The procedure "Add_N" also demonstrates the illegal statement side effect. "Add_N(X,23+16)" is legal if "N" is called by value. If "N" is called by name, this is equivalent to "while (23+16) <- (23+16)-1 >= 0 do X <- X+1". This clearly has an illegal assignment to "(23+16)", which is not a variable.

If a parameter is not specified to be called by name or called by value, then it is called by reference. Parameters called by reference are similar to FORTRAN subroutine parameters. If possible, a "reference" is a pointer to or an address of a single variable, a structure, an array, a substructure, or a single element of a structure or an array. This "reference" (pointer or address) is calculated once, just before procedure entry. If it is not possible to generate such a "reference", then the actual parameter will be called by value. This is generally the case when a complex expression or procedure call is used as a parameter.
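The difference between call by name and call by value described above can be imitated in a modern language by passing either an already computed value or a parameterless function (a "thunk") that is re-evaluated at every use. The following Python sketch is only an illustration under that assumption; the names Ref, add_3_by_value, add_3_by_name, and run are hypothetical and are not part of OSL/2.

```python
from typing import Callable, Tuple


class Ref:
    """A mutable cell standing in for a variable passed by reference."""

    def __init__(self, value: int) -> None:
        self.value = value


def add_3_by_value(a: Ref, b: Ref, c: Ref, n: int) -> None:
    # n was evaluated exactly once by the caller, before entry.
    a.value += n
    b.value += n
    c.value += n


def add_3_by_name(a: Ref, b: Ref, c: Ref, n: Callable[[], int]) -> None:
    # n is a thunk: the actual parameter is re-evaluated at every use.
    a.value += n()
    b.value += n()
    c.value += n()


def run(by_name: bool) -> Tuple[int, int, int, int]:
    """Model add_3(X,Y,Z, L <- L+M+Q) with L=1, M=2, Q=3 and X=Y=Z=0."""
    env = {"L": 1, "M": 2, "Q": 3}
    x, y, z = Ref(0), Ref(0), Ref(0)

    def actual() -> int:
        # The embedded assignment L <- L+M+Q; its value is the new L.
        env["L"] = env["L"] + env["M"] + env["Q"]
        return env["L"]

    if by_name:
        add_3_by_name(x, y, z, actual)
    else:
        add_3_by_value(x, y, z, actual())
    return x.value, y.value, z.value, env["L"]
```

Under these assumptions, run(False) changes L once and adds the same amount to all three variables, while run(True) changes L three times and adds a different amount to each.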
Example:

Assume the following global declarations

    structure S (integer X,Y);
    integer I,L,M,N;
    integer A[0:10];
    structure Big (integer X,Y [0:10]);
    procedure P (I,A,S);
      integer I,A[0:10];
      structure S (integer A,B);
    begin end P

Let us call the procedure "P" with the following actual parameters for "I".

    S.X, A[I], A[I <- I+1], Big.S.Y, L, N
        These parameters will have references calculated. Note that each subscript will be computed only once.

    L+N+M, I <- I+1, A[I]+A[I+1], S.X*N
        These parameters will be reduced to call by value parameters.

Let us call "P" with the following actual parameters for the formal parameter "A".

    A, Big.Y
        References will be calculated for these arrays.

Let us call "P" with the following actual parameters for the formal parameter "S".

    S, Big.S
        References will be calculated for these structures.

if I0) then no further action is taken. If the value of s is non-positive (s <= 0) then one or more processes were blocked and are booked on a waiting list. In this case one process is removed from the waiting list and may now proceed.

P and V operators can be used to implement complex waiting queues and priorities. Therefore, the primitive queuing algorithm is not specified. Regardless of the original queuing algorithm on a single semaphore, a new one can be built through the use of one or more semaphores.

P and V provide access to critical code or some system resource. To implement any scheduling scheme on x, one could introduce a process X which schedules x. A process which desires x would then use one semaphore to pass this request to X and another to block its continuation until x was available.

Example:

    P(msg);     seize message path to X
    send message to X requesting x
    V(msg);     release message path to X
    P(x_i)      seize x
    use x
    P(msg);     seize message path to X
    send message to X releasing x
    V(msg);     release message path to X

Note in the above example that each process would use the "msg" semaphore to communicate with X, but each process would use a different semaphore (indicated by x_i) to block its access to x. The process X would perform any necessary V(x_i) to permit a particular process (the ith process, to which semaphore x_i refers) to have access to x. Clearly, X can implement an arbitrary booking and dispatching algorithm for x.

2.5.3. Processes

OSL/2 processes are separately running, asynchronous procedures or programs. Six primitives control these processes: spawn, append, initiate, suspend, restart, and terminate.

The spawn primitive allows one to spawn, from inside a process, another process which will run at the same level as the spawning program. The spawning program and spawned program have no special relationship. If the already executing process A of operating system OS spawned process B, the relationships would be diagrammed as follows.

    A     B
      OS

Any parameters passed to the spawned process are called by value if they are defined within the spawning process.

Examples:

    spawn (program)
    spawn (routine (x,y,z))

A process may append a procedure. An appended procedure has access to all normally addressable constructs. A process may not leave a block until all procedures appended in that block have returned. Appended procedures run asynchronously with the appending process.

Example:

    begin
      integer (A,B,C)[0:9], X,Y,Z, N <- 9;
      procedure inner_product (A,S); integer A[0:N], S;
      begin
        integer sum <- 0, i;
        for i <- 0 to N do sum <- sum + A[i]*A[i];
        S <- sum
      end;
      append (inner_product(A,X));
      append (inner_product(B,Y));
      inner_product(C,Z);
    end

In this example three processes would be running simultaneously.
Two new processes would be created to perform the procedure "inner_product" on arrays "A" and "B", and the original process would calculate "inner_product" of "C". All three processes have access to the same global variable "N", but each inner product procedure has its own local variables "i" and "sum" which are not addressable by the other processes.

Initiate creates a new process that is to be monitored by the initiating process. The new process may not directly address constructs within the initiating process. All communication to the initiating process must take place through the primitives defined by the initiating process and any parameters passed to the process. A process variable is attached to each such process. The process variable is used to "suspend", "restart", and "terminate" the initiated process.

Example:

    process p1,p2,p3;
    initiate (program, p1);
    initiate (program, p2);
    initiate (user_job (10,file_23), p3);
    suspend (p1);
    suspend (p2);
    restart (p2);
    terminate (p3);

If a terminate statement appears in a process without a process variable for a parameter, then the process in which the statement is executed is terminated.

Example:

    terminate

2.5.4. Traps

Traps are special conditions that arise in the course of executing an operator, for example, divide by zero or index out of range. If desired, a procedure may be executed to handle the special condition.

Examples:

    on index terminate
    on overflow leave overflow (99999)
    on divide_by_zero (Numerator) leave divide_by_zero (Numerator)

Traps vary from machine to machine. Therefore, each OSL/2 implementation must define reserved or key words for each such trap. The trap declaration consists of the reserved word on followed by the trap condition followed by an optional set of parameters followed by a statement. The parameters contain pertinent information about the environment.
For example, an implementation might return the value of an index causing an index fault and also the array bounds involved. While index out of range, integer overflow, and divide by zero are clearly required in OSL/2, the definition of the trap environment (i.e. trap parameters) is left to each implementation.

The trap facility allows a way of specifying special operators for a microcoded machine or perhaps a way of changing the semantics of existing operators. For example, one implementation might allow a trap on all integer divides so that the coder could define the semantics of divide.

2.5.5. Primitives

The reinterpretation of the primitives between system layers is frequently implementation dependent. While OSL/2 specifies the manner in which reinterpretation will be performed, the actual specification of a primitive may vary from installation to installation.

In general, the scope of primitive declarations is the same as for normal procedure declarations, with the following extensions. Within the declarations at the head of a block, all accesses to a primitive made lexicographically prior to the new declaration of the primitive are valid and will access the old primitive. Furthermore, the types of primitive functions must be maintained. Therefore, it is not permitted to change ptr (x) from a pointer primitive to a string primitive.

Example:

    begin
      integer X[0:10000];
      pointer primitive allocate ( ) ;
      integer A[0:100]
    end

In this example, an implicit call on the old primitive "allocate" is performed to allocate the array "X". Note that the reserved word primitive is used in place of the reserved word procedure when redefining a primitive. Once "allocate" is redefined, all allocation will be performed by the new primitive. Thus the array "A" will be allocated by the new primitive. Since the only space addressable by the primitive "allocate" is the array "X", "A" must be allocated inside of "X".
Furthermore, any processes which are appended or initiated (but not spawned) within the scope of the new allocate must use the new allocate.

The number and type of the parameters into which a compiler will map an allocate statement is not defined. OSL/2 originally had a data type "core" into which allocations could be made. This facility was deleted because it restricted the class of addressing schemes allowed in the hardware. In order to permit the maximum flexibility for future technological developments, such as cheap content addressed memories, the manner of allocation and the format of the parameters are left undefined.

Most of the arithmetic function primitives will not be changed. The allocate, release, and terminate primitives might be changed in a normal system. The primary use of the primitive construct will be to define additional primitives for use at higher levels in a multi-layer operating system.

2.6. Compile Time Facilities

Operating system codes have a strong need for a parameterized macro facility and some conditional compilation. These facilities allow a significant isolation of changeable parameters, and they also allow a programmer to write and document a single system that has several, possibly conflicting, portions that are conditionally inserted on an installation by installation basis.

The define facility used in Burroughs B5500 [23] and B6500 [24] ALGOL is used in a slightly modified form in OSL/2. One may define a string value for an identifier.

Examples:

    define ABC = begin A <- 0; B <- N; C <- M end #;
    define Xsize = 100#, Ysize = 37#;

The "=" and "#" bracket the string. The appearance of the defined identifier later in the block in which it is defined will cause the identifier to be replaced by the characters enclosed in the "=" "#" brackets. In the above example,

    A <- Xsize

is equivalent to

    A <- 100.

Defined identifiers may appear in defines nested to any level. Furthermore, defines may appear within defines, " define identifiers .
.#" is considered a bracket construct that may appear in a define.

Defines may have parameters.

Examples:

    define add (A,B) = ($A)+($B)#;
    define concat (A,B) = "$A$B"#;

In the above cases, add (10*L,13) is equivalent to (10*L)+(13) and concat (ABC,DEF) is equivalent to "ABCDEF".

The dollar sign ($) denotes compile time facilities. One may declare compile time Booleans, integers, and strings.

Example:

    $integer A,B,C,D <- 10,E;
    $string (S1,S2) <- MT;

These variables must be preceded by the dollar sign when they are used. An assignment to a compile time variable is made at the time it is scanned. The expression assigned must therefore be calculable at compile time. A compile time variable which appears in an expression is replaced by a constant whose value is the same as the current value of the variable.

Examples:

    $A <- $D+1;
    X <- 10 + $A*$D;

is equivalent to

    X <- 10 + 11*10;

Most OSL/2 statements may be compile time statements.

Examples:

    $if $A<10 then ... else ... fi
    $for $A <- 1,13,2 do X[P(QRZ),$A] <- L*13

In both of the above examples, the $if and $for did not require the "$". In the if case, the expression being tested was a constant; therefore, a good compiler would only have compiled one branch of the conditional anyway. In the case of the for statement, the use of a compile time variable as the iteration control variable indicated that the statement was a compile time statement. The for statement is equivalent to:

    X[P(QRZ),1] <- L*13;
    X[P(QRZ),13] <- L*13;
    X[P(QRZ),2] <- L*13

The statements while ... do ..., do ... until ..., and for ... times do ... must always be preceded by a dollar sign if they are compile time statements.

Chapter 3

System Measurement, Modification, and Debugging in an OSL/2 Environment

Operating systems are not static. They are constantly being changed. These changes either improve the efficiency of the system, add features to it, or correct errors.
When new systems are coded, the most significant question facing the implementor is: "How well does it run?" This question is answered predictively by a closed form analysis of the system or by simulation. Once implemented, the question is answered by measuring the system.

Due to the complex interaction of system modules and the stochastic nature of the inputs, a closed form prediction of the performance of an entire operating system is usually not possible. However, many individual modules in a system may be amenable to some greater or lesser degree of analysis. Simulation is more commonly used when predictive estimates of an entire system's performance are desired. Unfortunately, it is usually very difficult to accurately estimate the loads on a system. If one assumes, however, that an accurate model of the system is available, then it is certainly reasonable to run this model in a variety of modes and under a variety of loads. In this manner, the areas of competence and incompetence of the system can be discovered a priori. This information can then be used by the managers of an installation to make the loads placed on a system fall in the area of the operating system's greatest competence.

The strict isolation of processes and the nature of the synchronizing primitives P and V in OSL/2 make OSL/2 a reasonable simulation language. Therefore, it is possible to avoid a separate simulation step by using OSL/2 for both the simulation and the final system code.

OSL/2 has no concept of an interrupt. Systems coded in OSL/2 are command driven. That is, they consist of separate processes for each task to be performed. Each task uses the P and V operators to get and dispatch work. Therefore, each process is either waiting for a command to begin work (P operation), issuing a command to another task to begin work (V operation), or performing its function.
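The command driven structure just described can be illustrated outside OSL/2. The following Python fragment is only a sketch under stated assumptions: `threading.Semaphore.acquire` and `release` stand in for the P and V operators, and the names `run_demo`, `worker`, `work_ready`, `work_done`, and `mailbox`, as well as the use of `None` as a shutdown command, are hypothetical and do not come from the OSL/2 definition.

```python
import threading


def run_demo(commands):
    """Drive one worker process purely with P (acquire) and V (release)."""
    work_ready = threading.Semaphore(0)  # V'ed by the dispatcher, P'ed by the worker
    work_done = threading.Semaphore(0)   # V'ed by the worker, P'ed by the dispatcher
    mailbox = []                         # carries the command that goes with each V
    results = []

    def worker():
        while True:
            work_ready.acquire()         # P: wait for a command to begin work
            command = mailbox.pop()
            if command is None:          # assumed shutdown convention
                work_done.release()
                return
            results.append(command * 2)  # the process "performing its function"
            work_done.release()          # V: report that the work is finished

    thread = threading.Thread(target=worker)
    thread.start()
    for command in list(commands) + [None]:
        mailbox.append(command)
        work_ready.release()             # V: issue a command to the worker
        work_done.acquire()              # P: wait until the worker has handled it
    thread.join()
    return results
```

At any moment the worker is in exactly one of the three states named in the text: blocked in a P operation, issuing a V operation, or doing its work. No interrupt handler appears anywhere in the sketch.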
Since OSL/2 systems have well defined interfaces to both the outside world and internal processes, it is a relatively straightforward (although frequently tedious) procedure to simulate an OSL/2 coded system in total or in part by appropriately setting global variables and issuing P and V operations to the various semaphores of the system.

The generalized data access facilities of OSL/2 allow access to every transaction performed on any data declared with this facility. Therefore, one may specify virtually any information he wants recorded on access data, the manner of recording, and the manner of read out. Since this is done only once, in the access definition, a single place is provided where all measurements of queues, stacks, tables, etc. may be inserted and deleted. This facility allows the system designer, subject only to the constraint of the Heisenberg Uncertainty Principle, to readily measure an OSL/2 coded system a priori during simulation or a posteriori after implementation.

Changes to an OSL/2 coded operating system may take place in several ways. The generalized data access facilities provide an easy way to change queuing algorithms and add special features to complex data structures. One may add new facilities by adding a new primitive to an existing system or by introducing a new layer to the system and adding the additional feature in the new layer. This latter technique is especially desirable if one has a well functioning piece of code which he does not want disturbed. Adding the new layer to the system leaves the original system undisturbed. Furthermore, the new layer is transparent to layers above which are using the primitives in the older, lower system.

Since OSL/2 code can express only nested loops and procedures and cannot express interlocking loops with deliberate transfers to various portions of code and returns via indexed and non-indexed go to's, the code tends to be more clearly structured and therefore easier to debug.
However, it is still possible to write bad code and design bad logic. OSL/2 does not attempt to solve the coding or logic error problem. The generalized data access facilities provide a sophisticated data monitoring tool of considerable power. However, no special debugging facilities have been added to the language.

Balzer has done some excellent work with EXDAMS [20] on a 360 PL/1 system. EXDAMS provides a method of keeping a history of a program's execution. After a run is completed, this history is coupled to a graphic display which is used to selectively examine variables and run the program backwards as well as forwards to discover where errors lie. This history has links between the execution variable and time space and the source program space. These links are used to immediately see which program statements caused the actions indicated on the graphic display. OSL/2 is amenable to this treatment, as are most higher level languages.

In general OSL/2 appears to be a powerful language for coding, expanding, and measuring an operating system. While OSL/2 code tends to be cleaner and thus easier to debug, special debugging features have not been provided.

Chapter 4

The Hardware Requirements of OSL/2

It was the original intention of the author to design and simulate an OSL/2 machine, that is, a machine especially suited to executing OSL/2 code. This project was not completed for three reasons. First, the amount of machine time required to build and run a simulator was not available. In fact, the author was requested to sharply reduce the machine time spent building an OSL/2 compiler. Second, the human time required to design the OSL/2 machine in detail and to code a simulator was significantly underestimated by the author. Third, since the author had gone through the exercise of designing the language with an actual implementation in mind, it is doubtful that the completion of an implementation would significantly alter the concepts in OSL/2.
The author actually wrote OSL/2 code, wrote an OSL/2 compiler, and designed the language with a particular hardware implementation in mind. The coding of a compiler and the resulting object machine considerations had their primary impact on the syntactic description of the language. Similarly, the production of OSL/2 code changed the syntax again. The only concept significantly altered was the ability to pass whole declarations into access definitions as parameters to be supplied at access variable declaration time. The intent was to allow the programmer to describe a data independent access technique and introduce the actual types of data after the definition of the access type. For example, this would enable a programmer to define a first in first out queue and at declaration time describe the data being entered and removed from the queue. This proved to be quite difficult to implement. Later codes showed that this feature was still desirable but the language was not badly hurt by its absence.

The author had had experience with a large number of machines of first through fourth generation architecture. The ease with which one could compile code for Burroughs machines and the similarity between higher level language constructs and the machine operations in the B5500 and B6500 originally prejudiced the author toward those designs. The OSL/2 compiler assumed that the object machine, like a B6500, was a zero address stack machine with each word in memory tagged to indicate the type of its contents. Furthermore, a block level relative addressing scheme with arrays indexed via B6500-like descriptors was assumed [21]. This made the object code relatively easy to produce. However, the author soon discovered that the isolation of types (i.e., they could not be mixed in expressions) and the explicit coding of type transfer functions made the use of tagged words to distinguish types unnecessary.
Furthermore, since OSL/2 procedure declarations require the complete specification of all parameters, the compiler can check at compile time for the erroneous passing of parameters and procedures to procedures. Thus the compiler knows at all times exactly which types it is working with. In this case, most of the type tags on words were necessary only to protect the system from compiler errors and not to aid the compiler.

Virtually all of the OSL/2 constructs can be implemented on any machine upon which FORTRAN may be implemented. The exceptions are call by name procedure parameters and the process control primitives.

With the exception of the Burroughs compilers, call by name has been a problem since it was first introduced in ALGOL 60. The normal solution has been the "thunk". Since call by name parameters are evaluated each time they are referenced in a procedure, a procedure call, or "thunk", is substituted for the reference. Then each time a store or fetch is to be made, this procedure is executed to produce an address or value. On the B5500 and B6500 a hardware operand call is made. The tag bits of the operand tell the computer whether a simple fetch or a procedure entry and complex expression analysis must be performed to produce a value. The operand call instruction is much more complex than the most intricate instructions on conventional machines. It does, however, replace a "thunk" with a single hardware operator of much greater efficiency.

Call by name is a concept absent from most languages. It is included in OSL/2 for two strong reasons. First, it allows the use of side effect processes in a simple way. The concept of a continuously evaluating expression for use in calculating process priorities is an example of how this feature could be well used in operating system codes. Second, it is the nature of access data that they produce complex side effects when they appear in expressions.
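The "thunk" technique described above can be shown in software. The following Python fragment is a minimal sketch, not the code any OSL/2 or ALGOL compiler generates: the class `NameParam` and the functions `sum_by_name` and `demo` are hypothetical names, and closures play the part of the compiler-generated procedures. The example follows the style of Jensen's device from ALGOL 60, in which an expression passed by name is re-evaluated every time it is referenced.

```python
class NameParam:
    """A thunk pair: the actual parameter is re-evaluated on every reference."""

    def __init__(self, fetch, store=None):
        self._fetch = fetch   # procedure executed on each fetch
        self._store = store   # procedure executed on each store, if assignable

    def get(self):
        return self._fetch()

    def set(self, value):
        if self._store is None:
            raise TypeError("actual parameter is an expression, not a variable")
        self._store(value)


def sum_by_name(index, term, lo, hi):
    """Sum `term` while stepping `index`; both are call by name parameters."""
    total = 0
    for i in range(lo, hi + 1):
        index.set(i)          # store through the thunk
        total += term.get()   # fetch re-evaluates the caller's expression
    return total


def demo():
    env = {"i": 0}            # the caller's variable i
    data = [10, 20, 30, 40]
    index = NameParam(lambda: env["i"], lambda v: env.__setitem__("i", v))
    term = NameParam(lambda: data[env["i"]] * env["i"])
    return sum_by_name(index, term, 0, 3)
```

Every fetch of `term` costs a full procedure call here; that is the overhead the text says a hardware operand call removes.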
A call by name concept is required if expressions containing access variables are to be passed to procedures.

In order to produce an efficiently running OSL/2 machine it is considered essential that the hardware provide a call by name facility. Such a facility seems to be easier to implement on a stack machine than on a normal multiregister machine. This is because of the ease with which evaluation procedures may be automatically entered and the process environment saved in the stack. To enter a procedure and save the process environment is a more difficult matter in a register machine, where the process environment may be determined by an arbitrary number of the registers.

The problem of interlayer communication is severe. No hardware known to the author is capable of both allowing access to variables in the system at all levels and rigorously assuring that variables are not inadvertently altered by an erroneous compiler or operating system. Ordinary relocation and bounds protect or virtual memory addressing will not work. If some layer of the operating system makes a mistake while setting the bounds register or relocation register, all is lost. In a similar manner, a virtual memory system might have an error made while its segment table was set up.

However, a few extraordinary techniques help solve these problems. For example, the register loading or segment address loading instruction could be made a special operator that worked only with an already set relocation and bounds register or an existing segment table. The operator could be designed to not create any segment address or register setting that would violate the existing settings. Thus one could guarantee that higher layers of the system would not inadvertently bomb lower layers if the lower layers properly initiated the upper layers. Techniques like these can be used to remove the concept of privileged instructions from a machine.
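The idea of a register loading operator that cannot violate the existing settings can be sketched in software. The fragment below is an illustration only, with assumptions stated openly: the class `BoundsRegister`, its half-open interval `[base, limit)`, and the method names `narrow` and `permits` are hypothetical and do not describe any particular machine.

```python
class BoundsRegister:
    """A relocation and bounds setting covering addresses base <= a < limit."""

    def __init__(self, base, limit):
        if not 0 <= base <= limit:
            raise ValueError("malformed bounds")
        self.base = base
        self.limit = limit

    def narrow(self, new_base, new_limit):
        """Load a new setting; succeeds only if it lies inside the current one.

        This plays the part of the special load operator: it needs no
        privileged mode, because it can never widen the address space.
        """
        if not self.base <= new_base <= new_limit <= self.limit:
            raise PermissionError("new bounds would violate the existing setting")
        return BoundsRegister(new_base, new_limit)

    def permits(self, address):
        return self.base <= address < self.limit
```

A lower layer that initiates an upper layer with `narrow` hands over a setting the upper layer can shrink further but never enlarge, so the operator can safely be issued at any level.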
Each operator is constructed so that it has a built in check that prevents it from violating its own environment. Thus any operator (instruction) in the machine can be allowed at any level.

Where this scheme falls on hard times is in the passing of parameters between system layers. How does system layer three pass a file and an integer to system layer ten for its use without allowing access to all of layer three's variables? The setting of protection bits clearly gets out of hand immediately if layer 3 frequently gets control of layer 10. Then the setting and resetting of protection bits becomes a very significant overhead.

The author feels that the solution to these problems lies in a suitably defined addressing scheme: one in which it is simply not possible to generate an address outside of one's allowed space.

The addressing scheme will be described in several steps: first in terms of a single program, then in terms of a program with several asynchronous processes, and finally in terms of a multilayered operating system.

A stack machine is assumed with several level registers. Each level register points to the base address of a block level in an OSL/2 program.

Example:

    0   begin integer a,b,c; Boolean X;
    1     begin string s,m,y
          end;
        end

    Base of stack        (Level register,       Variable
    relative address     displacement) couple
    6                    (1,2)                  y
    5                    (1,1)                  m
    4                    (1,0)                  s    level 1 = base + 4
    3                    (0,3)                  X
    2                    (0,2)                  c
    1                    (0,1)                  b
    0                    (0,0)                  a    level 0 = base + 0

Figure 1. Variable stack while in block 1.

In order to address the variable "a" in block 0, the address couple (0,0) would be used: level 0, displacement 0. To address the variable "c", the address couple (0,2) is used: level 0, displacement 2. Similarly, the variable "m" in block 1 is (1,1): level 1, displacement 1. One could even put bounds on each level register to prevent inadvertent addressing into the next higher level.
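The translation of a (level, displacement) couple into a base of stack relative address can be sketched as follows. This Python fragment is a hedged illustration of the layout in Figure 1, not the OSL/2 machine itself: the class `VariableStack` and its methods `enter_block` and `address` are hypothetical names, and the optional bounds on each level register mentioned in the text are modeled by using the next level's base as the limit.

```python
class VariableStack:
    """A program stack addressed by (level, displacement) couples."""

    def __init__(self):
        self.cells = []            # one cell per variable, base of stack at index 0
        self.level_registers = []  # level_registers[k] = base relative address of level k

    def enter_block(self, names):
        """Enter a new block level and allocate its variables on the stack."""
        self.level_registers.append(len(self.cells))
        self.cells.extend(names)

    def address(self, level, displacement):
        """Translate an address couple into a base of stack relative address."""
        if not 0 <= level < len(self.level_registers):
            raise IndexError("no such level register")
        base = self.level_registers[level]
        if level + 1 < len(self.level_registers):
            limit = self.level_registers[level + 1]  # bound: start of the next level
        else:
            limit = len(self.cells)                  # bound: current top of stack
        if not 0 <= displacement < limit - base:
            raise IndexError("displacement outside the block level")
        return base + displacement


def figure_1_stack():
    stack = VariableStack()
    stack.enter_block(["a", "b", "c", "X"])  # block 0
    stack.enter_block(["s", "m", "y"])       # block 1
    return stack
```

With the stack of Figure 1, the couple (1,1) resolves to base relative address 5, the cell holding "m", while a couple such as (0,4) is rejected instead of silently reaching into level 1.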
Then one would have a set of miniature relocate and bounds registers. Actually, this is not necessary but desirable. In fact, the entire program stack is an array with bounds protect and a base of stack pointer. Furthermore, the distinction between levels is made more clear by inserting special data into the stack at each break in a level to indicate the old value of a level register. This is necessary when doing recursion. This additional data is used to pass parameters to procedures.

Example:

    0   begin integer a,b,c;
          procedure addl (x,y); integer x,y;
    1       begin integer Z;
              Z ← 1; x ← y+Z
            end addl;
    1     begin integer m;
    2       begin integer n;
    3         begin integer r,s;
                addl (r,m);
              end
            end
          end
        end

[Figure 2. Variable stack after entering the procedure "addl".]

[Figure 3. Two stack layout if "append (addl(r,m))" is written instead of "addl(r,m)".]

In this example the block levels have been numbered.
The procedure "addl" creates variables at level 1, but it is passed variables from level 3 and level 1. It is not possible for a procedure running at level n to use level n+1 or higher level registers. Furthermore, it is not possible for the procedure "addl" to maintain two level 1 registers, one to address its own variable "Z" and one to address the actual parameter "m". To bypass these problems, base of stack relative addressing is used to pass the parameters. By using base of stack relative addressing it is possible to create indirect reference words that can be used through any number of recursions.

To enter the procedure in the above example, an operator was executed to leave space to save level register 1. Then another operator was executed to create an indirect reference word to "r". This was repeated for "m". Finally the procedure entry operator was executed to relink the level register and enter a new code segment. So far this is very similar to the B6500 manner of addressing and procedure entry.

Things start to get sticky when procedures are appended as separate asynchronous processes rather than executed in line. In this case there must be several stacks (one for each appended process), each with the same base stack. If the procedure "addl" had been appended in the last example, there would have been two processes running, each with its own set of level registers to define its environment. Note in Figure 3 that the level 0 register in the appended process points to level 0 in the main process. Therefore the main program and the appended procedure both reference the same level zero variables "a", "b", "c". However, the level 1 registers in each process are different and reference different variables.

In addition to the array of level registers, an array of stack fragment registers is introduced.
Now the operator that translates level displacement addresses into stack displacement indirect reference words not only produces a stack displacement but also produces a stack fragment index in the indirect reference word. If yet another "addl" were appended at this point, it would also receive separate arrays of level registers and stack fragment registers. The level 0 and stack 0 registers would point to the same place in both processes, but each of the appended processes would have different level 1 and stack 1 pointers.

If a process now develops a level displacement address, it must point to a legitimate address for the process. There is no way a variable in another appended process or separate program can be addressed with a level displacement address. Similarly, stack displacement addresses used for formal parameters cannot be created that point to illegal variables, since they were all created at some time from a legitimate level displacement address. Clearly stack fragment 1 of the appended procedure "addl" provides no way of referencing a stack fragment of another appended "addl" or another program.

The level registers and stack registers can be initialized using the environment checking techniques discussed earlier. Therefore the basic integrity of the values in the registers can be assured.

This is all fine for assuring a proper addressing space for a single multiprocess program or several running together. The problem now is to insure the integrity of the addressing space between operating system layers.

[Figure 4. Final addressing scheme. The diagram shows layer pointers, stack fragment pointers, and level pointers, each leading to stack fragments. Indirect reference words use the address triple (layer, stack, displacement). The standard addressing technique uses level displacement couples. The level pointer contains all required stack fragment references.]

To do this another array is introduced, the layer array.
Furthermore, the processor is given the knowledge of the layer at which it is operating. (This is just part of the initiate or restart operator.) When an indirect reference word is created, the operator also inserts the current system layer as well as the stack fragment and stack displacement into the word. A complete set of layer, stack fragment, and level pointers defines the process environment (cf. Figure 4).

Since higher layers of the operating system cannot create indirect reference words that point into lower layers, the higher layers have access only to the data in the lower layers that were passed to the higher layers as actual parameters. But to those variables they have direct access, as if they were in their own layer.

If one makes the final requirement that indirect reference words are tagged so that only special operators may access them, the integrity of the entire system is assured. Furthermore, foreign programs, appended processes, and layers are so arranged that no possible address can be created to reference them. Thus they are removed from the address space.

It should be noted that this scheme is not unreasonable. The ideas of the stack fragment array, stack fragments, and level registers are used on the B6500. However, the stack fragment array there is a single array 1024 long which contains every process in the machine. Thus the addressing spaces of programs are not disjoint. In order to address a second program, a stuffed indirect reference word specifying its stack number would suffice. The OSL/2 concept makes this array part of the environment of each process, thus separating the address spaces. The further addition of the layer array is straightforward.

The passing of information between layers is demonstrated to be feasible. The communication of semaphores as formal parameters is an obvious use of this interlayer communication.

The problem of the addition and changing of primitives does not require esoteric hardware.
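The address triple (layer, stack, displacement) carried by an indirect reference word can be sketched as follows. This Python fragment is an illustration under assumptions, not a description of the OSL/2 machine: the names `IndirectReference`, `Machine`, and `Process` are hypothetical, each process is given a single stack fragment for brevity, and the hardware tagging that keeps ordinary code from forging reference words is represented only by convention (references are obtained solely from `make_reference`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndirectReference:
    layer: int          # system layer of the process that created the word
    stack: int          # stack fragment index
    displacement: int   # displacement within that stack fragment


class Machine:
    def __init__(self):
        self.fragments = {}  # (layer, stack) -> list of cells


class Process:
    def __init__(self, machine, layer, stack, size):
        self.machine = machine
        self.layer = layer   # the processor knows the layer at which it operates
        self.stack = stack
        self.level_bases = [0]
        machine.fragments[(layer, stack)] = [0] * size

    def make_reference(self, level, displacement):
        """Turn a (level, displacement) couple into an indirect reference word.

        The creating operator always stamps the current layer, so a process
        can only manufacture references into its own environment.
        """
        base = self.level_bases[level]
        return IndirectReference(self.layer, self.stack, base + displacement)

    def fetch(self, ref):
        return self.machine.fragments[(ref.layer, ref.stack)][ref.displacement]

    def store(self, ref, value):
        self.machine.fragments[(ref.layer, ref.stack)][ref.displacement] = value
```

In this sketch a layer 3 process can pass one reference word to a layer 10 process as an actual parameter. The layer 10 process can then fetch and store through that word, but any word it creates itself carries layer 10 and so cannot name layer 3's other variables.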
Once interlayer addressability is defined, this problem falls under standard linkage editor techniques.

In conclusion, OSL/2 algorithmic techniques do not make severe requirements on the hardware architecture. However, the access variables imply a call by name facility which does require specialized hardware for efficient processing. The layering facilities do place severe restrictions on the hardware and require careful attention to the addressing problem.

Chapter 5

Conclusions

The data access techniques and process control concepts provide OSL/2 with the ability to create and manipulate queues and tables and to control multiple processes in a multilayered, hierarchical operating system. The procedure parameter specification facilities (keywords and defaults) are an integral part of these facilities and considerably enhance the readability of the language. A system measurement capability is provided as a side effect of the data access facilities. The strong concepts of environment plus the measurement capability make OSL/2 a language suitable for predictive system simulation as well as implemented system measurement.

OSL/2 implies a command driven operating system, that is, a system with multiple processes each either executing or else waiting for a command (P operation) to continue. There is no interrupt concept in OSL/2. In general, command driven systems (e.g., the "THE" system [10] and the ILLIAC IV system [11,12]) must have well defined structures in order to work in the command driven environment. Such systems are therefore more amenable to analysis, modification, and debugging.

The power of the general data access facilities is enormous. They allow the specification and manipulation of complex concepts as if they were primitive in the language. They further isolate the manipulation of special purpose constructs like queues and tables to a single declaration where access algorithms can be readily changed without touching any other portions of code.
This facility specifies an access technique. The interpretation of parameters and array information in the declaration and of subscripts in statements and expressions is subject to programmer control. This allows a programmer to write his own semantics for variable accessing and control. Simple queues, complex tables, switch files, and even formatted I/O have been generated using this facility. It will take a significant amount of actual use before the range and power of this technique are well established.

The data access concepts can be added to existing languages like ALGOL, ESPOL, and PL/1. Furthermore, the concepts can be restricted and implemented in such a way that call by name hardware is not essential for their use. The structure definition in BLISS [22] is an example of a language with such a facility. However, adding such a capability to an assembly language is not reasonable. Assembly language does not have the richness of structure and concepts of scope that permit a clean implementation.

The layered system concept provided by OSL/2 is the most difficult feature to implement. Special hardware is certainly required. It is not recommended for implementation on current machines.

There may be one desirable concept left out of OSL/2. There is no primitive concept of a list in the syntax. Most lists are easy to build and maintain via the access techniques. However, the concept of a written list is not available. This was important when writing formatted I/O. The desirable form of a read statement appeared to be read ( , , ) where one argument could be a list of identifiers or expressions separated by commas. Instead of working this way, separate "read0", "read1", . . . , "readn" procedures would be required to read 0, 1, ..., n variables, with "read" reserved for handling a list manipulated via the access facility.
The author had used such a scheme in ALGOL 60 on the GE 635. In general, it was messy and one would continually change a "readk" to a "readn" to reflect a change in his list. Eventually, he gave up and went to the list form and declared many, many lists.

While the list is not incompatible with OSL/2, it was left out in an effort to keep an already large language from getting completely unmanageable. Lists can be added later if really needed. The author feels that data types like pattern also fall into this category. They were included only to document the compatibility of the SNOBOL pattern matching statement with an ALGOL-like block structured language.

Appendix A

A Reference to OSL/2 Syntax and Semantics

I. Introduction

OSL/2 is a block structured, ALGOL-based language designed for the specification of operating systems. OSL/2 is a follow-on language to OSL (Operating System Language), which was designed and documented by Alsberg and Wells [11].

OSL/2 is not designed for an existing machine, nor is it machine independent. Rather, OSL/2 is designed to make coding easy for the systems programmer, and it is also designed to be its own job control language. That is, a user in an OSL/2 system may use OSL/2 to control his job flow. To this extent OSL/2 is recursive in the system sense. One may describe new operating systems that exist on top of many layers of other operating systems.

It is expected that OSL/2 will be best implemented on a machine specifically designed to mirror the language itself. This machine will be called the OSL machine.

In the syntactic description of the language, the ALGOL 60 report is followed as closely as possible. Furthermore, every attempt is made not to arbitrarily introduce new terminology for concepts and entities available in, or similar to, other languages like ALGOL 60 and PL/1.

OSL differs sharply from ALGOL 60 in the following points. OSL includes list processing and generalized data access techniques.
The programmer may specify special action to be taken when a fetch or store is performed on data declared with the general access facilities. These facilities allow the ready specification of simple and complex queues, tables, and stacks. Semaphores replace the notion of interrupts. Semaphores are used to synchronize processes. Strings of various character sizes and varying lengths can be easily manipulated with facilities similar to those offered by SNOBOL.

In OSL/2 a process is any program or procedure that is capable of running asynchronously with any other process. Processes are created in one of three ways. Any currently running OSL process can "initiate" a new process. Initiated processes expect the initiating process to act as their operating system. The initiating process must explicitly indicate those procedures (primitives) within itself to which the initiated process has access. All other procedures and all data in the initiating process are not addressable by the initiated process. The initiating process may elect to default some or all existing operating system facilities to a lower level and choose only to add new facilities for the initiated process.

New processes may be "spawned" by an initiated process. This facility allows a running program to notify the operating system that another program (process) should be initiated by the operating system.

Finally, asynchronously running procedures (processes) may be "appended" to currently running processes. These appended procedures have access to all data and all procedures that they would have if they were executed sequentially. After appending a procedure, the appending process continues execution at the next executable statement and the appended process begins and continues execution at the same time.

II. OSL/2 Syntax and Semantics

Table of Contents

1. The Purpose and Extent of the Language
2. Basic Symbols, Identifiers, and Constants
   2.1. Letters
   2.2. Digits
   2.3. Constant Values
   2.4. Delimiters
   2.5. Special Symbols
   2.6. Identifiers
   2.7. Constants
3. Expressions
   3.1. Variables
   3.2. Function Designators
   3.3. Integer Expressions
   3.4. Boolean Expressions
   3.5. String Expressions
   3.6. Pointer Expressions
   3.7. Pattern Expressions
   3.8. Structure Expressions
4. Statements
   4.1. Compound Statements and Blocks
   4.2. Assignment Statements
   4.3. Dummy Statements
   4.4. Case Statements
   4.5. Procedure Statements
   4.6. Process Statements
   4.7. Iterative Statements
   4.8. Escape Statements
   4.9. Pattern Matching Statements
   4.10. Synchronization Statements
   4.11. Conditional Statements
5. Declarations
   5.1. Integer Declarations
   5.2. Boolean Declarations
   5.3. String Declarations
   5.4. Semaphore Declarations
   5.5. Process Declarations
   5.6. Pointer Declarations
   5.7. Structure Declarations
   5.8. Trap Declarations
   5.9. Pattern Declarations
   5.10. Access Definitions
   5.11. Access Declarations
   5.12. Procedure Declarations
   5.13. File Declarations
6. Compile Time Facilities
   6.1. Compile Time Declarations

1. The Purpose and Extent of the Language

OSL/2 is designed to describe the flow of control and manipulation of data required in an operating system. The term operating system refers primarily to supervisory and monitor systems. However, the string manipulation facilities make OSL/2 a powerful compiler writing language.

OSL/2 is designed with the deliberate intent to make the generation of well structured and readable codes the natural coding mode. To this end the "GO TO" statement has been eliminated and the use of labels has been restricted to encourage self-documenting code.

A simple macro facility is specified.

The data base includes integers, Booleans, strings, semaphores, patterns, and pointers. Furthermore, structures, in the PL/1 sense, are available and extendible data access facilities have been provided.

2. Basic Symbols, Identifiers, and Constants

Basic Concepts. The reference language is built up from the following basic symbols:

    ::= | | |

2.1. Letters

    <letter> ::= A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z|
                 a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z

This alphabet may be arbitrarily restricted, or extended with any other distinctive character (i.e., a character not coinciding with any digit, logical value, or delimiter).

Letters do not have individual meaning. They are used for forming constants and identifiers.

2.2. Digits

    <digit> ::= 0|1|2|3|4|5|6|7|8|9
    <decimal digit> ::= <digit>
    <binary digit> ::= 0|1
    <quaternary digit> ::= 0|1|2|3
    <octal digit> ::= 0|1|2|3|4|5|6|7
    <hexadecimal digit> ::= 0|1|2|3|4|5|6|7|8|9|A|B|C|D|E|F

Digits are used for forming integers, constants, and identifiers.

2.3. Constant Values

    <logical value> ::= TRUE | FALSE
    <string value> ::= MT
    <pointer value> ::= NULL

Logical values are the values of Boolean quantities. The string value MT denotes the empty string of length zero. The pointer value NULL denotes the null pointer.

2.4. Delimiters

    ::= |
    ::= <relational operator> |
    <relational operator> ::= < | <= | = | >= | > | %= | LSS | LEQ | EQL | GEQ | GTR | NEQ
    ::=
    ::= & | SAND | SOR | SNOT | SXOR
    ::= , | . | : | i | ← | == | COMMENT | TIMES | $ " { I % i ! | <blank>
    ::= ;
    ::= ( | ) | [ | ] | BEGIN | END | ' | " | { | } | #
    ::= BOOLEAN | INTEGER | STRING | POINTER | ACCESS | ON | STRUCTURE | PROCEDURE | NODE | PROCESS | FILE | PATTERN | QUEUE | STACK | OWN | DEFINE | PRIMITIVE | SEMAPHORE
    ::= VALUE | NAME | FIXED | FORMAL | FORWARD
    ::= |

Delimiters separate the various entities of the language. In order to accept input from common restricted alphabets, the following symbols and combinations of symbols are equivalent:

    Standard Symbols    Alternatives
    <                   LSS
    <=                  LEQ
    =                   EQL
    >=                  GEQ
    >                   GTR
    %=                  NEQ
    ←                   :=
    %                   NOT

In what follows, only the left hand standard symbols are used.

Delimiters have fixed meanings which will be explained as they occur in various constructs.

Delimiters and logical values are considered basic symbols of the language and have no relation to the individual letters of which they are composed.
Therefore, the words which constitute basic symbols are reserved for specific use in the language.

2.4.1. Spacing

No space may appear between the letters of a reserved word. At least one space must separate a multicharacter delimiter from any adjacent letter or digit. That is, any two basic components of the following forms must be separated by <blank>.

1. Multicharacter delimiter
2. Identifier
3. Logical value
4. String value
5. Pointer value
6. Unsigned integer

2.4.2. Comments

Comments may be inserted in the program without effect on the program structure. The following conventions hold:

1) {terminal symbol} COMMENT {any sequence of symbols except ;} ; is equivalent to {terminal symbol}
2) % {any sequence of symbols except %} % is equivalent to
3) ! {any sequence of symbols} {end of line} is equivalent to
4) END {any sequence of letters, digits, and blanks} is equivalent to END

The % convention has precedence over all other comment conventions. No comment conventions are recognized inside strings.

The metalinguistic notation {...} is used to denote the terminal symbol or group of terminal symbols indicated by the English phrase enclosed in the metalinguistic braces.

2.5. Special Symbols

<special symbol> ::= _

The underscore is completely ignored in the language except when it appears within a string. The underscore is used in identifiers and constants to improve readability.

2.6. Identifiers

2.6.1. Syntax

: := i

2.6.2. Examples

    A1
    PROCESSOR_NUMBER
    TIME_LIMIT
    JOB23
    X

2.6.3. Semantics

No spaces may appear within identifiers. The underscore "_" is used to improve readability, but is not part of the identifier. Thus "A_B", "AB", and "A B" are all the same identifier.

Identifiers are used to identify pointers, events, procedures, variables, etc.

Reserved words may not be used as identifiers. Otherwise, identifiers may be chosen freely.

Identifiers may be of any length. However, finite implementations are envisioned.
In no case will the maximum identifier size be less than fifteen characters. When reduction of an oversized identifier is required, the identifier shall be reduced by the following algorithm:

1) n is the maximum identifier length
2) if n is even, choose n/2 characters from the front and n/2 characters from the rear of the identifier
3) if n is odd, choose (n+1)/2 characters from the front and (n-1)/2 characters from the rear of the identifier.

2.7. Constants

2.7.1. Integers

2.7.1.1. Syntax

: := \ : := j + j -

2.7.1.2. Examples

    1
    123_456_789_012
    +17
    -35_702

2.7.1.3. Semantics

The underscore is used to improve readability and does not alter the value of the integer. The value of the integer is its decimal based value.

2.7.2. Strings

2.7.2.1. Syntax

<binary string> ::= 2'<binary digit string>' | 2"<binary digit string>"
<quaternary string> ::= 4'<quaternary digit string>' | 4"<quaternary digit string>"
<octal string> ::= 8'<octal digit string>' | 8"<octal digit string>"
<decimal string> ::= 10'<decimal digit string>' | 10"<decimal digit string>"
<hexadecimal string> ::= 16'<hexadecimal digit string>' | 16"<hexadecimal digit string>"
: := : := : := : := : := | A E | B
: := "{any string of symbols}" '{any string of symbols}'
: := i MT

Each quote, identical to the opening quote, which is placed within the string must be represented by two juxtaposed quotes of the same type as the opening quote.

2.7.2.2. Examples

    2'011001101'
    16'FAEC0628F'
    8'73107777'
    "NOWS THE TIME"
    'WHAT''S YOUR PROBLEM?'
    'TO QUOTE: "GO HOME!"'
    E" ( $ # § ""?"

2.7.2.3. Semantics

The number strings described are used as masks and can take the positive value of their binary representation when used in arithmetic expressions. A decimal string is semantically equivalent to a hexadecimal string. MT is the universal null string; it is empty.

Character strings are, by default, eight bit ASCII code. The prefix A denotes seven bit ASCII, E denotes eight bit EBCDIC, and B denotes six bit IBM BCD.

Either double or single quotes may open a string. A string is closed by the same type of quote which opened it.

3.
Expressions

In the language, the primary constituents of the program which describe algorithmic processes are integer, Boolean, string, pointer, pattern, and structure expressions.

: :=

3.1. Variables

3.1.1. Syntax

< subscript list> : := I , : := : := [ : : : ] < first position> : := : := : := : := j . J [] | . [] : := | . : := \ [] : := ::= . | . ::= ! : := j . : := |

3.1.2. Examples

    A
    A[23]
    USER_NAME.[1:20]
    STUDENT_MTRY[N*2+K].ID_NO
    P.LIST_ELEMENT
    X.[27:3]

3.1.3. Semantics

Variables name data quantities. A variable may specify single quantities, multidimensional arrays, or whole data structures.

When a variable specifies an array, each subscript expression is evaluated and used to determine which array element is specified. The value of a subscripted variable, one or more of whose subscripts are out of bounds, is undefined. If an out of range subscript is calculated, an "index" trap occurs.

Access variables may have actual parameters passed through the subscript list. In this case, the interpretation of the subscripts is given by the associated access definition.

A variable which specifies a string quantity may be followed by a substring specification. The substring specification indicates the first character desired in the string and how many contiguous characters to the right of the first character (including the first character) are desired. The character positions in a string are numbered from left to right beginning at zero. If a size parameter is specified, the new character size is used instead of what was declared for the string.

The symbol "." is used in variables to define exactly which quantity is desired when a pointer is specified. If P is a pointer, then "P" denotes the pointer P and "P." denotes the quantity pointed to by P. This use of the "." is consistent with structure addressing.
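The substring specification described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `substring` is hypothetical, the optional size parameter is ignored, and raising `ValueError` for a request that runs past the end of the string is an assumption (the text defines an "index" trap only for array subscripts and does not say what happens here).

```python
def substring(s, first, count):
    """Return `count` contiguous characters of `s` starting at position `first`.

    Positions are numbered from left to right beginning at zero, and the
    character at `first` is included in the count, as in X.[first:count].
    """
    # Assumption: an out-of-range request is treated as an error; the source
    # text does not specify the OSL/2 behavior for this case.
    if first < 0 or count < 0 or first + count > len(s):
        raise ValueError("substring specification out of range")
    return s[first:first + count]


user_name = "JOHN Q PUBLIC"
print(substring(user_name, 0, 4))  # the four characters starting at position 0
print(substring(user_name, 5, 1))  # the single character at position 5
```

Note that the second field is a character count, not an end position, so it differs from a Python slice bound.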
Each higher level identifier used in a structure, that is, each identifier that does not identify a primitive data item, may be viewed as a pointer, e.g.

    STRUCTURE JOB (INTEGER ID, TIMEON, TIMEOFF;
                   STRING NAME (30);
                   STRUCTURE IDCARD_PARMS (INTEGER TIME, LINES));

"JOB" and "IDCARD_PARMS" are higher level identifiers in this structure. Thus one would address "LINES" as follows:

    JOB.IDCARD_PARMS.LINES

If "JOB" were a structure which was repeated many times in a list and "P" were a pointer which pointed into this list, then the "JOB" structure pointed to by "P" would be expressed as "P.JOB".

3.2. Function Designators

3.2.1. Syntax

< actual parameters : := : := : : := | * | < empty > : := , : := ()

The variables which appear as actual parameters may be semaphores, files, access variables, or process variables.

3.2.2. Examples

    GETSPACE (100*JOB_SIZE)
    ELAPSED_TIME (A,B,C,D)
    GL0RPSOT0TT (MEEKLE_JAMMER)

3.2.3. Semantics

Function designators may define any data type value available in OSL/2. This data is returned by the function after it executes a given set of rules defined by a procedure declaration.

3.2.4. Standard Functions

The following functions (primitives) are predefined and implemented for the user.

Functions           Arguments                 Action

ABS(E)              integer expression        The absolute value of E is returned.

SIGN(E)             integer expression        If E>0 return +1, if E=0 return 0,
                                              if E<0 return -1.

DECIMAL(E)          integer expression        Returns a decimal string which represents
                                              the decimal value of ABS(E).

BINARY(E)           decimal string            Returns the binary integer value of the
                                              string E.

CONVERT(A,B,C) or   strings A,B,C and         Convert N characters in string A into a
CONVERT(A,B,C,N)    expression N              new string using truncated values of the
                                              characters in string C. Each character in
                                              A is used to index into string C. The
                                              first character in C is selected if the
                                              character in A has value 0, the second if
                                              it has value 1, etc. The character in C is
                                              truncated on the left or padded with
                                              zeroes on the left to match the character
                                              size of B. The selected character is then
                                              placed in B and the process continues at
                                              the next position in A and B until N
                                              characters have been processed. If the
                                              character position is beyond the end of
                                              the string C, or if N exceeds the size of
                                              A or B, then a convert failure trap is
                                              caused. If N<0 then no characters are
                                              converted.

LENGTH(E)           string                    Returns the number of characters in
                                              string E.

CHARSIZE(E)         string                    Returns the number of bits per character
                                              in string E.

PTR(E)              any single variable or    A pointer to the variable, array element,
                    array element or          or character is returned.
                    character

SPAN(E) or          E=string                  A pattern is returned which represents a
SPAN(E,N)           N=integer expression      string of length N consisting of any of
                                              the characters in E. If N is absent, a
                                              string of arbitrary length is matched.

EXCEPT(E) or        E=string                  Same as SPAN except that a string which
EXCEPT(E,N)         N=integer expression      consists of any characters except those
                                              in E is matched.

ANY or              N=integer expression      ANY is a pattern which matches any string.
ANY(N)                                        If N is specified then ANY matches any
                                              string of length N.

RELEASE(E)          variable                  The specified variable is deallocated.

UPPER(E)            array row specification   The upper bound of the specified array row
                                              is returned.

LOWER(E)            array row specification   The lower bound of the specified array row
                                              is returned.

STRING(E) or        integer expression        E is converted to a bit string of length
STRING(E,N)                                   N. Two's complement arithmetic is assumed.
                                              If N is not provided, the current default
                                              length of an integer is assumed.

ALLOCATE(E)         identifier                A pointer is returned with the location of
                                              E in it. This primitive is used at block
                                              entry and block exit. If it is redefined
                                              by the user, the user may do his own
                                              allocation. However, the user may only
                                              allocate space in arrays declared
                                              lexicographically prior to the
                                              redefinition of primitive ALLOCATE. This
                                              function may also be used as a procedure
                                              statement.

BOOL(E)             integer expression        Allows E to be used as a Boolean with
                                              Boolean operators. BOOL(E) is true if E is
                                              odd or has a right-most bit equal to 1.
                                              Otherwise it is false.

INTGR(E)            decimal string            Allows E to be used as an integer.

3.3. Integer Expressions

3.3.1. Syntax

: := + ' - : := + j - multiplying operator> : := * / | RDIV CDIV | MOD : := j CASE OF ESAC ( ) | IF THEN ELSE FI : := j + t : := \ : := : := ! : := ,

3.3.2. Examples

Integer expressions:

    A+B+C
    X+-3
    -4 * (A+INTGR (16'0AFFF') MOD 3)
    18+ IF JOBTIME MAXTIME>THEN A/B ELSE IF A * B MOD C = X THEN ROUNDER (Q) ELSE A/B FI FI

Terms:

    -(Q ← B+C)
    A + -G * 30
    Q+INTGR (X.[17:5])
    X ← A+B * (C ← 10*L)
    Z [17, A * B] * TEST_NO
    A * B * X [11, IF A = B THEN 3 ELSE 4 FI] * Q

Factors:

    -A
    B
    C t D t -E t 10
    +(30 * A)
    -(A+ (10+J))

Primaries:

    73
    IF Q = R ANDIF BOOL (X + 1) IMP B THEN 17 ELSE Q FI
    LENGTH (P)
    BASE (PTR1)
    (X + IF P THEN 2 ELSE C * Q FI)

3.3.3. Semantics

An integer expression is a rule for computing a numerical value. This value is obtained by executing the indicated arithmetic operations on the actual numerical value of a primary. The value is obvious in the case of constants. For variables, it is the current value (assigned last in the dynamic sense), and for function designators it is the value returned when the computing rules defining the procedure are applied to the current values of the procedure parameters given in the expression. For integer expressions enclosed in parentheses, the value must, through a recursive analysis, be expressed in terms of values of primaries of the other three kinds.

The value of a primary of the form IF <Boolean expression> THEN <integer expression 1> ELSE <integer expression 2> FI is computed as follows. The Boolean expression is evaluated. If it is true, integer expression 1 is evaluated and is the value of the primary.
If the value of the Boolean expression is false, then integer expression 2 is evaluated and is the value of the primary.

The value of a primary of the form CASE OF , ,... ESAC is computed as follows. The integer expression following the case is evaluated. Assume it has value i. If < i

: := < | <= | = | >= | > I %= : := = | ^= : := : := | | | ( ESAC : := | : := | AND | ANDIF : := | XOR | OR | ORIF : := | IMP : := | 1 | pattern matching statement I : := ,

<=, =, >=,
third: %
fourth: AND, ANDIF
fifth: XOR, OR, ORIF
sixth: IMP
seventh: EQV

The operators ANDIF and ORIF have the same meaning as AND and OR except that they may alter the evaluation rules of the Boolean expression and may thereby drastically change any side effects.

If the logical value computed to the left of an ANDIF in the current Boolean factor is false, then no further calculation is made to the right in the current Boolean factor containing the ANDIF. The value of the Boolean factor is false.

If the logical value computed to the left of an ORIF in the current Boolean term is true, then no further calculation is made to the right in the current Boolean term containing the ORIF. The value of the Boolean term is true.

If the above conditions do not hold, the action is identical to AND and OR.

Strings and integers may be used as Booleans when indicated by the BOOL function. In this case the string is treated as true if the right-most bit is 1 and false if it is 0. An integer becomes true if it is odd and false if it is even.

3.5. String Expressions

3.5.1. Syntax

: := |() IF THEN ELSE FI CASE OF ESAC : := SNOT : := SAND : := SOR I SXOR ::= I Simple string expression 5 * & ::= : := I ,

Variables and functions used as string primaries must be declared to be type STRING. Only strings of the same character size may be concatenated.

3.5.2. Examples

    "THIS" & "IS A CONCATENATED STRING"
    DECIMAL (SALARY) & (IF CENTS THEN 10"00" ELSE MT FI)
    A SAND 2'110000111100'

3.5.3.
Semantics

A string expression is a rule for computing a string value. The rules for evaluating a string expression are analogous to the rules for evaluating an integer expression.

The string operators SAND, SOR, SXOR, and SNOT correspond respectively to bitwise and, or, exclusive or, and not operations. These operators work left to right on any strings. The operators SAND, SOR, and SXOR may operate on strings of unequal length. The result is the same as if the operation were performed on the full length of the shorter operand and the first part of the longer operand (which is the same length as the shorter operand), with the result then concatenated with the rest of the longer operand.

The string constant MT is the empty string. The value of a string expression is the concatenated string whose length is equal to the sum of the lengths of its concatenated components.

3.6. Pointer Expressions

3.6.1. Syntax

: := < function designator> NULL IF THEN ELSE FI CASE OF ESAC : := ,

The pointer variables and function designators in pointer expressions must be of type POINTER.

3.6.2. Examples

    P.A.PTR1
    P.A
    P ← PTR(B)
    IF R1 = P2 THEN P1 ELSE P1.P FI
    P[23]
    P[A*B+C]

3.6.3. Semantics

A pointer expression is a rule for computing a pointer value. A pointer value points to a particular piece of data, whether it be an individual character, integer, array, event, structure, etc. The pointer operator "." may be read right to left as "pointed to by the pointer". Thus "P.A" is A pointed to by the pointer P.

3.7. Pattern Expressions

3.7.1. Syntax

: := < function designator> ! I IF THEN ELSE FI I CASE OF ESAC : := I & : := ! J : := J : := {} : := i ,

3.7.2. Examples

    "A"
    {B ← {"A" | "B" & SPAN ("012376")} | B ← X & C}
    {"LMQ" | "CAT" | "DOG"}

3.7.3. Semantics

Pattern expressions define a rule for scanning strings. Patterns are used in string matching statements and their semantic interpretation is discussed there.

3.8. Structure Expressions

3.8.1.
Syntax

: := IF THEN ELSE FI i CASE OF ESAC : := ,

Variables and functions used as structure expressions must be declared to be structures. All structures in one expression must have the same hierarchy and types in each position.

3.8.2. Examples

    A
    A.B
    IF A=B THEN S ELSE B.S FI

3.8.3. Semantics

The rules for evaluating a structure expression are analogous to the rules for evaluating an integer expression.

4. Statements

The basic unit of calculation is the statement. Statements are normally executed sequentially in the order in which they are written. This sequence may be shortened by conditional or escape statements, which may cause some statements to be skipped, or process statements may create new processes that will run in parallel with the current code.

Statements may be grouped into compound statements and blocks, which are themselves statements. Therefore the definition of statement is necessarily recursive.

4.1. Compound Statements and Blocks

4.1.1. Syntax

| | j i \ ; < synchronization statement> : := | : := : := ; : := ; : := BEGIN END i : : : := BEGIN declaration list> : := ; END | : : : := j

All labels are used as brackets. If a label is used before a statement, the same label must be used after the statement.

4.1.2. Examples

Let S, D, and L denote arbitrary statements, declarations, and labels.

Basic statements:

    form: S

    A ← B+C
    P.D ← Q
    LEAVE OUTERBLOCK

Compound statements:

    form: L:L...L:BEGIN S;S;S;...;S END:L ... L:L

    CD: BEGIN
        P(23,17,A*B);
        X ← P+D;
        REWIND(TAPE9);
        TERMINATE
    END : CD

Blocks:

    form: L:L...L:BEGIN D;D;...;D;S;...;S END : L:L ...L

    BLOCK : BEGIN
        INTEGER X,Y,Z;
        STRING (A,B,C) (30);
        A ← "NOW IS THE TIME";
        B ← A & "23 SKIDOO";
        X ← 2*(Y ← 30);
        Z ← X+Y;
        IF X = INTGR(10"37") THEN LEAVE BLOCK ELSE TERMINATE FI
    END : BLOCK

4.1.3. Semantics

Every block introduces a new level of nomenclature. Any identifier occurring within the block is local to the block if it is defined in the blockhead.
Identifiers defined outside the present block but within a block containing the present block are global to the present block. Only entities represented by local and global identifiers have any existence within a block. When an identifier is declared within a block, any entity represented by this identifier outside the block is completely inaccessible inside the block.

4.2. Assignment Statements

4.2.1. Syntax

::= : := «- | -<- ::= -<- : := -<- : := -<- ::= ::=

All variables on the left of the assignment arrow must be the same type as the expression on the right. An asterisk may be used as the first primary in the expression on the right of the assignment arrow. Any pointer or pattern variable appearing on the left of an assignment arrow must be at the same or lower nested block level as any variables appearing to the right of the arrow. If the variable on the left of an assignment is a function identifier, then the expression value is saved to be returned at the end of the procedure.

4.2.2. Examples

    A ← B+C
    X ← *+1
    B ← X = Y
    S ← "TEST"
    P ← PTR(A[0])
    S1 ← S2 <-tf.fl I *»-,,!• I II -. „ ,

4.2.3. Semantics

Assignment statements assign the value of an expression to one or more variables in a left part list. The types of the entities on the left and right of the assignment must agree. No type transfers are automatically invoked.

An asterisk in place of the first primary after an assignment arrow means that the variable to the immediate left of the assignment arrow is to be used in place of the asterisk without further evaluation of subscripts, etc. that may cause side effects. An access variable may not be this variable if the asterisk convention is used.

The presence of an access identifier on the left or right of an assignment automatically invokes the appropriate fetch or store procedure specified in the access declaration to generate or save an expression value.
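The asterisk convention described above can be illustrated with a small Python sketch. It is a model only, not OSL/2 itself: the names `assign_with_asterisk` and `next_slot` are hypothetical, and a Python list stands in for an OSL/2 array. The point it shows is that the left-hand subscript is evaluated once and the same element is reused where the asterisk appears, so a subscript with side effects is not evaluated a second time.

```python
evaluations = 0


def next_slot():
    """A subscript expression with a side effect: it counts its own calls."""
    global evaluations
    evaluations += 1
    return 2


def assign_with_asterisk(array, subscript, expression):
    """Model A[subscript] <- expression(*), evaluating the subscript once."""
    index = subscript()              # the left-hand subscript is evaluated here only
    star = array[index]              # the value that the asterisk stands for
    array[index] = expression(star)  # store back into the same element


a = [10, 20, 30]
# Models a statement of the form  A[NEXT_SLOT] <- * + 1  (hypothetical names)
assign_with_asterisk(a, next_slot, lambda star: star + 1)
print(a, evaluations)
```

Writing the variable out twice instead of using the asterisk would, in this model, call `next_slot` twice, which is exactly the side effect the convention avoids.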
Assignment statements are executed in the order specified by the following algorithm:

first: Any subscript or substring specifications occurring in the left part list are evaluated in sequence from left to right,
second: The expression of the statement is evaluated,
third: The value of the expression is assigned to all the left part variables, and all access variable stores are performed from right to left.

4.3. Dummy Statements

4.3.1. Syntax

<dummy statement> ::= <empty>

4.3.2. Examples

    BEGIN . . . ; END

4.3.3. Semantics

A dummy statement executes no rules and performs no calculations. It is completely null. Dummy statements may serve as markers in unused indices of a case statement.

4.4. Case Statements

4.4.1. Syntax

: := CASE OF ESAC |