Report No. UIUCDCS-R-74-646

CLEOPATRA
Comprehensive Language for Elegant Operating System and Translator Design

by Axel T. Schreiner

May, 1974

Departments of Mathematics and Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

Errata

Page 25, line 4: change "od" to "of".
Pages 114 and 115: the AT operator as described there is not sufficient. A usable alternative will be discussed in the author's forthcoming Ph.D. thesis.
Page 165: change line 9 to read "INTEGER search_insert(INT BY ADR): INT 1 EXT BY ADR"
Page 166, line 15: change "." to ":".
Page 167, line 15: change "1" to "2".
Page 167, line -7: change "==" to ">=".

The design of CLEOPATRA, a major part of my Ph.D. thesis, took approximately two years. Throughout this time I enjoyed the continuous guidance and encouragement of my "Doktorvater" [Ph.D. father]. His advice helped shape the definition of the language; his experiences with countless programmers at all levels of proficiency helped decide numerous questions of human engineering. I therefore would like to dedicate this report on CLEOPATRA to my advisor and friend Professor H. George Friedman, Jr.

A.T.S.

Table of Contents.

1 Introduction
  1.1 Purpose of this report
  1.2 Syntax specifications in this report
  1.3 The character set
  1.4 Token delimiters
  1.5 Comments
2 Identifiers, constants
  2.1 E Identifiers
  2.2 Constants
    2.2.1 E Basic constants
    2.2.2 E Integers, bits, and bytes
    2.2.3 E Real numbers
    2.2.4 E Decimal numbers
    2.2.5 E Literal values
  2.3 E System supplied values
3 Configurations
  3.1 Building blocks of a configuration
    3.1.1 Routine blocks
    3.1.2 Structure blocks
    3.1.3 Data blocks
    3.1.4 Recompiling rules
  3.2 Available configurations
  3.3 Scope rules
    3.3.1 Identifier classes
    3.3.2 The nesting of configurations
    3.3.3 Recursion
    3.3.4 Referencing data
  3.4 E The compilation environment
4 Types
  4.1 Basic types in CLEOPATRA/360
  4.2 Pointers
    4.2.1 E Objectives for deferred creation
    4.2.2 Using a pointer
  4.3 User-defined types
    4.3.1 Constructing a type_pack
    4.3.2 References to a type
5 Structure, routine, and data blocks
  5.1 Structure blocks
    5.1.1 Local structure blocks
    5.1.2 Global structure blocks
  5.2 Routine blocks
    5.2.1 Procedure definitions
    5.2.2 E Procedure calls
    5.2.3 Operator definitions
    5.2.4 E Operator applications
    5.2.5 E Conversions
  5.3 Data blocks
    5.3.1 Local data blocks
    5.3.2 Global data blocks
    5.3.3 E Type data blocks
    5.3.4 E Data groups
    5.3.5 Array descriptions
    5.3.6 Initialization
6 Expressions
  6.1 E Forming an expression
  6.2 Storable references
  6.3 Type recognition
    6.3.1 Primitive data types
    6.3.2 Alignment
    6.3.3 Aggregation
    6.3.4 Routine calls
  6.4 The COPY option for parameters
  6.5 The array facilities
    6.5.1 E Extraction
    6.5.2 Creating an array
    6.5.3 Accessing array parameters
7 Statements
  7.1 Simple statements
    7.1.1 The arithmetic statement
    7.1.2 The RETURN and EXIT statements
    7.1.3 The simple conditional statement (IF)
    7.1.4 The ALLOCATE and RELEASE statements
  7.2 Compound statements
    7.2.1 Bracketing statements
    7.2.2 E Repetitive execution
    7.2.3 Selective execution
  7.3 Statement lists
8 Interrupts
  8.1 Linkage considerations
  8.2 Creation and activation
  8.3 Interrupt blocks
  8.4 The ASSERT condition
9 Privileged operations
  9.1 Basics about System/360
  9.2 Privileged data types
    9.2.1 Addresses
    9.2.2 Keys
    9.2.3 Program Status Words
    9.2.4 Time
  9.3 Privileged statements
    9.3.1 The current Program Status Word
    9.3.2 Input-Output
  9.4 Protection
A Keywords and their aliases
B Operators for the built-in types
  B.1 Summary
    B.1.1 Exceptional conditions
    B.1.2 Grouping of operators
    B.1.3 The operator table
    B.1.4 Returned types
  B.2 CONVERSION operations
    B.2.1 Converting to DECIMAL'size
    B.2.2 Converting real numbers
    B.2.3 Converting from CHARACTER
    B.2.4 E Converting to CHARACTER
  B.3 E Assignment operations
  B.4 E Comparison operations
  B.5 E Binary arithmetic operations
    B.5.1 Complete integer arithmetic
    B.5.2 Complete real arithmetic
    B.5.3 Fast mixed exponentiation
  B.6 E Unary arithmetic operations
  B.7 BIT operations
  B.8 CHARACTER operations
    B.8.1 E Code conversions and assignment
    B.8.2 E Concatenation and splitting
    B.8.3 E Selection
    B.8.4 E Scanning
    B.8.5 E Unary operators
  B.9 POINTER operations
C Summary of productions
D Coding examples
E The beginnings of a type_pack
F Conditions
  F.1 Hardware interrupts
  F.2 Initial Program Loading
  F.3 Software conditions
References

1. Introduction.

CLEOPATRA will be used to code operating systems, i.e., production systems. It is expected that "normal" programs written in CLEOPATRA will be lengthy, and that normally compilation, linking, and loading will be distinct events in time. Therefore, emphasis in CLEOPATRA design lies primarily on ease of maintenance, execution efficiency, and clarity of systems written in the language, and to a much lesser extent on ease of compilation.

Maintaining control is a major consideration; we wish to protect the ignorant coder from himself. However, we will provide the necessary means to eliminate those control options, thus allowing the construction of seemingly efficient "dirty" programs - a practice which we would strongly like to discourage.

Operating systems require the manipulation of a variety of data items, representing system components like 'task', 'job', 'unit record device', 'reply', etc. It is not possible to anticipate all such data items, and their aggregates, when a system implementation language such as CLEOPATRA is designed; it may well be the case that such data items and their aggregates receive their final layout well through the actual system design period. Consequently, CLEOPATRA provides user-defined data items and aggregates; the usage of such data items through their operators and the definition of each operator can be kept completely independent so that the redefinition of the algorithm performed by an operator need not affect any code calling the operator.

CLEOPATRA is to be transportable. Most of the run-time support routines for the system are therefore expected to be written in CLEOPATRA itself.
In particular, we expect to supply with the system proper (i.e., inside the coder) only those operators which reflect the target machine's hardware. All other operators will be synthesized in the language, not in the coder. Only those "basic" data types and operators are built into the language which correspond to special hardware instructions on the target machine, or which present a major convenience for the programmer. All other operators and data types should be built using CLEOPATRA code. It is expected that a library of commonly used types will be built; provisions should be made in the control statements of the compiler to incorporate such library routines.

Main features of CLEOPATRA are extensible (i.e., user definable) data types and operators, the ability to create synchronizing primitives, interrupt mechanisms, the ability to create parallel processes, and the ability to create stand-alone programs (i.e., operating systems).

1.1 Purpose of this report.

This report attempts to define an optimal CLEOPATRA language. All desirable features, anticipating a member of the IBM System/360 family of computers as a host machine, have been included; a subsequent implementation attempt may decide to omit certain features in the interest of simplification.

Some considerations in the design of CLEOPATRA have been left to the implementor. Brief reference to these considerations is made where appropriate; the implications are discussed elsewhere.

1.2 Syntax specifications in this report.

We will use a slightly extended form of BNF to specify the syntax of CLEOPATRA. The following symbols will be used:

  ::=         separates left and right hand sides of a production.
  lower_case  designates a non-terminal symbol; an attempt was made to make the names meaningful.
  UPPERCASE   and special characters (except as noted below) designate terminal symbols; we assume a concrete representation where these terminal symbols represent themselves.
  |           indicates a choice.
  { }         are used to combine a number of choices.
  [ ]         the enclosed entity may appear or may be omitted.
  [ ]*        the enclosed entity may appear zero or more times.
  { }*        the enclosed entity must appear one or more times.

The brackets [ ] are also considered a part of the CLEOPATRA character set; they do not appear in productions, however, since they are considered to be pairwise equivalent to parentheses ( ). The choice symbol | is also considered a part of the CLEOPATRA character set. If it is to denote itself as a terminal symbol, it will be underlined.

The right hand side of each production has a sequence number. Appendix C contains a list of all productions by sequence numbers.

1.3 The character set.

Envisioned for the compiler design are four passes: lexical analysis, syntactic analysis, semantic analysis, and encoding. It is anticipated that these passes operate synchronously, with communication through synchronizing primitives and token messages. This should enable an extremely modular construction, easy to change. In particular, various versions of the lexical analysis module then will support varying concrete representations of the language.

We anticipate usage of the language through terminals supporting upper and lower case letters, rather than through keypunches. For the purpose of this report, we will therefore adopt the following character set:

The character _ (underscore) will be completely transparent to the envisioned lexical analyzer, except in literal values (see section 2.2.5). This allows for a better formatting of identifiers and constants. However, we thusly remove the underscore character completely from the CLEOPATRA character set as outlined in productions (1.1) through (1.5).

(1.1) letter ::= A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z

(1.2) delimiting_character ::= ( | ) | • | blank_character

Brackets [ ] are also considered part of the character set; they serve as delimiting characters and are pairwise equivalent to parentheses ( ).

(1.3) special_character ::=d|#l$|X|5|*|-| «~l = I " 1 ? I / I * 1 £ I < 1 > I 1 I * 1

(1.4) digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

(1.5) control_character ::= backspace | end_of_source_record

The concrete representation of end_of_source_record is implementation and terminal dependent. It may be a state of the lexical analyzer automaton rather than an actual character code. It only serves as a delimiting boundary and to delimit an in-line comment; see sections 1.4 and 1.5.

In terms of the IBM System/360, the anticipated host family of machines, this character set is a compromise between a 029 keypunch together with the regular high speed printer PN train character set, and a 2741 terminal with the COURIER 72 typeball together with the TN train character set.

1.4 Token delimiters.

This report, with the exception of sections 2.1 and 2.2, discusses CLEOPATRA programs as a sequence of tokens. Tokens are keywords with a predefined meaning in the language (summarized in appendices A, B, and D), identifiers and operators whose meaning is determined by their usage in the language, and constants, special keywords which can be evaluated and typed at compilation time.

The lexical analyzer in the CLEOPATRA compiler constructs a CLEOPATRA program as a sequence of tokens from an incoming string of characters. To be recognizable, tokens must be surrounded by delimiting boundaries. Between tokens may appear an arbitrary number of blank characters and ends of source records.

Delimiting boundaries are represented as states of the lexical analysis automaton, and they are recognized as follows: Characters are ordered into the five classes described by productions (1.1) to (1.5).
A delimiting boundary occurs between any two characters, except that there is no delimiting boundary between

  two letters
  a letter and a digit
  a digit and a letter
  two digits
  two special characters

Additionally, no delimiting boundary occurs within a constant as described in section 2.2. That section describes the formatting and processing of constants in detail. Section 2.1 discusses identifiers and operators.

1.5 Comments.

A comment is equivalent to a blank character, and may occur wherever a blank character is legal. It is replaced by a blank character before any further processing occurs. Comments take the following two forms:

(1.6) comment ::= COMMENT any sequence of characters with the exception of a semicolon ;

(1.7)         ::= ! any sequence of characters with the exception of an end_of_source_record end_of_source_record

COMMENT is a system reserved word; note that it must be surrounded (like any other keyword) by delimiting boundaries. Production (1.7) denotes the only explicit use which the character ! has within CLEOPATRA; observe, however, that ! can be embedded in a literal value (section 2.2.5) and does not initiate a comment in that context. It should be noted that ! does not appear on a printout with the PN printer train of the host machine.

2. Identifiers, constants.

2.1 Identifiers.

An identifier in CLEOPATRA consists of an arbitrary number of characters as outlined below; however, for reasons of compilation economy, an implementation would probably limit all identifiers to 31 or fewer characters in length. We have two classes of identifiers, those naming only operators, and those naming all other quantities.
They are formed as follows:

(2.1) identifier ::= letter [ letter | digit ]*

(2.2) operator ::= { special_character }* | identifier

We would like to recall that identifiers may be formatted for better readability by use of the transparent _ (underscore) character, see section 1.3. The use of this character does not extend the actual length of an identifier. Identifiers and operators may not contain any characters other than those specified above; in particular, they may not be broken across source record boundaries, and they may not contain embedded blank characters.

E.2.1 Examples.

Valid identifiers are

  a_1   a1   a_1_

The preceding line indicates three different ways to write the same identifier. Valid operators are

  +   -   <=   +!the operator + immediately followed by a comment

Invalid as identifiers or operators are

  1a      ! a digit cannot start an identifier
  +real   ! must not mix special characters and letters
  a 1     ! no intervening blank characters

2.2 Constants.

A constant is a special kind of keyword representing an instance and a value of a certain data type. Section 4.1 describes the built-in data types (which are to some extent specific to the IBM/360 family of computers). Corresponding to each built-in data type there is a way to write a constant of this data type; the name of each of these constants immediately indicates its data type and its value. Further constants can be created at execution time through the use of CONVERSION operators, CONSTANT declarations, etc.

We cannot provide a facility to write constants for user-defined data types to be evaluated at compile time. Such a facility would have to assume existence and call on services for these user-defined data types. This would amount to a redefinition of the language, i.e., to a syntactic and semantic extension of the recognized language, for which CLEOPATRA does not provide.
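The identifier rules of section 2.1 can be illustrated with a short Python sketch. It is not taken from the report: the function name `canonical_identifier` and the constant `MAX_LENGTH` are hypothetical, and the limit of 31 characters is only the value the report says an implementation "would probably" choose. The sketch covers production (2.1) only; operators built from special characters are left out.

```python
import re

# Production (2.1): identifier ::= letter [ letter | digit ]*
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Assumed implementation limit; the report only calls it probable.
MAX_LENGTH = 31


def canonical_identifier(text: str) -> str:
    """Return the identifier with the transparent underscore removed.

    Raises ValueError if the remaining characters do not form an
    identifier according to production (2.1).
    """
    # The underscore is transparent to the lexical analyzer (section 1.3),
    # so it is dropped before the identifier is checked or measured.
    stripped = text.replace("_", "")
    if not _IDENTIFIER.fullmatch(stripped):
        raise ValueError(f"not an identifier: {text!r}")
    if len(stripped) > MAX_LENGTH:
        raise ValueError(f"identifier too long: {text!r}")
    return stripped
```

Under these assumptions the three spellings a_1, a1, and a_1_ from E.2.1 all map to the same name, while 1a, +real, and "a 1" are rejected.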
(2.3) constant ::= integer | long_integer | byte | real | long_real | decimal | bit | literal_value

Observe that there is a SIZE condition: e.g., one may in general write "integers" of arbitrary value; however, their machine representation may not fit the allocated amount of storage for "integer". For constants described in this section the SIZE condition is detected at compilation time. What magnitude for a particular data item will cause the SIZE condition to be raised is implementation (i.e., target machine) dependent; in section 4.1 we will outline the relevant figures for an implementation on the IBM/360.

Constants, as described in this section, are recognized and evaluated by the lexical analysis part of the CLEOPATRA compiler. When leaving the lexical analyzer, there will only be a token "constant of value v and of type t". This token is constructed in the lexical analyzer from a number of (sub)tokens identified by delimiting boundaries as described in section 1.4. As a result of this function of the lexical analyzer there will be a number of delimiting boundaries in the context of constants which do not give rise to individual tokens used in the further recognition of the language.

The most outstanding of these constructions concerns the recognition of a signed constant. The rule is that no constant may contain any blank characters. If a minus_symbol is attached to the left of a string of hexadecimal digits with no intervening blank, it will be taken for a sign symbol for the constant. If there is an intervening character, the symbol is passed on as a separate token (an operator) even if this subsequently would result in syntactic errors.

Like identifiers and operators, constants may not contain blank characters and (with the exception of literal values) may not be broken across source record boundaries. They may, of course, contain underscores; see section 1.3.

2.2.1 Basic constants.
Constants, as described below, are constructed from certain digit strings. These digit strings are written with respect to different bases as follows:

(2.4) decimal_digit ::= digit

(2.5) hexadecimal_digit ::= digit | A | B | C | D | E | F

(2.6) octal_digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7

(2.7) binary_digit ::= 0 | 1

(2.8) minus_symbol ::= -

(2.9) decimal_string ::= [ minus_symbol ] { decimal_digit }*

(2.10) hexadecimal_string ::= X. [ minus_symbol ] { hexadecimal_digit }*

(2.11) octal_string ::= O. [ minus_symbol ] { octal_digit }*

(2.12) binary_string ::= B. [ minus_symbol ] { binary_digit }*

(2.13) basic_constant ::= decimal_string | hexadecimal_string | octal_string | binary_string

The strings above have a numerical value as follows: the sequence of digits is evaluated in the respective base, and if a minus_symbol is present the resulting value is multiplied by -1.

E.2.2.1 Examples.

The following are basic constants of decimal value 10:

  10   X.A   O.12   B.1010

The following are basic constants of decimal value -10:

  -10   X.-A   O.-12   B.-1010

The following are not basic constants of decimal value -10:

  - 10   X. -A   B.1110110

The last of these is a basic constant of decimal value 261; CLEOPATRA does not recognize a negative base-2 constant in complement notation.

2.2.2 Integers, bits and bytes.

(2.14) integer ::= basic_constant

(2.15) long_integer ::= F. integer

(2.16) byte ::= Q. integer

(2.17) bit ::= S. integer

An integer is of type INTEGER, a long_integer is of type LONG_INTEGER, a byte is of type BYTE, and a bit is of type BIT. See section 4.1 for the SIZE limitations on these types. The size qualifiers are F (full-word), Q (quarter word), and S (single bit).

E.2.2.2 Examples.

The following are constants of maximum and minimum value for their respective types (assuming an IBM/360 host machine):

  32767               X.7FFF
  -32768              X.-8000
  F.2_147_483_647     F.O.1_77777_77777
  F.-2_147_483_648    F.O.-2_00000_00000
  Q.255               Q.B.1111_1111
  Q.0                 Q.X.0
  S.1                 S.0

2.2.3 Real numbers.

(2.18) decimal_real_string ::= decimal_string . { decimal_digit }*

(2.19) hexadecimal_real_string ::= hexadecimal_string . { hexadecimal_digit }*

(2.20) octal_real_string ::= octal_string . { octal_digit }*

(2.21) binary_real_string ::= binary_string . { binary_digit }*

(2.22) basic_real ::= decimal_real_string | hexadecimal_real_string | octal_real_string | binary_real_string

(2.23) exponential_part ::= P basic_constant

(2.24) real ::= basic_real [ exponential_part ]

(2.25)      ::= basic_constant exponential_part

(2.26) long_real ::= L. real

Reals have a numerical value as follows: the radix point (.) (which must be surrounded by digits) is interpreted with respect to the base of the digit string into which it is embedded. The exponential_part is interpreted as exponent for the same base in which this exponential_part was written.

A real is typed REAL, a long_real is typed LONG_REAL; see section 4.1 for SIZE limitations. In addition to the SIZE condition, the SIGNIFICANCE_LOSS condition could be raised at compile time if a real or long_real were specified with more (low-order) significant digits than the implementation can accommodate.

To summarize the preceding productions: a real is an integer with either a radix point (.), or an exponential_part, or both.

E.2.2.3 Examples.

The following describe the largest and smallest possible values for the types REAL and LONG_REAL on an IBM System/360 machine:

  X.FFF_FFF_PX.39
  X.-FFF_FFF_PX.39
  L.X.FFF_FFF.FFFF_FFFF_PX.39
  L.X.-FFF_FFF.FFFF_FFFF_PX.39

Values of types REAL and LONG_REAL which are normalized values closest to zero on an IBM System/360 machine are:

  X.1_PX.-41
  X.-1_PX.-41
  L.X.1_PX.-41
  L.X.-1_PX.-41

2.2.4 Decimal numbers.

(2.27) decimal ::= D. basic_constant

Decimals are typed DECIMAL'n where n is just sufficient to hold the value of the constant.
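Since decimals, like integers and reals, are built on basic_constant, the evaluation rule of section 2.2.1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the report's lexical analyzer. The names `basic_constant_value` and `decimal_size` are hypothetical; the octal qualifier is taken to be the letter O; and the size computation assumes that n counts bytes of System/360 packed decimal storage (two digits per byte, with one half-byte for the sign), which the report defers to section 4.1.

```python
import re

# Base qualifiers of productions (2.10) to (2.12); no qualifier means base 10.
_BASES = {"X": 16, "O": 8, "B": 2}
_BASIC = re.compile(r"(?:([XOB])\.)?(-?)([0-9A-F]+)")


def basic_constant_value(text: str) -> int:
    """Evaluate a basic_constant such as 10, X.A, O.12, B.1010 or X.-A."""
    # Underscores are transparent outside literal values (section 1.3).
    stripped = text.replace("_", "")
    match = _BASIC.fullmatch(stripped)
    if match is None:  # e.g. an embedded blank, as in "- 10" or "X. -A"
        raise ValueError(f"not a basic_constant: {text!r}")
    qualifier, sign, digits = match.groups()
    base = _BASES.get(qualifier, 10)
    value = int(digits, base)  # ValueError if a digit does not fit the base
    # A minus_symbol multiplies the evaluated digit string by -1.
    return -value if sign else value


def decimal_size(text: str) -> int:
    """Size n of a decimal constant D.<basic_constant>.

    ASSUMPTION: n is the number of bytes of a packed decimal field,
    i.e. two digits per byte plus one half-byte for the sign.
    """
    if not text.startswith("D."):
        raise ValueError(f"not a decimal constant: {text!r}")
    digits = len(str(abs(basic_constant_value(text[2:]))))
    return (digits + 2) // 2
```

Under these assumptions the sketch agrees with the examples of sections 2.2.1 and 2.2.4: the four spellings of 10 and of -10 evaluate alike, and D.10, D.X.F0, and D.-240 all come out with size 2.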
DECIMAL'n is a built-in data type realized in the coder phase of the CLEOPATRA compiler, which is very specific to the IBM System/360 hardware. See section 4.1 for a more detailed discussion.

E.2.2.4 Examples.

The following are some examples for decimals, together with their resulting size n:

  D.10      size 2
  D.X.F0    size 2
  D.-240    size 2

2.2.5 Literal values.

(2.28) literal_value ::= C. any sequence of characters up to and not including the first following blank_character

A literal_value is interpreted as follows:

  the first character following the qualifier C. begins the literal_value.

  the first blank_character following the qualifier C. ends the literal_value.

  a literal_value is of type CHARACTER; its maximum anticipated length and its current length both equal the number of characters (including backspaces and excluding end_of_source_record indications) between and not including the qualifier C. and the first blank_character following it, adjusted for the representation of underscores as outlined below.

  end_of_source_record is transparent to a literal value and does not become part of it.

  the underscore character _ is not transparent to the interpretation of a literal_value: if not followed or preceded by a backspace, it represents a blank_character in the literal_value; each sequence underscore backspace underscore recursively represents a single underscore and is counted as a single character.

  backspacing and overtyping in general is a feature of literal values; however, each backspace must be followed by a character other than a backspace or a blank_character; backspaces may neither start nor end a literal_value.

  backspace thus serves as a control character for overtyping in literal values; it cannot appear for itself in a literal value.

  overtyping is not a commutative operation, and an underscore cannot be overtyped immediately with another underscore.

  a ! within a literal value does not initiate a comment; it is considered to be part of the literal value.

A literal_value with too many participating characters may raise the SIZE condition at compile time. For limitations, see section 4.1.

It should be noted that the representation of a literal_value upon entry to the lexical analysis part of the CLEOPATRA compiler depends extremely on the type of communication medium used. It is expected that lexical analyzers corresponding to other types of communication than a typewriter-style terminal will choose a different representation. It is also expected, however, that overtyping is a feature of literal values. Overtyping must be done character by character, and will have to be anticipated in output operations on line oriented peripherals.

E.2.2.5 Examples.

In these examples for literal values we use the symbol ■ to represent a blank character, and the symbol • to represent a backspace where appropriate.

       constant       value    length
  1    C.ABC          ABC      3
  2    C.A_C          A■C      3
  3    C.A_•_C        A_C      3
  4    C.A« B         AB       4
  5    C. A»_«_B      AB       4
  6    C. «A« B       AB       6
  7    C._»AB         AB       4
  8    C.A_* D        A_«D     4

Note that while constants 4 and 5 above are equal, they are both considered not equal to constants 6 or 7, in spite of constants 6 or 7 appearing to be the same in print as constants 4 and 5. Constant 6 is an example of an overtyped underscore.

2.3 System supplied values.

(2.29) system_supplied_value ::= type { LARGE | NIL | SMALL }

(2.30) system_supplied_constant ::= E | FALSE | FIRST | LAST | PI | TRUE

A system_supplied_value is an instance of a data type with an implementation dependent value, as defined below. For user-defined data types it is created at execution time by a call to the creation time routines for the specified type with suitable parameters. (Types are discussed in section 4.) A system_supplied_constant described in this section must be used according to the returned type, or conversion operators must be applied.
The system supplied quantities are as follows:

E is the mathematical number e with as many significant digits as the implementation allows. It is returned as LONG_REAL.

FALSE is a bit of value S.0. It is returned as BIT.

FIRST is the first character in the implementation defined collating sequence. It is returned as CHARACTER of current and maximum anticipated length 1.

LARGE is the largest positive numerical value representable on the machine. As CHARACTER (expression), LARGE is a literal value of current and maximum anticipated length as specified in the expression (which must be of type INTEGER and is defaulted to LARGE if absent), consisting of copies of LAST. As POINTER, LARGE is identical to NIL. LARGE is returned according to the requested type. LARGE is expanded recursively for user-defined types, by calling the generation time routines for the type, and initializing all fields which are not explicitly initialized to LARGE rather than NIL. Consequently, LARGE may produce significant side-effects and error conditions.

LAST is the last character in the implementation defined collating sequence. It is returned as CHARACTER of current and maximum anticipated length 1.

NIL is a generic nil value. NIL is defined to be numerically zero for the types INTEGER, LONG_INTEGER, BYTE, BIT, DECIMAL'n, REAL, and LONG_REAL. NIL is a universal POINTER to no object. As a CHARACTER (expression), NIL is a literal value of current and maximum anticipated length as specified in the expression (which must be of type INTEGER and is defaulted to NIL if absent), consisting of copies of FIRST. NIL is returned according to the requested type. NIL is expanded recursively for user-defined types, by calling the generation time routines for the type, and initializing all fields which are not explicitly initialized to NIL. This is also the default initialization for any user-defined type. NIL thusly may produce side effects.
PI is the mathematical number pi with as many significant digits as the implementation allows. It is returned as LONG_REAL.

SMALL is the smallest positive numerical value representable on the machine. As a CHARACTER (expression), SMALL is a literal value of current and maximum anticipated length as specified in the expression (which must be of type INTEGER and is defaulted to SMALL if absent), consisting of copies of FIRST. As POINTER, SMALL is identical to NIL. SMALL is returned according to the requested type. SMALL is expanded recursively for user-defined types, by calling the generation time routines for the type, and initializing all fields which are not explicitly initialized to SMALL rather than NIL. Consequently, SMALL may produce significant side-effects and error conditions.

TRUE is a bit of value S.1. It is returned as BIT.

The request for a system_supplied_value must observe the usual restrictions on parameters for the respective types. Error conditions would generally be raised at execution time, not at compile time as for constants.

E.2.3 Examples.

We have the following constants available as system supplied values:

  E  = L.2.71828_18284_59045_23536
  PI = L.3.14159_26535_89793_23846
  (INTEGER NIL) = 0
  (LONG_INTEGER NIL) = F.0
  (CHARACTER NIL) = C.
  (CHARACTER (3) NIL) = FIRST " FIRST " FIRST
  (INTEGER LARGE) = 32767
  (BIT LARGE) = S.1
  (CHARACTER (1) LARGE) = LAST
  (CHARACTER SMALL) = FIRST = CHARACTER (1) NIL
  (BIT SMALL) = S.1
  (INTEGER SMALL) = 1

" is the concatenation operator for CHARACTER data items; it is discussed in appendix B.

3. Configurations.

A CLEOPATRA program consists of one or more configurations. Configurations in turn consist of one or more blocks. They provide linking information in a structure block, data creation information in a data block, and executable code describing an algorithm in a routine block.
Each of these blocks is separately compilable; together they form configurations: a type pack or an algorithm, which then can be used by other configurations. With a block for compilation the user must also imply or specify the environment into which the block is to be compiled; CLEOPATRA will then choose from its given library of configurations the appropriate linking information and global data information needed to compile properly.

Blocks, not configurations, may be compiled and recompiled separately. Such a recompilation will subsequently require other recompilations only if calling conventions or globally accessible data are changed. The CLEOPATRA compiler itself can determine the extent of recompilation necessary to update its library of compiled blocks, and could even initiate the recompilation.

Configurations form the outermost structure of CLEOPATRA programs. They are discussed at this point primarily to provide the background for the later sections of this report. These later sections will then discuss exactly how the contents of a configuration are to be created, used, and interpreted.

Configurations are statically nested for compilation in a tree fashion. When discussing the scope of names, we shall refer to the predecessor, the current, and the successor configuration, and to the predecessor, current, and successor trees of configurations. The nesting of configurations into these trees is discussed in section 3.3.2.

3.1 Building blocks of a configuration.

A configuration consists of one or more blocks, each describing a particular aspect of the functioning of the configuration. Section 5 will discuss the construction of each of these blocks in detail. In the current section we will merely sketch their functional aspects.

The different blocks are connected by their common configuration_name, which in some cases also serves to call on the services of the configuration. As is described in section 5, a configuration_name is simply an identifier, used in different semantic contexts. Its scope is throughout the current configuration tree, unless redefined. The scope may be enlarged to extend over the predecessor tree; see section 3.1.2. Explicit reference to the configuration as a whole is made, for compilation purposes, through a structured reference, see section 3.4.

There are three aspects of an algorithm that a program has to express. The algorithm operates on certain data items, it has certain execution rules, and it may dispatch work and call on the services of other algorithms. Corresponding to these tasks in describing an algorithm, CLEOPATRA provides three kinds of blocks: data blocks, describing the data items on which an algorithm will operate; routine blocks, describing the execution rules about to be carried out; and structure blocks, detailing the conventions according to which subalgorithms are to be invoked.

There is a further level of differentiation according to function for these blocks. This further differentiation deals mostly with scope rules for names delineating access and, in some instances, life expectancy for the items thus described. This is not merely a semantic limitation, however. Scope rules determine levels of abstraction; they are provided to enable a conceptually modular design of algorithms, as well as a physically modular packaging of the resulting machine code.

3.1.1 Routine blocks.

There are several kinds of routine blocks, differentiated according to function and calling conventions:

   procedures are supplied with a list of parameters (or possibly no parameters at all). They return values.

   operators have one or two principal arguments, and possibly one or two lists of parameters. They return values.

   conversions are a special notation for unary operators, intended to make the text of a routine more meaningful. As operators, they return values.

3.1.2 Structure blocks.

Structure blocks provide a forward referencing capability and calling conventions for configurations yet to be defined. There are two kinds of structure blocks, differentiated according to the scope of their entries:

   (local) structure blocks list all configurations which (unless redefined) may be employed throughout the current configuration tree.

   global structure blocks list all configurations which (unless redefined) may be employed throughout the predecessor configuration tree.

Structure blocks do not result in the generation of any executable code. They merely determine the nesting of configurations and thus the scope of names, and, reversing the nesting of configurations, the levels of abstraction within an algorithm. Global structure blocks are supplied to allow one particular abstraction, the packaging of algorithms pertaining to a user-conceived data type.

3.1.3 Data blocks.

There are three kinds of data blocks, differentiated according to function, and according to the scope (and life expectancy) of their entries:

   (local) data blocks describe data items available only to the current configuration.

   global data blocks describe data items available to the current configuration tree, unless the names are later redefined.

   type data blocks describe the creation of the underlying representation of a user-defined data type. Their entries are explicitly accessible in the current configuration tree, and can be made explicitly readable in the predecessor tree.

3.1.4 Recompiling rules.

With the information packaged as described, we can now state the maximal amount of recompilation necessary due to a change to a single block. An implementation can actually do with less recompilation if more descriptive information is collected and retained with each compiled block. In this section, the "current" configuration is about to have one of its constituent blocks changed and recompiled.

   Recompilation of the current routine block will cause no further recompilations (the calling conventions are described in the predecessor structure block).

   Recompilation of the current (local) structure block may require recompilation of the current configuration tree, since the calling conventions can change.

   Recompilation of the current global structure block may require recompilation of the predecessor configuration tree, since the calling conventions can change.

   Recompilation of the current (local) data block may require recompilation of the current configuration. It should, however, not require recompilation of the successor tree.

   Recompilation of the current global data block may require recompilation of the current configuration tree. We can eliminate some structure blocks from this recompilation.

   Recompilation of the current type data block may require recompilation of the current configuration tree, or, if entries in it were shared (i.e., were made externally readable), of the predecessor tree.

It should be noted that as a whole the recompiling rules coincide nicely with the levels of abstraction achieved within the program.

3.2 Available configurations.

A configuration consists of a number of the building blocks described in section 3.1. The semantic meaning of a configuration is determined by the combination of these blocks. The blocks are combined by having identical configuration_names.

   (3.1) configuration ::= type_pack | algorithm

   (3.2) type_pack ::= global_structure_block [ structure_block ] [ data_block ] type_data_block

   (3.3) algorithm ::= [ structure_block ] [ global_data_block ] [ data_block ] routine_block

A type_pack describes the underlying representation of a user-defined data type and lists the algorithms manipulating instances of the type.
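Productions (3.1) through (3.3) can be read as a membership test on the set of blocks that share one configuration_name. The following is a minimal sketch in Python, not CLEOPATRA; the string labels for block kinds and the function name classify are assumptions made for illustration only. Order is ignored, since blocks may be presented for compilation in any order.

```python
# Hypothetical labels for block kinds; CLEOPATRA itself uses keywords, not strings.
REQUIRED = {
    "type_pack": {"global_structure_block", "type_data_block"},
    "algorithm": {"routine_block"},
}
OPTIONAL = {
    "type_pack": {"structure_block", "data_block"},
    "algorithm": {"structure_block", "global_data_block", "data_block"},
}


def classify(blocks):
    """Return 'type_pack', 'algorithm', or None for the block kinds
    collected under a single configuration_name."""
    kinds = set(blocks)
    if len(kinds) != len(blocks):
        return None  # the productions allow each kind of block at most once
    for name in ("type_pack", "algorithm"):
        allowed = REQUIRED[name] | OPTIONAL[name]
        if REQUIRED[name] <= kinds <= allowed:
            return name
    return None
```

Under these assumptions, classify(["routine_block", "data_block"]) yields "algorithm", while a routine block combined with a type data block matches neither production.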
Outside the type_pack there should exist only the abstracted idea of the data type, instances of which are to be manipulated by algorithms encoded within the type pack. Only these routines, by virtue of their inclusion in the type pack, may access the underlying representation explicitly; outside the type pack the representation should be transparent.

An algorithm is the usual "procedure" consisting of data items, execution rules, and references to subalgorithms for manipulation of the data items, etc.

Algorithms generally will employ the services of type_packs for their encoding. Type_packs require algorithms to carry out the actual data manipulations.

Blocks may be presented for compilation in any order. With each block there must be an indication of the environment into which the block is to be compiled; see section 3.4.

The only crucial rule about the order of compilation is that items must be defined before they can be used in declarations, and that items must be declared before they can be used. Definitions can be implicit (e.g., data types or procedures known to the compiler) or explicit (e.g., structure block entries define a type or an algorithm); declarations are explicit (e.g., data item declarations in data blocks or switches and labels preceding their usage in a routine block).

The exception to the define/declare before use rule is provided by calls on configurations. A routine block may, for example, issue a call on an algorithm after the algorithm was defined by its structure block entry, but before it was actually elaborated by its configuration. This exception is necessary to allow recursive and top-down programming. For reasons of documentation and compilation economy, we require that the structure block entries completely specify the calling conventions for each configuration.

3.3 Scope rules.

In this section we will describe the rules governing the scope of names and accessibility.
Most names have a scope delineated by configurations or configuration trees. We therefore also describe how configurations are nested.

3.3.1 Identifier classes.

Identifiers must be unique. The same identifier may in general not be used to simultaneously describe more than one entity. In order to avoid having to generate a multitude of names, this uniqueness requirement is modified and takes into account the accessibility and usage of the items described by the identifiers.

Identifiers are grouped into classes, and identical identifiers may simultaneously appear in more than one class. Identifier classes group identifiers on the basis of their usage.

Within a class, the accessibility of an object depends on and determines the scope of the identifier naming the object. The scope of an identifier is delineated by the syntactic unit which delineates the accessibility of an object.

The identifier classes and the syntactic units delimiting the scopes of their members are as follows:

   configuration names derive their scope within configuration trees. Their scope begins with the structure block entry for the configuration.

   data item names have as a scope a configuration, or they derive their scope within a configuration tree. Their scope begins with the data block entry describing the data item.

   subfield names in a data group have as their scope a data group.

   switches have as their scope a compound statement. (See section 7.2.3.)

   labels derive their scope within a statement. (See section 7.3.)

Identical identifiers may appear more than once in the same usage class, provided their scopes do not begin at the exact same level of scope nesting. If the scopes of two identifiers in the same class intersect at all, one scope will be included in the other, due to the inclusion ordering (nesting) of the syntactic units delimiting the scope of identifiers.
If the scopes of two identical identifiers intersect but do not begin at the same level of nesting, the smaller scope preempts the larger scope; the object described by the identifier with larger scope is thus rendered inaccessible throughout the smaller scope. A partial exception to the inaccessibility rule is provided by the BUILT_IN declaration for operators, and by the possibility to declare ALIAS identifiers.

Certain names additionally may not be formed due to conflicts with the list of system reserved words.

3.3.2 The nesting of configurations.

Structure blocks describe the nesting of configurations. All configurations which the current configuration will use, explicitly or implicitly, must be referenced in a structure block belonging either to the current configuration or to some preceding configuration. All configurations referenced in the structure block of the current configuration become internal to this configuration. Their configuration names are known throughout the current configuration tree in the sense of section 3.3.1, i.e., unless they are redefined in some successor configuration.

The preceding paragraph applies to global structure blocks as well. However, configurations referenced in the global structure block of the current configuration, while internal to the current configuration, are usable and known by their names throughout the predecessor configuration tree.

It should be emphasized that the name of a configuration may involve different identifiers. For the purpose of connecting the constituting blocks of an operator algorithm, an identifier is used as a configuration name. For the purpose of calling on the services of the algorithm, an operator is used, and the recognized name of the operator also involves the types of its principal operands. Details are discussed in section 5.2.4. Both naming possibilities are subject to the same scope considerations as described above.
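The preemption rule of section 3.3.1 can be sketched as a lookup that starts at the current configuration and walks toward the root of the tree, keyed by usage class as well as identifier. This is a Python illustration only; the class Configuration, its methods, and the sample names are hypothetical, and the sketch deliberately omits global structure blocks, BUILT_IN, and ALIAS.

```python
class Configuration:
    """One node of a configuration tree (hypothetical model)."""

    def __init__(self, name, predecessor=None):
        self.name = name
        self.predecessor = predecessor
        self.entries = {}  # (usage_class, identifier) -> described object

    def define(self, usage_class, identifier, obj):
        key = (usage_class, identifier)
        if key in self.entries:
            # Two scopes of the same class may not begin at the same level.
            raise ValueError("identifier already defined at this level")
        self.entries[key] = obj

    def resolve(self, usage_class, identifier):
        node = self
        while node is not None:
            key = (usage_class, identifier)
            if key in node.entries:
                return node.entries[key]  # the smaller scope preempts the larger
            node = node.predecessor
        raise LookupError(identifier)


main = Configuration("MAIN")
main.define("configuration", "A", "outer A")
main.define("data item", "A", "a data item also named A")
inner = Configuration("A", predecessor=main)
inner.define("configuration", "A", "inner A")
```

Here inner.resolve("configuration", "A") finds "inner A", hiding the outer definition, while the data item named A remains reachable from inner because it belongs to a different identifier class.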
These nesting and referencing rules for a global structure block are designed to accommodate user-defined data types through type packs as follows: A type pack in its global structure block will reference all algorithms that apply to the underlying representation of the type. Subsequently, when data items of the type are introduced, the type pack must have been referenced in some structure block and consequently all the algorithms in its global structure block may be referenced within the scope of the type data item declaration.

The preceding discussion means that the current configuration can reference any of the following configurations:

   those that are described in the global or local structure block of the current configuration, or in the global or local structure block of a preceding configuration and whose recognized names have not been redefined, or

   configurations which are described in the global structure block of a configuration which has been introduced in the structure block of the current configuration, or of a preceding configuration, and whose recognized names have not been redefined, or, finally,

   any system configurations whose names have not been redefined or which have been reintroduced as BUILT_IN; see section 5.1.1.

3.3.3 Recursion.

The discussion in the preceding section implies that a configuration (algorithm or type_pack) may call on itself, directly or through various intermediate levels of further calls. In general, this feature is intended, and at the same time may cause somewhat unintended results. For example, a type_pack, calling on itself indiscriminately, quickly may cause the exhaustion of available computer memory; an algorithm trapped into calling itself also will exhaust available memory (but relatively slowly) with the allocation of descriptions for its parameters. CLEOPATRA does support recursive programming and leaves the judicious use of the feature up to the user.
Services are provided to monitor or raise the RECURSION condition in critical sections of the program.

3.3.4 Referencing data.

Data blocks describe the data items on which a particular routine operates. It has been pointed out that well structured programs should avoid indiscriminate use of global variables. Consequently, CLEOPATRA provides more than just the scope rules of ALGOL 60.

A (local) data_block declares just those data items which are local to the configuration possessing the local data block. Those data items exist as long as the configuration is active (i.e., as long as control has not been returned from it), but their names are local to the configuration, and they cannot be referenced in any successor configuration. We thus provide a means to avoid global variables and to construct variables purely local to an algorithm.

A global_data_block declares all data items which have the ALGOL 60 scope, i.e., which can be referenced in the current configuration tree unless their names later are redefined. These data items also exist as long as the current configuration is active.

A type_data_block describes all data items which constitute the underlying representation of a user-defined data type. Its entries are created at the creation of each instance of the user-defined data type and exist as long as the instance exists. The data items are referenced throughout the current configuration directly, and throughout all successor configurations with an indication as to which instance of the data type they belong to.

Data items residing in a type_data_block may be given the SHARE attribute. In this case they can be referenced, with an indication as to which instance of the data type they belong to, everywhere where the instance owning the type_data_block is referencable.
However, such a reference outside the regular scope is read-only; the CLEOPATRA compiler will not permit any such reference to appear as a storable reference; see section 6.2.

The preceding discussion applies to data items residing in automatic storage, i.e., data items which are automatically created on configuration entry and which are automatically destroyed when control returns from the configuration. CLEOPATRA also supports data items which are subject to deferred allocation (section 5.3). They are described in data blocks, and the descriptions thus receive a referencability scope, but it is the user's responsibility to cause creation and deletion of the data item. Access to the data items is obtained by using pointer values constructed during creation of an instance of the data item.

Formal parameters for a configuration must be declared in one of the configuration's data blocks, and thus receive a scope for their referencability. [ A type_pack may own a local data_block precisely to allow some parameters to be local to the creation of one instance of the data type and to prevent those parameters from being accessible by any successor configuration. ]

Formal parameters for a configuration must be unique; the same formal parameter may appear but once among all formal parameters for the configuration. Also, the corresponding actual parameters should be unique. They are always passed by address and supplying the same actual for two distinct formals may cause confusion.
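The confusion caused by supplying the same actual for two distinct formals can be illustrated with a small sketch. Python has no parameter passing by address, so a one-element list stands in for an address here; that device, and the routine add_then_double, are assumptions of this example and not part of CLEOPATRA.

```python
def add_then_double(a, b):
    """Both formals are one-element lists standing in for addresses."""
    a[0] = a[0] + b[0]  # store through the first formal
    b[0] = b[0] * 2     # store through the second formal


# Distinct actuals: each formal refers to its own storage.
x, y = [1], [2]
add_then_double(x, y)
distinct = (x[0], y[0])

# The same actual supplied for both formals: the two stores interfere.
z = [1]
add_then_double(z, z)
aliased = z[0]
```

With distinct actuals the results are 3 and 4. With the shared actual the single cell ends up holding 4, because the second statement doubles the sum that the first statement has just stored into the same location.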
To summarize: a configuration can reference any of the following data items:

   those residing in the configuration's own local, global, or type data block,

   those residing in the global data block of a preceding configuration, unless the names were redefined,

   those residing in the type data block of a preceding configuration if an instance of the type is available, and finally

   those data items with the SHARE attribute residing in a type_data_block referencable by the current configuration. These latter data items must be referenced read-only, and with an indication of ownership of the data item.

3.4 The compilation environment.

A compilation presented to the CLEOPATRA compiler consists of one or more blocks in sequence. Such a compilation may optionally be preceded by a request selecting the configuration into which the upcoming blocks are to be compiled. An implementation will specify a default environment for all compilations which may be some given operating system or the initial program loading (IPL) situation of the host machine.

   (3.4) compilation ::= [ environment_request ] { structure_block | global_structure_block | data_block | global_data_block | type_data_block | routine_block } •

   (3.5) environment_request ::= COMPILE INTO configuration_reference

   (3.6) configuration_reference ::= configuration_name [ . configuration_name ] •

The configuration names must uniquely specify a path through the tree of already (partially) compiled configurations accessible to the CLEOPATRA compiler in this run; the configuration_reference may omit intermediate nodes in the tree, if this is possible in a unique fashion.

The environment_request specifies the structure blocks of the configuration at the end of the path through the tree of compiled configurations. The blocks about to be compiled, by their configuration names, then are added and form successor configurations. There can be no ambiguity as to
which configurations are being resolved by the blocks being added, since the configuration names of these blocks must have been defined previously by structure block entries, and since these names are unique within their scope. A new environment_request must be written to return to the environment of a redefined name.

Each environment_request serves to condition the CLEOPATRA compiler tables into a state in which all declarations preceding the presented set of blocks have been scanned.

E.3.4 Examples.

We use a pseudo language for these examples. They are merely intended to illustrate the concept of compilations. In all names, capital letters only are assumed to have been supplied by the user, while the digits serve to distinguish like names for the explanations.

Using the Operating System/360, a main program may be entered into the system under an assumed structure block entry

   PROCEDURE MAIN RETURNS LONG_INTEGER

where the LONG_INTEGER would supply the usual return code. (This example assumes that an implementation of CLEOPATRA/360 would conform to the standard calling conventions for OS/360.) If such an operating system is assumed by the implementation of the compiler, a compilation may be

   COMPILE INTO OS
   DATA MAIN
      • • •
   END
   STRUCTURE MAIN
      • • •
      PROCEDURE A_1
      • • •
   END
   PROCEDURE MAIN
      • • •
      call on A_1
      • • •
   END

We could now add the algorithm A_1, or we could supply it at a later date for compilation as

   COMPILE INTO OS.MAIN
   STRUCTURE A_1
      • • •
      PROCEDURE A_2
      • • •
   END
   DATA A
      • • •
   END

Note that the data_block A would be interpreted as responding to the structure block entry for the algorithm A_2. If it were to belong to algorithm A_1, it would have to be preceded by an environment request of the form

   COMPILE INTO MAIN

The final example addresses itself to the problem of recursive calling.
Consider, continuing the above,

   COMPILE INTO MAIN.A_1
   PROCEDURE A
      • • •
      call on A
      • • •
   END

The routine block is compiled corresponding to the structure block entry as procedure A_2. The call on a procedure A, assuming that no procedure by the name A was defined in the structure block of A_2, would then be recursive on procedure A_2.

4. Types.

Data items used by CLEOPATRA programs can be of built-in "basic" type, or they can be of user-defined type. Basic types in CLEOPATRA reflect the hardware of the host machine, or a major convenience for the programmer. User-defined types obtain their semantic meaning through a user-defined type_pack for the type.

Section 4.1 of this report will discuss the basic types for an implementation on the IBM System/360. Most of these basic types would be found in almost any implementation of CLEOPATRA. Section 4.2 discusses special considerations concerning the data type POINTER, i.e., considerations applying to all data items whose allocation is completely managed by the user. Section 4.3 discusses user-defined data types.

We have to distinguish two kinds of references to the entity "type": One reference merely discusses the recognition (and calling) conventions for a type. The other reference is concerned with creating a data item of a certain data type. In the latter case parameters must be supplied detailing the desired creation. E.g., in the case of a character string, its maximum anticipated length is of no concern for reference purposes (nor is the fact that this maximum length must be specified as an INTEGER), but a value must be assigned or implied for creation purposes. Further details of this distinction will be discussed in sections 5 and 6 of this report.

4.1 Basic types in CLEOPATRA/360.

   (4.1) basic_ref_type ::= INTEGER | LONG_INTEGER | BYTE | REAL | LONG_REAL | DECIMAL [ • size ] | BIT | CHARACTER | POINTER ( ref_type )

   (4.2) basic_type ::= INTEGER | LONG_INTEGER | BYTE | REAL | LONG_REAL | DECIMAL [ • size ] | BIT | CHARACTER [ ( expression ) ] | POINTER ( ref_type )

The expression details the maximum anticipated length for the data item of type CHARACTER. This expression, a creation time parameter, must be of type INTEGER. If it is omitted, NIL (i.e., 0) is assumed.

   (4.3) size ::= integer

The type DECIMAL is specific to the IBM System/360. The size must specify the length of the data item in bytes, and it must be in the range from 1 to 16. For reasons of efficiency, it must always be supplied as a constant, and 8 is assumed if it is omitted. Generally, size is not a creation time parameter; instead, it gives rise to 16 data types. For convenience, there are a certain number of implied conversions between these data types.

The following table gives basic types for CLEOPATRA/360, together with their capacity and representation in the host machine. Expressions in the table are written using CLEOPATRA's expression syntax (see section 6).

   type           representation                     capacity
   ------------   --------------------------------   -----------------------------------
   INTEGER        half word, binary                  - 2 ** 15 to (2 ** 15) - 1
   LONG_INTEGER   full word, binary                  - 2 ** 31 to (2 ** 31) - 1
   BYTE           byte, binary                       0 to (2 ** 8) - 1
   REAL           full word, floating point (six     approximately 10 ** -78 to 10 ** 75
                  significant hexadecimal digits)    in absolute value
   LONG_REAL      double word, floating point        same as REAL above
                  (14 sig. hex. digits)
   DECIMAL        packed decimal, size bytes         0 to (10 ** (2 * size - 1)) - 1
                                                     in absolute value

The above are the basic types which are included because of the host machine's hardware. Below we define the basic types included as a convenience. It should be noted that no scaling is built in for integer or decimal numbers.
If such internal scaling is desired, the necessary operators and types would have to be user defined.

   type        representation                  capacity
   ---------   -----------------------------   -----------------------------------
   BIT         bit                             S.0 (false), S.1 (true)
   CHARACTER   two INTEGER, followed by zero   up to (2 ** 15) - 1 character codes
               or more BYTE
   POINTER     see section 4.2                 see section 4.2

If a CHARACTER data item is created, the maximum anticipated length must be specified. It should be noted that BIT operations are extremely inefficient, due to the host machine's hardware. For alignment considerations, see sections 5.3.4, 6.3, and 6.4.

4.2 Pointers.

CLEOPATRA provides both automatically allocated variables (in the sense of PL/I), and data items with deferred creation, to be explicitly allocated in the heap (in the sense of ALGOL 68). However, dynamic storage management is one of the main tasks of the operating system underlying a given program. Since this operating system can be written in CLEOPATRA, for privileged users we must provide an interface between the dynamic and explicit storage manipulation provided by CLEOPATRA and the user-written operating system.

At a conceptual level below this memory management interface, only 'memory' exists, unformatted into the higher order types that CLEOPATRA commonly provides. Memory can be manipulated by its size alone; requests to obtain or free memory deal only with its size; higher order types mapped into this memory cannot be recognized below the interface.

Above the interface, memory should cease to be unformatted. For the sake of maintaining control, we cannot allow arbitrary manipulations on essentially formatted storage areas. At this point, therefore, we only deal with user-defined data items; if those exist in the heap, they will have to be manipulated using pointers, and these pointers must "know" what kind of object they are pointing to.
There is yet another distinction, which we might term the "loader problem". To a program charged with loading and linking another program, the other program is "data". We very definitely would like to avoid passing execution control into a data area. But this is precisely what we will have to do once the loader program completes its operation. In order to maintain as much control as possible while giving complete freedom to the loader program (written in CLEOPATRA) in choosing its representation for the program to be loaded, the conversion from "data" to "code" must be made at the level of the memory management interface: for privileged users we must allow a conversion of a pair (address, size) into the entity "entry point".

The user-supplied memory management interface will be discussed elsewhere, since it deals with aspects of the interaction between the implementation of CLEOPATRA and the design of an operating system in CLEOPATRA. In the present section we will only discuss the explicit creation, deletion, and manipulation of data items through POINTER variables as the common user would see them.

4.2.1 Objectives for deferred creation.

For data items whose creation is deferred while their hosting data block is elaborated, we would like to have the following rules of manipulation:

   (1) The user is to perform explicit creation by means of an ALLOCATE statement. During creation, the user is given a pointer value which makes the data item accessible. The user may create multiple copies of this pointer value.

   (2) There is no a priori limit on the life expectancy of an object. The user can perform implicit deletion by losing the last copy of the pointer value for his object, or explicit deletion by means of a RELEASE statement. In the latter case, all remaining copies of the pointer value for the object about to be deleted must be destroyed automatically by the routines managing pointer values.
   (3) Pointer values are maintained in data items of type POINTER and there is no restriction on the life of the POINTER items with regard to the life of the pointer values.

   (4) A reference to an object is to be made using the pointer value for the object; such a reference is to be fully equivalent to a reference to an automatically allocated data item, in particular,

      (4a) the reference must be fully typed at compile time, and,

      (4b) the reference may be passed as an actual parameter merely of the type of its object, with no indication of the pointer value nature of the reference.

   (5) A reference to an object is to be controlled in the sense that

      (5a) the pointer value participating in the reference must not have been destroyed (a condition monitored by the CLEOPATRA system), and,

      (5b) the reference must be to a data item of the same type as anticipated at compile time.

   (6) We do not wish to introduce uninvolved overhead (i.e., overhead introduced by the deferred creation feature at a point where this feature is not being used) and we wish to be able to reclaim all memory at the time it becomes available.

Unfortunately, not all the objectives can be accomplished simultaneously. An implementation of CLEOPATRA will have to choose which objectives are most desirable and it will have to adjust the language definition accordingly. A discussion of two possible implementations and their shortcomings is provided elsewhere.

Explicit allocation seems desirable in an operating system environment. Garbage collection, i.e., memory reclamation only once no more space for allocation is available, seems extremely undesirable, as does leakage of allocated but inaccessible memory. One might consider elimination of explicit release in favour of tying the life expectancy of deferred data items at creation time to the dynamic configuration nesting structure.
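To make the interplay of objectives (1), (2), (5a), and (5b) concrete, the following Python sketch routes every reference through a table of live objects, so that one release invalidates all copies of a pointer value at once. The table-based scheme, the names Heap, Pointer, and ReferenceCondition, and the use of strings for types are assumptions of this illustration; it is not one of the implementations the report refers to, and it does not model implicit deletion.

```python
class ReferenceCondition(Exception):
    """Stands in for a condition raised on an invalid reference."""


class Heap:
    def __init__(self):
        self._objects = {}  # handle -> (ref_type, value)
        self._next = 0      # handles are never reused in this sketch

    def allocate(self, ref_type, value):
        """Objective (1): explicit creation hands back a pointer value."""
        handle = self._next
        self._next += 1
        self._objects[handle] = (ref_type, value)
        return Pointer(self, ref_type, handle)

    def release(self, pointer):
        """Objective (2): removing the entry invalidates every copy."""
        self._objects.pop(pointer.handle, None)


class Pointer:
    def __init__(self, heap, ref_type, handle):
        self.heap, self.ref_type, self.handle = heap, ref_type, handle

    def copy(self):
        return Pointer(self.heap, self.ref_type, self.handle)

    def deref(self):
        entry = self.heap._objects.get(self.handle)
        if entry is None:                # objective (5a): object has vanished
            raise ReferenceCondition("pointer value was destroyed")
        stored_type, value = entry
        if stored_type != self.ref_type:  # objective (5b): type must match
            raise ReferenceCondition("type mismatch")
        return value


heap = Heap()
p = heap.allocate("INTEGER", 42)
q = p.copy()
before = q.deref()
heap.release(p)
```

After the release, a call of q.deref() raises ReferenceCondition even though q was never touched explicitly. The price is a table lookup on every reference, which is the kind of validation cost the objectives ask an implementation to weigh.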
Objectives (2) and (3) together imply that we have a list of all copies of each pointer value and that we monitor these lists during all (automatic, implied, or explicit) release operations.

Objectives (4a) and (5b) suggest that a data item of type POINTER must only accept pointer values for objects of a known type, or that the data type to contain a pointer value should be POINTER (ref_type).

Objective (5a), extremely desirable for maintaining system integrity, implies that any reference through a pointer value must be dynamically validated for existence of the pointer value. This does not seem to cause an undue overhead.

Parameter passing effectively creates a reference to the parameter; for a deferred data item we are creating another copy of the pointer value. Objective (4b) then implies that any passed parameter (deferred or not) must be able and must be expected to take the form of a reference with a pointer value involved. This we would consider an undue uninvolved overhead.

We summarize: We consider the points (1) through (6) all equally desirable and do not wish to exclude them individually for fear of preempting a decision to be shared by a CLEOPATRA implementor and an operating system designer. The report will therefore assume that all the objectives can be met; this philosophy is reflected in section 4.2.2.

E.4.2.1 Example.

We offer the parameter passing problem as follows (the coding is not completely admissible CLEOPATRA syntax):

   STRUCTURE main
      PROCEDURE sub(type BY ADDRESS)
   END
   GLOBAL DATA main
      DEFER type item
      POINTER (type) ptr
   END
   DATA sub
      type x
   END
   PROCEDURE main
      ALLOCATE item FOR ptr
      sub (ptr.item)
   END
   PROCEDURE sub (x)
      RELEASE ptr
      COMMENT main has created a copy of item and has passed it by address to sub into the formal parameter x. sub then can release this copy of item because here it globally knows the pointer ptr to it.
Subsequently sub could still reference it under x, with no explicit indication of the fact that a pointer value is involved. 4.2.2 Using a pointer. Data items of type POINTER (ref_type) , pointers for short, can be used like all other data items; see appendix B for available operations on pointers. Primarily, however, pointers will be used to reference (parts of) some object. The ALLOCATE statement is the sole source of new pointer values, When an object, an instance of a deferred data item, is created using the ALLOCATE statement (section 7.1.4), a pointer value for the object is created and assigned to a pointer named in the statement. The ref_type for this pointer must match the type of the deferred data item. The same type match is also enforced for all manipulations of pointers with one exception: POINTER NIL is a universal pointer value for no object, regardless of type. - 43 - Section 4: Types. A reference made with a pointer value takes the form of a structure reference (section 5,3): the first item in the reference indicates a pointer, further items then establish the reference to the object or the subparts of the object by referencing the respective parts of the deferred data item. It should be noted that a reference made through a pointer is a storable reference if and only if the pointer itself is a storable reference (section 6.2). The type of such a reference is the type of the referenced part of the deferred data item. The referencing algorithm will verify that an existing pointer value is supplied. If an attempt is made to access (parts of) an object which has vanished (i.e., whose pointer value was destroyed) , the REFERENCE condition will be raised. An object can vanish because the user has issued an explicit call to CLEOPATRA^ release mechanism; see section 7.1.4 on the RELEASE statement. The release mechanism then is said to have been explicitly invoked. An object can also vanish because the last copy of its pointer value has been lost. 
This can occur in one of two ways. First, an assignment is made changing the value of the pointer containing the last copy of the pointer value of the object. Such an assignment can also be the result of allocating a new object for this pointer. Alternatively, the last copy of the pointer value is lost through deallocation: the pointer may reside in vanishing automatic storage, or it may be part of an object under release. In either case, CLEOPATRA will imply a RELEASE operation; we implicitly invoke the release mechanism.

4.3 User-defined data types.

A programming language cannot anticipate all conceptual data types that a user might wish to employ in his implementation of an algorithm. Some data types that the programming language provides for have in turn been provided by the hosting machine's hardware; other built-in data types are recognized at language design time as having a high probability of being of general use; additionally the implementor may be able to provide these additional data types very efficiently by using techniques that are not available from within the programming language.

The set of built-in data types in most languages is usually extensible: the user generally is afforded the means to aggregate a number of more primitive data items to form a composite object of higher complexity. There are usually two kinds of aggregates available, arrays and structures. Arrays consist of many copies of members of the same data type, together with an accessing mechanism, the subscripts, to denote an individual member of the collection. Structures consist of many data items of different data type, referred to by a common highest level name, together with an accessing mechanism, the structure reference, to denote an individual member or a group of members of the structure.

Programming languages usually provide a certain number of distributive laws, rules for the application of an operator defined on scalar data items to an aggregate.
Examples are APL's distribution of addition and multiplication over an array, or PL/I's distribution of operators over a structure, or over elements of a structure BY NAME.

These extension mechanisms still do not provide for most conceptual data types. The user might wish to manipulate an aggregate as a stack, a deque, or a queue; he may wish to represent the properties of a turtle, a task, or a token in the input stream of a compiler. Conceptual data items consist of an actual ("underlying") representation which represents the new data type in terms of data types that have been previously constructed, together with manipulation algorithms, operators that realize conceptually more abstract operations by simulating them in the underlying representation.

CLEOPATRA provides a packaging possibility for the ingredients of a user-defined data type. A type_pack in its data blocks will specify the underlying representation of a new user-conceived data type, and in its structure blocks will list the operations available for the new data type. The nesting and scope rules of CLEOPATRA have been designed so as to allow user-controlled access to selected parts of the new data type. Some, but not necessarily all, operations nested into the type_pack are usable wherever elements of the new data type are introduced, and parts of the underlying representation can be made read-only accessible to the user of an instance of the new data type.

The feature has some shortcomings which are introduced in the interest of compilation and execution economy. Namely, we cannot provide for execution time access to the entity "data type". The CLEOPATRA compiler will accomplish a complete binding of all names to types. At execution time we only deal with correct calls on algorithms, independent of their semantic context as being implementations of an operation on some new data type or implementations of an algorithm using some new data types.
We therefore cannot provide for an extension of aggregation separate from an extension of primitive data types. I.e., we cannot let the user define and implement a "stack" as an accessing mechanism which can be made ignorant of the element types which it is to access. Our data type definition facilities require that a stack of task control blocks be defined (where we are able to defer the definition of the implementation of a task control block), or a stack of integers, etc. The same type_pack will not be prepared to handle both element types without recompilation.

4.3.1 Constructing a type_pack.

A type_pack is a configuration. It has to describe the underlying representation of a user-defined data type, and operations on this data type. Additionally it can describe algorithms which all the implementations of the operations may share. Only within the type_pack configuration tree is unrestricted access to the underlying representation of a user-defined data type possible.

As a minimum, a type_pack consists of a type_data_block and a global_structure_block (see section 5 for details concerning the construction of these blocks).

The type_data_block defines the underlying representation and initialization of a user-defined data type. It is invoked every time an instance of the new data type is created (i.e., allocated and initialized). For each instance of a new data type there will therefore exist one copy of the type_data_block (or at least one copy of all the automatically allocated data items in the block). The creation process can optionally be supplied with parameters, which may influence, e.g., the size of arrays which are employed in the description of the underlying representation. Creation therefore takes the form of a procedure call on the type_data_block. This procedure call may, of course, only appear in very special contexts.

The CLEOPATRA compiler will create various routines during the elaboration of a type_data_block.
For allocation purposes, the space requirements of an instance of the new data type (dependent on creation time actual parameters, of course) must be available to the operating system's "obtain memory" interface. Normal initialization of the data type must be provided through another routine. There must be routines to create a NIL value for the new data type, as well as routines to create LARGE and SMALL values (section 2.3). Some of these routines may coincide. Finally, like all other data blocks, the type_data_block will result in the creation of a list of access mechanisms for all its constituents. Some of these access mechanisms, for elements with the SHARE attribute (section 5.3.3), will be available not only throughout the type_pack configuration tree, but also in its predecessor tree.

The global_structure_block defines the operations that are implemented in the type_pack. All algorithms listed in the global_structure_block are nested into the type_pack, but their names will also be available throughout the predecessor tree. (Compilations within the predecessor tree but outside the type_pack tree therefore will need access to the global_structure_block of the type_pack so as to determine what algorithms are available there.)

In addition to a type_data_block and a global_structure_block, a type_pack may use a local data_block and a local structure_block. The local data_block also is created once for every instance of the data type. Its only function can be to make some of the formal parameters for the type_data_block local to the type_pack and not available outside the creation process. The local structure_block can be used to define algorithms (and further type_packs) which are available throughout the type_pack tree, but not outside of it.

For reasons of compilation efficiency we do not allow a nesting of referencability.
The global_structure_block of a type_pack may not list another type_pack; we can therefore not have a chain of algorithms implementing a new data type and parts of its own underlying representation available outside the type_pack. The rule also prevents a chain of SHARE attributes to provide access to a data item deeply nested into underlying representations.

Since a type_pack is nested into some predecessor configuration, it has access to all facilities available from the usual nesting of configurations. In this fashion a chain of referencability is possible. However, it will in general not be very good coding practice to provide it.

4.3.2 References to a type.

At the beginning of section 4 we discussed the different contexts in which reference to a type is made. Such a reference occurs when we describe the calling conventions for the creation process, when we actually create an instance of the data type, and when we describe a pointer to an instance of the data type.

The description of the calling conventions for a type_pack appears in section 5.1.1; a type_pack is first listed in the nesting local structure_block, and this entry describes its calling conventions.

Corresponding to productions (4.1) and (4.2) for built-in data types, we have the following productions for arbitrary data types:

(4.4) ref_type ::= { basic_ref_type | type_name | { ALIGNED | COMPACT } ( ref_type { , ref_type }• ) } [ { ALIGNED | COMPACT } integer EXTENTS ] [ ALIAS identifier ]

(4.5) type ::= { basic_type | type_name [ parameters ] | { ALIGNED | COMPACT } ( type { , type }• ) } [ array ] [ ALIAS identifier ]

The integer in production (4.4) is the number of dimensional extents (if any) which the described type possesses. It must be positive. Production (4.4) describes all factors which contribute to type recognition. This subject is discussed in detail in section 6.3. Production (4.5) describes all information pertinent to the creation of an instance of a data type. Parameters are discussed in section 5.2.2, array is discussed in section 5.3.5, and the meaning of ALIGNED and COMPACT is discussed in section 5.3.4.

The ALIAS phrases are intended to simplify the referencing process for types. Both ALIAS phrases define the listed identifier to be fully equivalent to the matching ref_type (not the actual creation type). The identifier assumes a data item scope, i.e., dependent on the context where it appears its scope will be the current configuration, the predecessor configuration tree, or the current configuration tree. These scopes result from an appearance of the ALIAS phrase in a local data_block, in a global_structure_block or in a shared object in a type_data_block, and in all other blocks, respectively. The ALIAS identifier may be used after it is defined, only.

5. Structure, routine, and data blocks.

5.1 Structure blocks.

Structure blocks describe the calling conventions for configurations. They also determine the nesting of configurations, and thus the scope of most names. There are two kinds of structure blocks, global structure blocks used to form type packs, and local structure blocks. We will first describe local structure blocks. Examples can be found in section 5.2.

5.1.1 Local structure blocks.

A local structure block lists all algorithms that are internal to the configuration to which the local structure block belongs. These algorithms are then referencable throughout the current configuration tree. Additionally, a local structure block lists all type packs that are directly nested into the current configuration. Such a reference makes all algorithms referenced in the global structure block of the type pack callable throughout the current configuration tree.
Normally, in CLEOPATRA everything must be defined before it can be used in a declaration, and before it can be referenced. Structure blocks provide the only exception to this rule. References made in a structure block are always forward, to configurations yet to be defined. During compilation, blocks belonging to these configurations will make an explicit or implied reference to the structure block in which they are referenced, in order to enable the CLEOPATRA compiler to generate the proper compilation environment; see section 3.4.

(5.1) structure_block ::= STRUCTURE configuration_name { ; link_item }• [ ; ] END configuration_name [ ; ]

(5.2) configuration_name ::= procedure_name | type_name | operator_link

The configuration_name will connect structure blocks, routine blocks, and data blocks of a configuration. It is usually the name of the algorithm or type which the configuration defines. If the configuration defines an operator or a conversion, a separate identifier, the operator_link, is used to establish the connection between parts of the configuration.

(5.3) link_item ::= TYPE type_name [ ALIAS identifier ] [ ref_type_list ] | global_link_item

(5.4) global_link_item ::= PROCEDURE procedure_name [ ALIAS identifier ] [ ref_type_list . ] RETURNS ref_type

(5.5) ::= operator_link : OPERATOR [ left_ref_types ] operator [ ALIAS identifier ] right_ref_types RETURNS ref_type

(5.6) ::= operator_link : CONVERSION TO ref_type FROM right_ref_types

The ALIAS phrases define identifiers which are fully equivalent to the operator or procedure_name to which they apply. They assume the same scope which their originals obtain. If the original name is redefined, the ALIAS is not automatically redefined.

The main purpose for a link_item is to define complete calling conventions for the configuration. (CLEOPATRA is strongly typed, see section 6.3.)
The ref_type_list for a TYPE link_item specifies the calling conventions for the generation time routines of a type. All other ref_type_lists specify routine_block calling conventions.

In addition to defining new algorithms, a structure block allows the reintroduction of algorithms which are system-supplied and whose names were redefined by the user in some preceding configuration. This is accomplished with the BUILT IN attribute:

(5.7) global_link_item ::= PROCEDURE procedure_name [ ALIAS identifier ] BUILT IN

(5.8) ::= OPERATOR [ ref_type [ BY ADDRESS : | . ] ] operator [ ALIAS identifier ] { [ . ] ref_type | : ref_type BY ADDRESS } BUILT IN

(5.9) ::= CONVERSION TO ref_type FROM { [ . ] ref_type | : ref_type BY ADDRESS } BUILT IN

The decision which procedures, etc., are considered BUILT IN for each particular application is left to the implementor. This decision is additionally influenced by a decision as to whether it should be possible to redefine basic types. Whether or not the redefinition of basic types is possible depends for example on the implementor's decision to reserve their names.

(5.10) ref_type_list ::= ( type_formal [ , type_formal ]• )

(5.11) type_formal ::= [ identifier ' ] ref_type [ BY ADDRESS ]

The identifier in the preceding production acts as a keyword. See sections 5.2.2 and 5.2.4 on parameter passing conventions. Recall that any ref_type may be replaced by an ALIAS name (see production (4.4)).

(5.12) left_ref_types ::= ref_type [ BY ADDRESS : [ ref_type_list ] | . ref_type_list ]

(5.13) right_ref_types ::= ref_type | : ref_type BY ADDRESS | ref_type_list { . ref_type | : ref_type BY ADDRESS }

Essentially the preceding two productions state how an operator must be separated from its principal operands. There are three cases:

1) If there is no parameter list possible, operator and operand are separate tokens, nothing else.

2) If there is a parameter list possible, it is separated from the principal operand by a period. This period is needed for the compiler to decide how to associate parameter lists with routine names.

3) The preceding rules apply if the routine has read-only access to the operands. If the operator modifies its operands, the operands have to be passed BY ADDRESS. In this case between operator and operand in 1) or replacing the period in 2) above must be a colon (:). This corresponds to ALGOL 60's assignment operator (:=).

The CLEOPATRA compiler will verify that the type lists as stated in the link_item are identical to the declarations made in the actual routine declaration. Additionally, the types of the arguments at a point of call must also match the type lists. There is no implied conversion as through the PL/I ENTRY-attribute.

5.1.2 Global structure blocks.

Recall from section 3.2 that a global structure block always belongs only to a type pack. It lists configurations which are callable throughout the predecessor configuration tree. In order to prevent the creation of a chain of thusly callable configurations, we do not allow a global structure block to declare a type pack to be nested into it.

(5.14) global_structure_block ::= GLOBAL STRUCTURE type_name { ; global_link_item }• [ ; ] END type_name [ ; ]

(5.15) type_name ::= identifier

5.2 Routine blocks.

Routine blocks describe the algorithms we wish to perform. Data items on which they operate have to be declared earlier in data blocks; routines which they call have to be declared earlier in some structure block. (These routines need not have been defined earlier, however.)

There are essentially two types of explicitly callable routines, procedures and operators. We will discuss each in turn. The examples in this section are completely encoded and assume the existence of some operators which are defined in appendix B.
Details of the code of the examples should become clear once sections 5.3 on data blocks, 6 on expressions, and 7 on statements are consulted.

5.2.1 Procedure definitions.

(5.16) routine_block ::= PROCEDURE procedure_name [ name_list . ] { ; statement }• [ ; ] END procedure_name [ ; ]

(5.17) procedure_name ::= identifier

Statements are described in section 7. Among the statements must be at least one RETURN statement to create a value for the routine. The type of each return value must match the ref_type specified in the RETURNS phrase of the routine declaration in the structure block. Control may never reach the END phrase of the routine block.

(5.18) name_list ::= ( formal [ , formal ]• )

(5.19) formal ::= identifier [ ' ]

If followed by a quote, the identifier is used as a keyword. Unless the phrase BY ADDRESS was coded in the link_item, the routine parameter may not be modified by the routine, i.e., it will not be considered a storable reference; see section 6.2.

All parameters to the routine must be declared in one of the data blocks of the configuration. These declarations are matched by the identifiers used in the name_lists. No identifier may be used for more than one formal. The scope of the parameter is determined by the declaration as usual. The CLEOPATRA compiler will verify that these declarations specify types that match the ref_types in the structure block entry for the routine.

The correspondence of the name_list to the ref_type_list is positional for all but keyword parameters. Keyword parameters are matched by the keyword, and may appear anywhere in either list. The scope of keywords begins with the structure block entry for the routine possessing the formal parameter, and is identical to the scope during which the routine can be called (this includes the scope where the routine can be called by its ALIAS name if any).
The scope of the keyworded parameter is determined by its data block entry as usual.

5.2.2 Procedure calls.

A procedure call is used as an operand in expressions; see sections 6 and 7. The ref_type of this operand is the ref_type specified in the RETURNS phrase in the structure block entry for the procedure. The CLEOPATRA compiler will verify that this ref_type also matches the type in any RETURN statement in the procedure.

(5.20) procedure_call ::= procedure_name [ parameters . ]

(5.21) parameters ::= [ ( actual [ , actual ]• ) ]

(5.22) actual ::= [ [ identifier ' ] expression ]

Expressions are discussed in section 6. Actual parameters are connected to formal parameters by position in the parameter list (it is possible to omit parameters in a call) or by matching keyword. However, only storable references are permitted as actual parameters for formal parameters specifying BY ADDRESS. For a discussion of storable references, see section 6.2.

All actual parameters in a routine call (with the exception of the principal operands of an operator, see section 5.2.4) may be omitted and default parameter creation takes place while the data blocks of the routine are expanded (see section 5.3). Positional parameters can be omitted from the right end of a parameter list by shortening the list. Positional parameters elsewhere can be omitted, but their separating commas cannot be omitted. Keyword parameters can be omitted together with their separating commas. If no parameters are specified, the entire parameter list is omitted. Observe that delimiting periods may not be omitted, even if the parameter list is omitted as a whole.

When a procedure call is processed, all actual parameters are evaluated starting with the rightmost actual. For any omitted parameter, a default initialization is used. This initialization is determined during initialization of the data blocks of the procedure.
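The binding rules of this section can be summarized in a short sketch. The Python below is an illustration under stated assumptions, not the CLEOPATRA calling sequence: the names bind and OMITTED are hypothetical, formal parameters are modelled as (name, is_keyword, init) triples in data block order, and every expression is modelled as a zero-argument function so that the order of evaluation is observable. Positional actuals are matched in order against the non-keyword formals, keyword actuals may appear anywhere, specified actuals are evaluated starting with the rightmost, and defaults are created afterwards only for formals that received no actual.

```python
OMITTED = object()  # marks an empty position, as in "example (, POINTER NIL)"


def bind(formals, actuals):
    """Connect actual parameters to formal parameters.

    formals: list of (name, is_keyword, init); init is a zero-argument
             function standing in for the INIT phrase of the formal.
    actuals: list, left to right as written in the call, of OMITTED,
             a zero-argument function (positional actual), or a
             (keyword, function) pair (keyword actual).
    """
    positional = [name for name, is_keyword, _ in formals if not is_keyword]
    keywords = {name for name, is_keyword, _ in formals if is_keyword}

    matched = []   # (formal name, expression) in the order written
    position = 0
    for actual in actuals:
        if isinstance(actual, tuple):
            keyword, expression = actual
            if keyword not in keywords:
                raise TypeError(f"unknown keyword {keyword!r}")
            matched.append((keyword, expression))
        else:
            if position >= len(positional):
                raise TypeError("too many positional actuals")
            if actual is not OMITTED:
                matched.append((positional[position], actual))
            position += 1

    bound = {}
    for name, expression in reversed(matched):   # rightmost actual first
        bound[name] = expression()
    for name, _, init in formals:                # then defaults, in block order
        if name not in bound:
            bound[name] = init()                 # runs only for omitted actuals
    return bound
```

Because an INIT phrase is modelled as a function that is called only when its formal was omitted, any side effect of a default initialization occurs only in calls that leave that parameter out.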
Please observe that only those INIT phrases in these data blocks are executed which belong to formal parameters for which actual parameters were not passed in the call. Initialization follows the evaluation of all specified actuals, in the usual order of expanding the data blocks.

E.5.2.2 Examples.

We illustrate the various possibilities of omitted and keyworded parameters. We present a structure block entry for a procedure, relevant data definitions, the beginning of the routine block, and various calls.

structure block entry:

    PROCEDURE example (REAL, integer' INTEGER, POINTER (CHARACTER)) .
      RETURNS ALIGNED (BYTE, INTEGER) ALIGNED 2 EXTENTS ;

This defines a procedure called 'example' with a REAL and a POINTER (to CHARACTER) as positional parameters, and with an INTEGER parameter responding to the keyword 'integer'. 'example' will return a two-dimensional array of structures consisting of a BYTE and an INTEGER each.

data definitions:

    DATA example ;
      REAL real ;
      INTEGER integer ;
      POINTER (CHARACTER) ptr ;
      ALIGNED (BYTE b, INTEGER i) ALIGNED LEFT(10, 3:4) return

routine block:

    PROCEDURE example (real, integer', ptr). ;

examples of calls:

    example .                                [1]
    example (integer' 10, 2.0) .             [2]
    example (, POINTER NIL, integer' 2) .    [3]
    example (1.0) . (6, 3)                   [4]

Example 1 shows a call with no actual parameters. Example 2 shows the omission of the last actual positional parameter; observe that additionally the keyword parameter was moved to the extreme left. Example 3 shows the omission of an actual positional parameter from a position other than the last. Example 4 shows the omission of a keyword parameter. It also illustrates how an element of the returned array value would be accessed. Comparing example 4 and example 1, it should become clear why a period is necessary to delimit the actual parameter list for a procedure.

5.2.3 Operator definitions.
(5.23) routine_block ::= operator_link : OPERATOR [ left_names ] operator right_names { ; statement }• [ ; ] END operator_link [ ; ]

(5.24) operator_link ::= identifier

(5.25) left_names ::= ref_type identifier [ { : | . } name_list | : ]

(5.26) right_names ::= [ : ] ref_type identifier | name_list { : | . } ref_type identifier

Compare section 5.2.2 for the remarks concerning the matching of name and ref_type lists. With respect to the delimiters period and colon, productions (5.25) and (5.26) follow the same rules as did productions (5.12) and (5.13); compare the remarks in section 5.1.1.

The ref_types of the principal operands must be specified in the operator definition since they serve as part of the operator name. Only if the structure block entry specified BY ADDRESS for an operand can the operator routine modify this operand, i.e., will it be considered a storable reference; see section 6.2.

5.2.4 Operator applications.

Operators are used in expressions. They may be unary (if left_names and left_ref_types were omitted in the definition of the operator) or binary. It is illegal to omit a principal argument of an operator, unlike omitting positional or keyworded actual parameters. The ref_type of the result of the operator application is determined from the RETURNS phrase in the structure block entry for the operator. The CLEOPATRA compiler will verify that this ref_type matches the type of any RETURN statement in the definition of the operator.

(5.27) operator_call ::= [ expression [ { : | . } [ parameters ] ] ] operator [ [ parameters ] { : | . } ] expression

Operators within an expression are recognized by the triple [ref_type] operator ref_type (where the ref_type on the left is not considered for unary operators). This is the reason why it is not possible to use binary operators as unary operators.
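Recognition by the triple [ref_type] operator ref_type can be pictured as a table lookup. The Python sketch below is an assumption made for illustration, not the compiler's actual method: the class OperatorTable and its methods define and apply are hypothetical, operands are modelled as (type name, value) pairs, and the lookup happens at run time although CLEOPATRA binds operators at compile time. The two entries loosely mirror a unary -> and a binary / on a 'point' type; the coordinates used are likewise chosen only for the illustration.

```python
import math


class OperatorTable:
    """Maps (left type, operator symbol, right type) to a routine.

    The left type is None for a unary operator.
    """

    def __init__(self):
        self._routines = {}

    def define(self, left_type, symbol, right_type, returns, routine):
        key = (left_type, symbol, right_type)
        if key in self._routines:
            raise ValueError(f"operator already defined for {key}")
        self._routines[key] = (returns, routine)

    def apply(self, symbol, right, left=None):
        # The principal operand types are part of the operator's name.
        left_type = None if left is None else left[0]
        key = (left_type, symbol, right[0])
        if key not in self._routines:
            raise TypeError(f"no operator matches {key}")
        returns, routine = self._routines[key]
        if left is None:
            return (returns, routine(right[1]))
        return (returns, routine(left[1], right[1]))


table = OperatorTable()
# Unary "->": distance of a point from the origin.
table.define(None, "->", "point", "REAL", lambda p: math.hypot(p[0], p[1]))
# Binary "/": slope of the line through two points.
table.define("point", "/", "point", "REAL",
             lambda p, q: (p[1] - q[1]) / (p[0] - q[0]))

origin = ("point", (0.0, 0.0))
p_3_4 = ("point", (3.0, 4.0))
distance = table.apply("->", p_3_4)
slope = table.apply("/", p_3_4, left=origin)
```

Applying "/" with no left operand finds no entry under (None, "/", "point") and is rejected, which is the table-lookup view of why a binary operator cannot double as a unary one unless a separate unary definition exists.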
The same delimiters (period, colon, or none at all) must be specified in the call which were specified in the definition heading, irrespective of whether parameters are then specified or not.

Expressions are discussed in section 6. Actual parameters are connected to formal parameters by position or by keyword. However, only storable references may be substituted for formal parameters with the BY ADDRESS phrase. For a discussion of storable references, see section 6.2.

When an operator application is processed, all actual parameters are evaluated starting from the right. Consequently, the right principal argument of an operator is the first actual to be evaluated. Default initialization applies for omitted (non-principal) parameters, as described in section 5.2.2.

E.5.2.4 Examples.

Appendix B contains a list of OPERATORS which we would like to supply with an implementation of CLEOPATRA/360. Here we present some examples of the structure block entry, the routine_block, and the operator_call for unary and binary operators.

    STRUCTURE main ;
      distance: OPERATOR -> (system' CHARACTER, CHARACTER).
                ALIGNED (REAL, REAL) ALIAS point RETURNS REAL ;
      slope: OPERATOR point / point RETURNS REAL
    END main

The first entry defines a unary operator -> with two optional CHARACTER parameters, with a 'point' type argument, returning a REAL. It also defines 'point' to be a structure consisting of two REALs. The second entry defines a binary operator / on two 'point' type arguments, also returning a REAL.

    DATA distance ;
      CHARACTER (10) system INIT C.cartesian , units ;
      ALIGNED (REAL x,y) P ;
      REAL d
    END distance

    distance: OPERATOR -> (system', units) . point P ;
      DECISION
        cartesian: system .==. C.cartesian ;
        polar:     system .==. C.polar ;
        meters:    units .==. C.meters ;
        inches:    units .==. C.inches
      ACTION
        cartesian: d = ROOT(2). (P.x ** 2) + (P.y ** 2) ;
        polar:     d = P.x ;
        meters:    d = d / 39.37008 ;
        inches:    d = d * 39.37008 ;
        cartesian ¬| polar: d = REAL LARGE ! illegal parameter
      END
      RETURN d
        ! the distance from the origin to the point,
        ! directed if polar coordinates [r,t], and
        ! converted to meters or inches if desired.
    END distance

    DATA slope ;
      ALIGNED(REAL x,y) P,Q
    END slope

    slope: OPERATOR point P / point Q ;
      DECISION
        equal_x: P.x == Q.x ;
        equal_y: P.y == Q.y
      ACTION
        equal_x & equal_y: RETURN - REAL LARGE ; ! no line
        equal_x: RETURN REAL LARGE ;             ! vertical
        equal_y: RETURN 0.0 ;                    ! horizontal
        ELSE RETURN (P.y - Q.y) / P.x - Q.x
      END
    END slope

    DATA main ;
      ALIGNED(REAL x,y) origin ;
      ALIGNED(REAL x INIT 3.0, y INIT 4.0) P_3_4 ;
      REAL real
    END main

    PROCEDURE main ;
      real := -> . origin ;
        ! result 0.0, cartesian coordinates and no conversion
      real := -> (system' C.polar). P_3_4 ;
        ! result 3.0, polar coordinates and no conversion
      real := -> (C.inches). P_3_4 ;
        ! result 39.37008 * 5.0 (converted to inches)
      real := origin / P_3_4 ;
        ! result 4.0/3.0
    END main

5.2.5 Conversions.

Conversions are a shorthand notation for a certain kind of unary operator. The only difference between a conversion definition and the same definition written as a unary operator lies in the heading.

(5.28) routine_block ::= operator_link : CONVERSION TO ref_type FROM right_names { ; statement }• [ ; ] END operator_link [ ; ]

In this production, ref_type must be an identifier; it must have an ALIAS name if it is compounded. As is discussed in section 4.3.2, any ref_type may have an ALIAS name. This alternate name becomes (in proper context) also the ALIAS name for the CONVERSION, and it will usually be the name under which the CONVERSION is invoked.

The production (5.28) is exactly equivalent to (5.23), where ref_type takes the place of operator. The calling sequence for a conversion then corresponds to the calling sequence (5.27) for a unary operator (5.23).
E.5.2.5 Examples.

Appendix B contains a list of CONVERSIONS which we would like to supply with an implementation of CLEOPATRA/360. Here we present some examples of the structure block entry, the routine_block, and the operator_call for a CONVERSION.

    STRUCTURE main ;
      real_part: CONVERSION TO REAL FROM ALIGNED (REAL, REAL) ALIAS complex ;
      extend: CONVERSION TO complex FROM REAL ;
    END main

The first conversion operates on a 'complex' typed argument and returns a REAL, the second conversion performs the opposite operation. 'complex' is a structure consisting of two REALs; it is defined in the first entry.

    DATA real_part ;
      ALIGNED (REAL x,y) z
    END real_part

    real_part: CONVERSION TO REAL FROM complex z ;
      RETURN z.x
    END real_part

    DATA extend ;
      REAL x ;
      ALIGNED (REAL x,y) z
    END extend

    extend: CONVERSION TO complex FROM REAL x ;
      z.x := x ;
        ! z.y is by default initialized to REAL NIL
      RETURN z
    END extend

    DATA main ;
      REAL x INIT 1.0, y ;
      ALIGNED(REAL x,y) z
    END main

    PROCEDURE main ;
      y := REAL complex x
    END main

5.3 Data blocks.

Data blocks describe the data items on which a configuration may operate. All data items undergo initialization; usually the data item description will assign an initial value to a data item. A data item can be declared CONSTANT, so that it cannot change its value once it is initialized.

CLEOPATRA provides the usual scope rules of ALGOL 60, and in addition facilities to create names for data items which have a larger or a smaller scope than the usual one.

Data items can be aggregated into structures similar to PL/I structures, and into arrays. The user may control alignment of data items and array elements, and he may also control the allocation of arrays into contiguous memory in row- or column-major order.

Our discussion of data blocks will proceed top-down.
The reader may initially assume that the following production for a data_group is valid:

data_group ::= INTEGER identifier INIT expression

This example would construct a data item of type INTEGER (a basic type) with the stated identifier as a name; the data item would be initialized to the value of the indicated expression, where the type of the expression must be allowed as right argument type and INTEGER must be allowed as left argument type for some built-in or user defined INIT operator.

5.3.1 Local data blocks.

(5.29) data_block ::= DATA configuration_name { ; [ CONSTANT | DEFER ] data_group }• [ ; ] END configuration_name [ ; ]

Data items declared in a local data_block can only be referenced in the configuration to which the data_block belongs, not in any successor configuration.

A data block is elaborated left to right through all its data groups. Since all data items must be declared before they can be used, this rule prohibits forward references in the initializing expressions for each data item. The initializing expressions for each data item are, of course, evaluated right to left within each such expression (see section 6 for a discussion of expressions).

If the CONSTANT option is selected for a data_group, its members may be initialized, but they cannot be used as storable references; effectively this means that their initial values cannot change during the time that their hosting configuration is active. (See section 6.2 on storable references.)

If the DEFER option is not specified for a data_group, reference to the data items specified is made by using their names. For the sample definition of a data_group above this means that the specified identifier is to be used; for the general case consult the sections on data groups 5.3.4, and on arrays 5.3.5.
If the DEFER option is not specified, the data_group will be allocated in automatic storage in the sense of PL/I, i.e., it will be created and initialized immediately before control reaches the configuration, and it will be destroyed as soon as control is returned from the configuration.

If the DEFER option is specified for a data_group, it is the user's responsibility to allocate (i.e., create and initialize) the data_group by executing an ALLOCATE statement; see section 7.1.4 for a discussion of the ALLOCATE statement. Such data items do not reside in automatic storage, and need suitable pointer values for their manipulation.

If the DEFER option is specified for a data_group, reference to the data items specified is made by using a pointer and the name of the data item. For the sample definition of a data_group above this means that the reference will take the form 'pointer.identifier', with the 'identifier' as it appeared in the data_group, and with the name of a suitable 'pointer' substituted. For the general case consult the sections on data groups 5.3.4, or on arrays 5.3.5; also see section 4.2 on pointers.

Formal parameters for a configuration must also be declared in a data block. The scope of their names is thus determined. Formal parameters may not be subject to the CONSTANT or DEFER options. Creation of formal parameters is performed if and only if no corresponding actual parameters were supplied in the calling sequence for the configuration. This means that potential side effects in the evaluation of an initializing expression or of the creation-time routines for a formal parameter only take place if no actual parameter was supplied.

5.3.2 Global data blocks.
(5.30) global_data_block ::= GLOBAL DATA configuration_name { ; [ CONSTANT | DEFER ] data_group }• [ ; ] END configuration_name [ ; ]

Data items declared in a global_data_block can be referenced in the configuration to which the global_data_block belongs, and in all successor configurations.

A global_data_block is elaborated left to right through all its data_groups. Whether a data_block or a global_data_block for the same configuration is elaborated first depends on the order in which they are presented to the CLEOPATRA compiler. All data items must be declared before they are used in expressions; no forward references are permitted in initializing expressions. These initializing expressions for each data item are, of course, evaluated right to left within each such expression (see section 6 for a discussion of expressions).

See section 5.3.1 for a discussion of the CONSTANT and DEFER options, of the referencing conventions for data items, and of formal parameters.

5.3.3 Type data blocks.

The underlying representation of a user defined type is declared in the data blocks which belong to the type pack. These data blocks are created once for the creation of every instance of a user-defined data type, and they exist as long as the instance of the data type exists.

The local data_block of a type pack can be used to declare data items that are used only in the data blocks of the type pack; according to the definition of a local data_block, these data items cannot be referenced in any successor configuration. It is therefore necessary that the local data_block of a type pack be elaborated (i.e., presented to the CLEOPATRA compiler) before the global data block for this type pack.

A type pack employs a special version of a global data block: the type_data_block.
This type_data_block allows for parameters to be passed to the creation process (i.e., the elaboration of the data block), and it allows extending the scope of declared data items in a read only fashion.

(5.31) type_data_block ::= GLOBAL DATA type_name [ name_list ] { ; [ CONSTANT | DEFER | SHARE [ CONSTANT ] ] data_group }• [ ; ] END type_name [ ; ]

Data items declared in a type_data_block constitute the underlying representation of an instance of the user-defined data type. They can be referenced in the type_pack, and in all successor configurations, wherever an instance of the data type is available. Additionally, data items for which the SHARE option was selected may be referenced in a read only fashion (see section 6.2) in the predecessor configuration tree, wherever an instance of the data type is available.

A type_data_block is elaborated left to right. No forward references among data items are permitted in the initializing expressions. These expressions are each evaluated right to left (see section 6 on expressions).

In the type pack, a data item declared in one of the data blocks of the type pack is referenced by its name (possibly employing a pointer) as described in section 5.3.1.

In configurations nested into the type pack, a data item declared in the type_data_block is referenced not only by its name (possibly employing a pointer), but through the name of a data item of the type which the type pack defines. A reference takes the form 'name.item', where 'item' references the data item in the type_data_block, and 'name' references a data item for which this type_data_block has been allocated.

The CONSTANT option was discussed in section 5.3.1.

Data items in the underlying representation of a data type will only be allocated in automatic storage if the DEFER option has not been selected for either the data item or the instance of the data type about to be allocated.
In all other cases, the DEFER allocation mechanism applies, and responsibility for allocation rests with that configuration which specified the DEFER option.

Data items for which the SHARE option has been selected may be referenced in the applicable manner as outlined above throughout their entire scope. Such a reference is read only unless the reference would also be meaningful had the SHARE option not been selected. In the latter case, the reference is storable. See section 6.2 on storable references.

Formal parameters for the creation process must also be declared in a data block. The scope of their names is thus determined. Formal parameters may not be subject to the CONSTANT, DEFER, or SHARE options. Creation of formal parameters is performed if and only if no corresponding actual parameters were supplied in the calling sequence for the configuration.

E.5.3.3 Example.

GLOBAL DATA stack ;
    INTEGER a ;
    DEFER INTEGER b ;
    POINTER (INTEGER) c
END stack

GLOBAL STRUCTURE stack ;
    add: OPERATOR stack + stack RETURNS INTEGER
END stack

add: OPERATOR stack x + stack y ;

In this framework, references would take the following form:

    x.a and y.a
    x.c.b and y.c.b

Within each group, each reference is for the same field in the underlying representation of the two distinct items x and y.

5.3.4 Data groups.

Data groups provide for the creation of PL/I-like structures, for control of the alignment of the data items contained in the group, and for creating arrays of like data items. The simple form of a data group merely describes one instance of a data type.

(5.32) data_group ::= { basic_type | type_name [ parameters ] } [ array ] item [ , item ]•

The list of basic types can be found in section 4.1.
If no generation time parameters are given for a type expecting them, the usual action for omitted actual parameters will be taken by the creation process, namely the initialization specified in the underlying representation is applied for each omitted actual. Such a generation time parameter, when specified for a formal parameter of the present configuration, is only used and evaluated for its side effects if the corresponding actual parameter was not provided.

(5.33) data_group ::= { ALIGNED | COMPACT } ( data_group [ , data_group ]• ) [ [ array ] item [ , item ]• ]

This version of a data_group provides for data item alignment control, and for the creation of structures, i.e., of aggregates consisting of a number of data items of various types.

All data items will require a certain amount of memory for their allocation. In the case of the IBM/360 family of computers, this amount of memory is measured in bytes of 8 bits each. Unless specified otherwise, a data group will be allocated starting on a dividing boundary. Dividing boundaries are defined as follows:

A byte address in central memory is a dividing boundary for a data item provided it is divisible by the largest number of the set { 1, 2, 4, 8 } which also divides the size (in bytes) of the data item. For the purpose of this definition, the sizes of data items extending into a byte are to include that byte.

The allocation on dividing boundaries can be overruled by the user. The members of any data_group selecting the COMPACT option are packed as tightly as possible, with respect to bits, across any boundaries. The data_group as a whole, however, is allocated according to the nesting group (if any). Similarly, the effect of a nesting COMPACT option can be overruled by selecting the ALIGNED option for the nested group. The nested group as a whole and all members of the group will then be allocated on dividing boundaries.
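The definition of a dividing boundary given above can be restated as a small computation. The following is a minimal sketch in Python (not CLEOPATRA); the function names are hypothetical and chosen only for this illustration, and sizes are given in bits so that items extending into a byte can be rounded up as the definition requires.

```python
def dividing_alignment(size_bits):
    """Largest member of {1, 2, 4, 8} that divides the size in bytes.

    A data item extending into a byte is charged for that whole byte,
    so the size in bits is rounded up to full bytes first.
    """
    if size_bits <= 0:
        raise ValueError("a data item must occupy at least one bit")
    size_bytes = -(-size_bits // 8)  # ceiling division
    return max(k for k in (1, 2, 4, 8) if size_bytes % k == 0)


def is_dividing_boundary(address, size_bits):
    """True if the byte address is a dividing boundary for the item."""
    return address % dividing_alignment(size_bits) == 0


def next_dividing_boundary(address, size_bits):
    """Smallest dividing boundary at or above the given byte address."""
    step = dividing_alignment(size_bits)
    return -(-address // step) * step


print(dividing_alignment(64))         # 8-byte item
print(dividing_alignment(24))         # 3-byte item
print(next_dividing_boundary(5, 32))  # 4-byte item placed at or after byte 5
```

Under this reading, a three byte item may start on any byte, whereas a four byte item must start on an address divisible by four.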
It should be fairly clear to any user of the COMPACT option that its use will result in a significant decrease of performance. The option is included primarily for the purpose of creating certain hardware-required data structures.

Consult section 5.3.5 for a discussion of the rules governing the alignment of array elements.

Note that as a consequence of the syntactic positioning of the DEFER and ALIGNED options, the ALLOCATE statement will always allocate its targets on a dividing boundary. Also note that because of this definition data items which are RETURNed by a routine must always be ALIGNED as a whole. Only the members of the returned group (if any) may be unaligned.

In addition to providing for control over the alignment of a group of data items, data_groups provide for the construction of PL/I-like data structures, the aggregation of a number of different data items. This is accomplished by specifying one or more items following the description of the members of the data_group. Reference to the data_group as a whole is then made by using the name of the item ('identifier', or 'pointer.identifier', or 'typed item name.group name'). Reference to a data item from within the group is made by appending the name of the data item in question to the right of the reference to the group.

Unlike PL/I, CLEOPATRA does not allow reference to a field of a structure to be made by specifying just a unique path to the field. This restriction is imposed in the interest of clarity and of simplification of the lexical analysis module of the compiler. Also, array subscripts must be adjoined to the appropriate identifier; they may not be merged towards the extreme left or right of the reference. This restriction is imposed in the interest of a clean specification of CLEOPATRA's extremely flexible array manipulation facilities. (See section 6.5 on arrays.)
Formal parameters for a configuration may be groups, but they may not be members of a group. This restriction acknowledges that the caller of a configuration controls the allocation of actual parameters which he passes into formal parameters. Secondly, this restriction enables CLEOPATRA to assume that formal parameters reside on dividing boundaries, i.e., that the ALIGNED option was selected or implied for them. This will result in the compilation of code which can be expected to be the most efficient; however, this code would not necessarily always be correct. Consequently the second restriction is imposed: actual parameters must reside on dividing boundaries, or the COPY pseudo operator must be implied. See section 6.4 for a discussion of COPY.

E.5.3.4 Example.

As an example, we provide the following data structure for the program status word (PSW) of the IBM/360 family of machines:

ALIGNED (
    COMPACT (
        BIT VECTOR(0:7) System_Mask,
        BIT VECTOR(8:11) Key,
        COMPACT (
            BIT USACII_mode,
            BIT Machine_check_mask,
            BIT Wait_state,
            BIT Problem_state
        ) AMWP,
        SHORT_INTEGER Interruption_Code,
        BIT VECTOR(32:33) Instruction_Length_Code,
        BIT VECTOR(34:35) Condition_Code,
        COMPACT (
            BIT Fixed_overflow,
            BIT Decimal_overflow,
            BIT Exponent_underflow,
            BIT Significance
        ) Program_Mask,
        BYTE VECTOR(2:4) Instruction_Address
    ) PSW
) ; ! to ensure doubleword alignment.

We can now use the following names to access relevant fields and aggregates:

    PSW
        to access the entire data structure.
    PSW.Instruction_Address
        to access the three byte array.
    PSW.AMWP
        to access the structure of four bits.
    PSW.System_Mask(6)
        to access the selector channel 6 mask.
    PSW.Program_Mask.Exponent_underflow
        to access the exponent underflow mask.

5.3.5 Array descriptions.

(5.34) array ::= [ ALIGNED | COMPACT ] { LEFT | RIGHT | VECTOR } ( bound [ , bound ]• )

(5.35) bound ::= [ { * | expression } : ] { * | expression }

An array description can be selected in order to create more than one instance of a new data item, or in order to change or limit the subscripts for a formal parameter. We distinguish six ways to specify a bound:

1) expression
In this case a default of '1' is supplied as first part of the bound, and execution will proceed as described under 3).

2) *
In this case a default of '1' is supplied as first part of the bound, and execution will proceed as described under 4).

3) expression : expression
The smaller value becomes the lower bound, the larger value becomes the upper bound. The dimensional extent is the absolute value of the difference of the two expressions plus 1. Should the description apply to a formal parameter, the corresponding actual parameter is checked for an exceeding of the dimensional extent. If the extent has not been exceeded, the lower bound and extent of the formal description become valid throughout the new configuration. Should the extent have been exceeded, a SUBSCRIPT_RANGE exception results.

4) expression : *
This form is only acceptable for formal parameters. The value of the expression becomes the lower bound, and the extent of the actual parameter determines the upper bound. Should no actual parameter be available, the extent is set to '1'.

5) * : expression
This form is only acceptable for formal parameters. The value of the expression becomes the upper bound, and the extent of the actual parameter (or '1', if there is no actual parameter) determines the lower bound.

6) * : *
This form is only acceptable for formal parameters. The bounds of the corresponding actual parameter will be used. If there is no actual parameter, 1:1 will be supplied.

In general, all the expressions describing the bounds on subscripts must be of type INTEGER.
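The six ways of specifying a bound can be summarized as one resolution routine. The following is a minimal sketch in Python (not CLEOPATRA) under stated assumptions: the names resolve_bound, STAR, and SubscriptRange are hypothetical, the bound expressions are taken as already evaluated integers, and "exceeding the extent" is read as the actual parameter having more elements along the dimension than the formal description allows.

```python
STAR = "*"


class SubscriptRange(Exception):
    """Stand-in for the SUBSCRIPT_RANGE exception."""


def resolve_bound(parts, actual=None, formal=False):
    """Return (lower, upper) for one dimension.

    parts  -- a 1-tuple (forms 1 and 2) or a 2-tuple (forms 3 to 6),
              holding integers or STAR
    actual -- (lower, upper) of the actual parameter, or None if absent
    formal -- True if the description belongs to a formal parameter
    """
    if len(parts) == 1:
        parts = (1, parts[0])  # forms 1 and 2: default first part of 1
    first, second = parts
    extent = None if actual is None else actual[1] - actual[0] + 1

    if first != STAR and second != STAR:  # form 3
        lower, upper = min(first, second), max(first, second)
        if formal and extent is not None and extent > upper - lower + 1:
            raise SubscriptRange("actual parameter exceeds the extent")
        return lower, upper

    if not formal:
        raise ValueError("'*' is only acceptable for formal parameters")
    if extent is None:
        extent = 1  # no actual parameter supplied

    if first != STAR:  # form 4: expression : *
        return first, first + extent - 1
    if second != STAR:  # form 5: * : expression
        return second - extent + 1, second
    return actual if actual is not None else (1, 1)  # form 6: * : *


print(resolve_bound((5,)))                                   # form 1
print(resolve_bound((0, STAR), actual=(1, 4), formal=True))  # form 4
```

With an actual parameter of bounds 1:4, the formal description '0 : *' thus yields the bounds 0:3, and '* : 10' yields 7:10.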
The evaluation of the expressions proceeds right to left through the sequence of expressions, and right to left in the usual order for each expression (see section 6 on expressions). Expressions for arrays with the DEFER option will be evaluated at allocation time. It should also be noted that the application of the built-in functions for array bounds (see section 6.5.3) returns the new bounds for a formal parameter. These built-in functions cannot be used to inquire as to the previous bounds of some parameter.

CLEOPATRA affords the user control over the order in which elements of a multidimensional array are stored in contiguous memory. Arrays with one dimension are stored in order of increasing subscripts into contiguous memory of increasing address. The VECTOR keyword can only be selected for an array with one dimension.

Arrays with several dimensions and with the LEFT storage scheme are allocated into contiguous memory in order of increasing addresses so that the leftmost subscript values cycle increasingly the most rapidly. Arrays under the RIGHT storage scheme are allocated so that the rightmost subscript cycles most rapidly as the memory addresses increase. For two-dimensional arrays, the LEFT storage scheme will leave each column of the new array in contiguous memory (column-major order), and the RIGHT (row-major) storage scheme will preserve row contiguity. Formal parameters under the LEFT storage scheme allow actual parameters under the RIGHT storage scheme, and vice versa.

The user may also exercise control over the alignment of array elements. Normally, array elements are aligned to dividing boundaries, or as compact as possible, depending on the nesting data_group. If the ALIGNED option is selected for the array, all elements of the array are aligned to a dividing boundary, irrespective of the nesting data_group.
If the COMPACT option is selected for the array, the second and all further elements of the array are allocated as tightly as possible with respect to bits. The first element and thus the entire array is allocated as the nesting data_group dictates. It is not legitimate to pass non-aligned actual parameters into aligned formal parameters, and vice versa. The COPY pseudo operator must be employed (see section 6.4).

5.3.6 Initialization.

(5.36) item ::= identifier [ INIT expression ]

The scope of the identifier thus declared, and its referencing, has been described earlier in this section.

If there is a list of items specified, only the first (leftmost) item is created; all further items are copies of the first. Copying for the basic types described in section 4.1 needs no explanation except that when a POINTER is copied, only the pointer value is transferred; we do not create a new object.

Any creation of a data_group under production (5.32) will result in the data item being given the value NIL. Following creation, the user may execute through the INIT phrase the equivalent of an arithmetic statement essentially of the form of the right hand side of production (5.36). This is to say that there must exist an operator INIT which allows the indicated expression as its right hand argument, and which allows the type of the identifier for its left hand argument. The left hand argument is considered storable and may be received BY ADDRESS. If the right hand argument is to be received BY ADDRESS, it must be a storable reference (see section 6.2), and a delimiting colon must be written. The INIT phrase may not specify parameters for the INIT operator.

INIT phrases in a data block are executed from the left; the initializing expressions are each executed right to left, as usual.
Exceptions to this rule are, first, INIT phrases for formal parameters for which actual parameters have been supplied: those INIT phrases are not executed. Second, INIT phrases for data items with the DEFER option are only executed at allocation time following the execution of the FOR phrase in the ALLOCATE statement, and only if the ALLOCATE statement itself does not supply an INIT phrase. See section 7.1.4 for a discussion of the ALLOCATE statement.

6. Expressions.

In this section we discuss expressions, the basic construct within a CLEOPATRA program. As has been indicated previously, expressions in CLEOPATRA are executed strictly from right to left; no precedence rules are observed. There is a fairly powerful array mechanism available in CLEOPATRA: the user may extract sections of an array which will subsequently be treated as an array without being in contiguous memory. CLEOPATRA also provides a means within an expression to construct an array from a set of scalar elements. Finally, we will describe the COPY option which allows a user to substitute a copy of a data item for an actual parameter to be given to a subroutine. This copy can be differently aligned than the original.

6.1 Forming an expression

An expression is a sequence of data references, the operands, interspersed with operator names and possibly parameter lists for the operators. Alternatively, procedure calls may serve as operands.

(6.1) expression ::= constant | system_supplied_value | system_supplied_constant | reference | procedure_call | operator_call | COPY expression | ( expression )

Constants and system supplied values were discussed in section 2 of this report. References to scalar data items were discussed in section 5.3 with the discussion of the initial description of the data items. Details concerning the reference of elements within an array will be supplied in section 6.5.
Section 6.3 will discuss how the data type of a reference, procedure, or operator call can be found. Procedure and operator calls were discussed in section 5.2.

We would like to recall the special convention concerning the recognition of constants with a minus sign:

    -10

would be recognized as an INTEGER constant with value minus ten, whereas

    - 10

would be recognized as an attempt to call a unary operator with the name '-' and with argument INTEGER of value ten. The blank character between the minus operator and the decimal string creates the two tokens in the second case.

Since the user may assign and reassign rather arbitrary algorithms to be executed as the result of an operator_call, CLEOPATRA does not impose any precedence scheme on expressions. We follow APL and execute expressions basically right to left (to allow for the mathematical habit of writing unary operators to the left of their operands), unless directed differently by parentheses.

Brackets or parentheses may be used to control the flow of execution within an expression. Both sets must be balanced, and a left bracket will not match a right parenthesis, etc. Throughout this report we consistently use pairs of parentheses, although pairs of brackets might have been substituted.

Operators associate to the right, i.e., as a replacement for the sequence 'a + b + c' we might have written 'a + (b + c)', but not '(a + b) + c'. This convention seems the only reasonable one to apply together with a strict right to left execution; it is also the most suitable one for conventional assignment operators and multiple assignment.

Actual parameters supplied to operators and procedures must strictly adhere to the expected data types. The recognized data type takes into consideration essentially the contents and alignment of the constituents of the data item and the number of dimensions and the element alignment of an array data item; for details see section 6.3.
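The grouping that results from the absence of precedence rules and from right association can be illustrated by a small evaluator. The following is a minimal sketch in Python (not CLEOPATRA), restricted to unsigned integer operands, parentheses, and a hypothetical table of three binary operators; it reproduces only the grouping of operands, not the order in which side effects would occur, and it does not treat unary operators or signed constants.

```python
import operator
import re

# Hypothetical operator table for this illustration only.
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def tokenize(text):
    """Split the text into unsigned integers, operators, and parentheses."""
    return re.findall(r"\d+|[-+*()]", text)


def parse(tokens, pos=0):
    """Evaluate 'operand [ operator expression ]' starting at tokens[pos].

    Returns the value and the position just past the expression.
    """
    if pos >= len(tokens):
        raise SyntaxError("operand expected")
    if tokens[pos] == "(":
        left, pos = parse(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("unbalanced parenthesis")
        pos += 1
    elif tokens[pos].isdigit():
        left = int(tokens[pos])
        pos += 1
    else:
        raise SyntaxError("operand expected")
    if pos < len(tokens) and tokens[pos] in OPERATORS:
        apply = OPERATORS[tokens[pos]]
        # Everything to the right forms the right operand: no precedence.
        right, pos = parse(tokens, pos + 1)
        return apply(left, right), pos
    return left, pos


def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse(tokens)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return value


print(evaluate("2 * 4 + 3"))    # 14, grouped as 2 * (4 + 3)
print(evaluate("(2 * 4) + 3"))  # 11
print(evaluate("10 - 3 - 2"))   # 9, grouped as 10 - (3 - 2)
```

A reader accustomed to conventional precedence should note the first and the last result in particular.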
Alignment characteristics of a data item can be changed to some extent; see the discussion of the COPY option in section 6.4. Arrays can be reshaped, and dimensional extents can be added or deleted with a cross-section mechanism; see section 6.5 for a discussion of CLEOPATRA arrays.

In general, CLEOPATRA will always pass actual parameters to subroutines by passing their addresses. Unless the COPY option is explicitly selected, there will never be two copies of the same data item merely for the purpose of making one copy available to a called procedure. However, not all parameters are modifiable. The CLEOPATRA compiler certainly should prevent any attempts at the dynamic modification of constants, or SHARE data items. This is accomplished by having the compiler recognize references as 'storable', and by enforcing that only storable references be given to routines which receive them BY ADDRESS (and thus potentially modify them). The general definition of storable references, in section 6.2, takes into account how a configuration obtained its data items; e.g., a data item that was received BY ADDRESS is considered storable, while any other parameter for the configuration is not.

CLEOPATRA has one distributive law for operators. The compiler will interpret a unary operator call with a data_group as principal argument, or a binary operator call with two data groups of the same recognized type as principal arguments, as a request to apply the operator to all members of the group. The compiler will proceed right to left across the description of the group and will apply the first operator that accepts each subset of the group. As a result, the operator call on the group gets resolved as a sequence of operator calls on the subgroups of highest complexity for which operators have actually been defined. An error at compile time would result if not all members of the group could be passed to some operator.

E.6.1 Examples.
The following examples use operators for basic types as defined for CLEOPATRA/360 in appendix B.

    14 == 2 * 4 + 3
    11 == (2 * 4) + 3

Note that in the following code sequence

    a := 2 ;
    b := (a := 3) OR TRUE ;

we obtain

    a == 2
    b == TRUE

Let us assume that we have the following declarations:

    OP ALGD (REAL, REAL) ALIAS complex * ALIAS c_mult complex RETURNS complex ;
    OP complex BY ADDRESS := complex RETURNS complex ;
    ALIGNED (
        ALIGNED(REAL x,y) z,
        ALIGNED (INTEGER x, REAL y) w
    ) a,b,c ;

As a result of the distributive law for operators and structures we then would obtain the following code as an expansion of 'c := a*b':

    c.w.y := a.w.y * b.w.y ;
    c.w.x := a.w.x * b.w.x ;
    c.z := a.z c_mult b.z ;

6.2 Storable references.

This section discusses the conditions under which a parameter can be passed to a routine BY ADDRESS. CLEOPATRA assumes in general that a routine intends to modify all actual parameters which it receives BY ADDRESS. Consequently, CLEOPATRA only permits the passing of actual parameters BY ADDRESS which are actually sensibly modifiable.

The name of a data item as it is implied in a declaration is a storable reference. This rule has the following exceptions:

a) A reference to an INTEGER used as a loop index (section 7.2.2) is not storable in the scope of the loop.

b) A reference based on a declaration for a formal parameter without the BY ADDRESS phrase is not storable. (It is possible to use formal parameters as storable references in the data block, prior to their appearance in the name_list of the routine. The illegal use of the formal parameter must then be reported at that time.)

c) A reference based on a declaration with the CONSTANT option is not storable.

d) A reference based on a declaration with the DEFER option is not storable; in order to be storable, such a reference must involve a pointer to the object described by the declaration (see below).
e) A reference based on a declaration with the SHARE option is storable only if the reference is made within the scope which the data item would have had if the SHARE option had been omitted, and if the CONSTANT option was not selected.

In the preceding discussion, the name of a data item as it is implied in a declaration is understood in the sense of section 5.3. The name can take the following forms:

    identifier
        for most data items, entire data groups, arrays, etc.
    identifier.identifier.identifier...
        for members of a structure.
    item.name
        for a reference to the underlying representation of a user defined data item; see section 5.3.3.

There are four constructions which create storable references from storable references. These constructs are all recognized by the compiler; they are not to be confused with procedure or operator calls.

a) A part of a storable array, extracted by the methods described in section 6.5, is a storable reference. Extraction uses a generalized subscript to specify one or more elements of the array as a new array; basically the dimensional structure of the original array is left intact.

b) A storable array, reshaped by the methods described in section 6.5, is a storable reference. Reshaping uses a generalized subscript to overwrite the bounds of an array, and to insert or add new dimensional extents of length one.

c) A reference made through a pointer to a data item with the DEFER option is storable unless the pointer has the value NIL. This condition can only be checked dynamically at execution time; as is described elsewhere, it will be decided by an implementation in what form such references can be passed to a routine; see section 4.2.

d) The COPY option described in section 6.4 will create a storable reference from a storable reference.
However, the actual parameter modification is only performed once the called routine terminates, not at each time that the called routine modifies the copy of the original parameter passed down.

6.3 Type recognition.

In order to afford the user the possibility of writing 'generic' (i.e., argument type sensitive) operators, CLEOPATRA determines the target of each operator_call not alone by the name of the operator, but by this name together with the recognized types of its principal arguments. In order to simplify the task for the human reader, CLEOPATRA forbids any implied conversions. When compiling expressions, at each point the resulting type of the phrase is known and used as is. In the present section we will summarize just what exactly contributes to the recognized type of a data item or of a phrase constructed during the elaboration of an expression.

The recognized type, or ref_type, of an expression is essentially defined by productions (4.1) and (4.4), and it has three components. First, it depends on the primitive data type assigned to the reference in question during the declaration. Second, the recognized type depends on the alignment assigned during the declaration. Third, it depends on the aggregation, if any.

6.3.1 Primitive data types.

Primitive data types essentially determine how a data item should be manipulated. Some primitive data types, the basic types, are sold to the CLEOPATRA user courtesy of the manufacturer of the host machine. Other basic types are included by the implementation of CLEOPATRA for execution efficiency. Finally, the user himself may design his own primitive data types through the type pack mechanism.

The primitive data type component of type recognition depends on the simple scalar data element, irrespective of its allocation in memory, i.e., irrespective of its relation to larger aggregates.

The creation of a data item may be influenced by creation time parameters.
E.g., when creating a CHARACTER, its maximum anticipated length must be specified. These creation time parameters can considerably influence the final appearance of a user-defined data type, but they do not influence the recognized type of the data item. (We do wish to completely analyze the resulting type of a data item at compile time; in particular, we do wish to be able to compile only meaningful references through pointers. This is the reason why we have, reluctantly, eliminated PASCAL's variable field structures.)

DECIMAL data items for the IBM/360 family of computers do not quite agree with the management of creation time parameters as suggested above. While CLEOPATRA does recognize DECIMAL'1 through DECIMAL'16 as separate primitive data types, an implementation should make some amendments to the recognition to allow for the creation of efficient code from reasonably 'human' source text.

We propose that CLEOPATRA should recognize a data type DECIMAL. If parameters of this type are passed BY ADDRESS and if the actual and the formal parameter size designations differ, the user should have to explicitly specify the COPY option described in section 6.4. If the size designations differ, but the parameter is not passed BY ADDRESS, CLEOPATRA should automatically create a properly sized copy and should pass it in place of the actual parameter.

6.3.2 Alignment.

Built-in routines realized by the coder can be adapted at compile time to accommodate COMPACT or ALIGNED actual parameters. Code for user-written routines is compiled to be as efficient as possible, irrespective of the actual parameters. Because of the syntactic constraints of the declaration of formal parameters, it will always be assumed that formal parameters are aligned on divisible boundaries. Actual parameters for user-written routines not aligned on divisible boundaries must be aligned properly using the COPY option.

The problem of alignment, however, is still more complicated.
Formal parameters may be an entire data group, and we have a similar problem of alignment, this time for potentially wrongly aligned members within the group. Formal parameters may also be arrays, and their elements, too, may be aligned wrongly. In general, compiled code capable of manipulating COMPACT data items will always be able to (inefficiently) cope with ALIGNED data items, but not conversely.

The COPY option has sufficient information available to create properly aligned copies of data items, even if such data items should be entire arrays or structures. It additionally serves as an indication to the human reader that the passing of the data items in question will (potentially) require a large overhead.

Alignment contributes to the recognized type in the form of an attribute. While the recognized type is determined by the primitive data type and aggregation alone, it is only considered a matched recognized type if the ALIGNED attribute is present wherever it is expected. If no ALIGNED attribute is expected, but one is present, we still have a matched recognized type.

6.3.3 Aggregation.

CLEOPATRA provides two storage schemes for arrays: LEFT (column-major) and RIGHT (row-major). These storage schemes do not contribute to the recognized type. Additionally, the dimensional extent along each dimension of an array does not contribute to the recognized type. (The first rule is imposed as a convenience for a team of users who wish to share their data bases. The second rule is imposed since its converse could only be enforced dynamically, which we would like to avoid.)

The number of dimensional extents of any data item does contribute to its recognized type. Reshaping mechanisms are provided to more or less artificially increase or decrease the number of dimensional extents.
The most conventional case of reshaping is, of course, the extraction of an element of an array, or the reshaping of an array to have zero dimensional extents.

The number of dimensional extents does contribute to the recognized type, since we wish to provide the user with the ability to write specific operators to handle vectors, two-dimensional matrices, etc. This feature does counteract what one would call the closure of a type_pack. It is, theoretically, never possible to write a type_pack containing all possible operators for a given type; some user can always add another dimension to an aggregate of this type and thus require another operator sensitive to this type. It was felt, however, that this possibility did not severely jeopardize the construction of reasonably complete type_packs.

6.3.4 Routine calls.

Procedure and operator definitions specify what types they return. Returned data items as a whole must be aligned. The CLEOPATRA compiler verifies while compiling a routine that appropriate arguments are specified in the RETURN statements.

The recognized type for an operator or procedure call then is simply the type specified in the RETURNS phrase for the routine. For their subsequent use these types can, again, be modified by the COPY option of the next procedure or operator call where they serve as operands.

6.4 The COPY option for parameters.

Section 6.3.2 described some cases where efficiency in execution dictates that copies of parameters be made to be used by routines. These cases in general are concerned with the subroutine having been coded to expect slightly different access mechanisms than the calling routine provides. The problem arises in the passing of DECIMAL quantities, in the passing of improperly aligned array elements, and of improperly aligned structures.

There are two more cases where passing a copy of a data item may be useful.
First, when passing an allocated data item (instead of merely passing the pointer to it), the receiving routine may inadvertently RELEASE the object and still possess a reference to it in the form of the formal parameter. This problem is discussed in more detail in section 4.2; an implementation may choose to impose some restrictions in this case. One possible solution could be to employ the COPY option to ensure that the data object exists for the duration of the subroutine.

The second case where passing only a copy of a data item may be useful is more theoretical. It concerns the proving of program correctness at the expense of efficiency. Generally, proving program correctness involves the establishing of certain invariance criteria among data items. For a modular approach, it is usually assumed that during operations on data items the values of these data items change instantly. The COPY option can be employed to make this assumption valid: the original of the data items will only change (and then fairly instantly) when control returns from the subroutine manipulating the data. The value of the original data will not change during the execution of the subroutine.

We propose that the COPY option be implicitly invoked for DECIMAL data items, unless they are to be received BY ADDRESS; this was explained in section 6.3.1. The COPY option, however, is not implicitly invoked for DECIMAL data items with a missing ALIGNED attribute.

When the COPY option is used for a data item about to be passed to a routine, a copy of the data item is made and passed to the routine. This may raise the REFERENCE condition if an attempt is made to copy an object referenced by a pointer with value NIL. The copy of the data item is adjusted to have the ALIGNED attribute wherever necessary; the compiler determines the necessity from the calling conventions available to it. In the case of DECIMAL data the copy is also adjusted to have appropriate size.
This may raise the SIZE condition if the receiving data item cannot hold the incoming value.

If the COPY option is written for a data item to be passed BY ADDRESS, the address of the copy is passed to the subroutine, and for the duration of the subroutine any modifications made to the passed data item are only made to the copy, not to the original. At the time of termination of the subroutine the value of the copy is transferred to the original. This may then raise the SIZE or the REFERENCE condition.

COPY is defined recursively; it is a service that is provided as a consequence of the definition of each user-defined type. For the basic types as described in section 4.1, COPY will create an identical data item and will initialize it by applying one of the built-in assignment operators. In the case of POINTER data, this means that COPY creates another copy of the pointer value in the POINTER, not of the pointed-to object. For non-basic types, COPY is defined to be the result of applying COPY to the underlying representation (which will eventually result in copying basic types).

6.5 The array facilities.

An array is a number of data items of the same type, referred to by the same name and by a subscript. The data items reside in (essentially) contiguous computer memory, and they are individually accessed by a mechanism based on the array name, a storage rule determining in which order the subscript is to be interpreted, and the information in the subscript.

CLEOPATRA provides both customary storage rules, the so-called row-major scheme and the so-called column-major scheme. Under the first of these, called the RIGHT storage scheme in CLEOPATRA, elements of the array are allocated in contiguous memory so that the rightmost subscript varies most rapidly as we proceed through memory in increasing order of addresses.
Under the LEFT storage scheme, the leftmost subscript varies most rapidly when describing physically neighboring array elements.

An array can have one or more dimensional extents; this distinguishes it from a scalar with no dimensional extent. Each extent is an INTEGER, greater than or equal to one. The total number of elements in the array is the product of all its dimensional extents.

While it is possible to extract parts of an array in the form of arrays of fewer (or even more) dimensions, it is never possible to change the structuring of an array into the original dimensions. It is not possible to consider n * m elements first as an array with two dimensions of n and m as extents, respectively, and later to consider the same n * m elements as an array with one dimension and an extent of n * m. The only extraction of elements or subarrays allowed is one which does not delete the association of each element group with a constant position along each dimension. Such extractions are the cross-sections of PL/I.

Dimensional extents are measured relative to user-specified lower bounds, and they determine upper bounds along each dimension. The upper bounds are always numerically greater than or equal to the lower bounds; array indexing may not run in reverse along a dimension. CLEOPATRA generally will enforce that any subscript reference is made only within the active bounds; a failure to do so will cause the SUBSCRIPT_RANGE condition to be raised.

When extracting a cross-section the user is afforded the possibility to start the cross-section at a point other than the lower bound of a particular dimension, and to terminate the cross-section at a point other than the upper bound of the dimension. The result of extracting a cross-section is still termed an array (unless the cross-section leaves only a single element; for details see section 6.5.1), and it is thus possible to create arrays which are no longer in contiguous memory.
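The two storage rules can be made concrete with a small model. The following is a Python sketch, not CLEOPATRA code, and it makes no claim about how an implementation actually addresses arrays. The function name `offset`, the representation of bounds as (lower, upper) pairs, the sample bounds, and the choice to count offsets in elements rather than bytes are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Bounds = Sequence[Tuple[int, int]]  # (lower, upper) per dimension, both inclusive


def offset(subscript: Sequence[int], bounds: Bounds, scheme: str = "RIGHT") -> int:
    """Return the element offset of `subscript` from the start of the array.

    RIGHT models row-major storage (rightmost subscript varies fastest),
    LEFT models column-major storage (leftmost subscript varies fastest).
    """
    if len(subscript) != len(bounds):
        raise ValueError("one subscript is required per dimensional extent")

    extents: List[int] = [hi - lo + 1 for lo, hi in bounds]
    relative: List[int] = []
    for s, (lo, hi) in zip(subscript, bounds):
        if not lo <= s <= hi:
            # Stands in for the SUBSCRIPT_RANGE condition.
            raise IndexError("SUBSCRIPT_RANGE")
        relative.append(s - lo)

    if scheme == "RIGHT":
        order = list(range(len(bounds)))             # leftmost is most significant
    elif scheme == "LEFT":
        order = list(reversed(range(len(bounds))))   # rightmost is most significant
    else:
        raise ValueError("scheme must be 'LEFT' or 'RIGHT'")

    position = 0
    for d in order:
        position = position * extents[d] + relative[d]
    return position
```

With illustrative bounds (1:4, 5:7, 8:9), the elements (2, 5, 8), (2, 6, 8) and (2, 7, 8) lie at offsets 1, 5 and 9 under LEFT in this model. A cross-section that fixes the first and third subscripts therefore selects elements that are not adjacent in memory.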
We describe elsewhere how such an array referencing scheme may be implemented efficiently.

The following sections describe array referencing, i.e., the extraction and reshaping of a cross-section, which may specifically be a single element. We next discuss the creation of an array from a number of expressions, e.g., for the purpose of initializing an array. Finally we describe the operators that afford access to the underlying representation of an array, e.g., operators to determine the upper and lower bounds, etc.

Appendix B describes a number of operators which are available to perform some arithmetic on arrays of all the basic built-in numerical types.

6.5.1 Extraction.

(6.2) array_reference ::= [ shape_list ] array_expression [ extract_list ]
                        | [ shape_list ] ( array_reference ) [ extract_list ]

(6.3) array_expression ::= identifier | array_temporary | procedure_call | operator_call

An array expression can be simply the name of an array, a number of scalar expressions formatted into an array as described in section 6.5.2, or the result of a procedure or operator call returning an array. In this section we describe how one would extract sections of such an array, and how such sections can be reshaped into new arrays of higher or lower dimensionality.

(6.4) extract_list ::= ( extract [ , extract ]• )

(6.5) extract ::= expression | { * | expression } : { * | expression }

An extract can take one of five forms; it describes what kind of a section of the array is to be extracted along the dimensional extent for which (positionally) the extract was specified. If there is any extract_list at all for an array, it must specify as many extracts as there are dimensional extents for the array.

1) * : *
   this extract specifies that the entire dimensional extent is to be extracted. The result of this extraction is another dimensional extent.
2) * : expression
   this extract specifies that the dimensional extent from the lower bound to the value of the specified expression is to be extracted. The result is another dimensional extent.

3) expression : *
   this extract specifies that the dimensional extent from the value of the specified expression to the upper bound is to be extracted. The result is another dimensional extent.

4) expression : expression
   this extract specifies that the dimensional extent between the specified values is to be extracted. The result is another dimensional extent.

5) expression
   this extract specifies that just the single position is to be extracted. The result has no dimensional extent in this position (and as such is different from an extent of value one).

All expressions in question must be of type INTEGER and will be checked to be within bounds. If the active bounds are exceeded, the SUBSCRIPT_RANGE condition is raised. Wherever a range is specified, the endpoints are included.

(6.6) shape_list ::= ( shape [ , shape ]• )

(6.7) shape ::= [ { * | expression } : ] { * | expression }

A shape can take one of six forms. It describes how the array is to be referenced. Shapes referring to dimensional extents correspond by position to the existing extents; shapes creating new dimensional extents are inserted and ignored during the positional matchup. If a shape_list is specified, it must have as many references to existing dimensional extents as there are existing dimensional extents.

1) *
   this shape specifies the entire existing extent with 1 to be supplied as a new lower bound, and with the new upper bound determined by the existing extent.

2) * : *
   this shape specifies the entire existing extent as it is.

3) * : expression
   this shape specifies the entire existing extent, with the value of the specified expression to be used as a new upper bound, and the new lower bound determined by the existing extent.
4) expression : *
   this shape specifies the entire existing extent, with the value of the specified expression to be used as a new lower bound, and the new upper bound determined by the existing extent.

5) expression : expression
   this shape specifies part of an existing extent as a new extent. The new lower bound is the smaller of the two values specified, and the new upper bound is the larger of the two values specified. If the new extent exceeds the old extent, a SUBSCRIPT_RANGE condition results.

6) expression
   this shape creates a new extent of one, with a new lower bound taken to be the specified value.

All expressions in question must be of type INTEGER and will be checked to be within bounds. If the active bounds are exceeded, the SUBSCRIPT_RANGE condition is raised. Wherever a range is specified, the endpoints will be included.

E.6.5.1 Examples.

We present some examples for cross-sectioning an array. Let us first allocate and initialize the array:

   INTEGER array LEFT(1:4, 5:7, 8:9) ;

This data_group defines an INTEGER array with three extents of 4, 3, and 2. The lower bounds are 1, 5, and 8, respectively.

Let us now reshape the array so that it has lower bounds of -1, -2, and -3, respectively:

   (-1:*, -2:*, -3:*) array

Assuming the original bounds, and interpreting the extents to denote (from the left) row, column, and plane of the three-dimensional array, let us next extract the second plane as a two-dimensional array:

   array (*:*, *:*, 9)

Note that with this two-dimensional array we can now reshape to obtain another three-dimensional array.
The following two examples construct exactly the same thing:

   (*:*, *:*, 9) array (*:*, *:*, 9)

   array (*:*, *:*, 9:9)

Let us extract an array with two dimensional extents of length two each, taken from the second row in each plane, and let us assign 'default' bounds to this array:

   (*, *) array (2, 6:*, *:*)

An equivalent construction would be:

   (*:2, 1:2) array (2, 6:7, 8:*)

Finally, we can assert that after executing the following section of code

   FOR i FROM 1 UPTO 4 DO
      FOR j FROM 5 UPTO 7 DO
         FOR k FROM 8 UPTO 9 DO
            array (i, j, k) := k + 10 * j + 10 * i
         END
      END
   END

the following is true: array (4, 7, 9) == 479. (Expressions are evaluated right to left, so the assigned value is k + 10 * (j + 10 * i).)

6.5.2 Creating an array.

CLEOPATRA permits the creation of a temporary array during the elaboration of an expression. This feature is included primarily to enable a simple initialization of array variables.

(6.8) array_temporary ::= ARRAY ( new_shape [ , new_shape ]• ) ( section [ , section ]• )

(6.9) new_shape ::= [ expression : ] expression

(6.10) section ::= extract_list expression

Evaluation of the ARRAY phrase proceeds left to right. First, the new_shapes determine the bounds for the new array in the usual fashion; compare section 5.3.5 on array declarations. The new_shape list is evaluated right to left.

Next, the first (rightmost) section is evaluated right to left, i.e., the expression (which must be a scalar) is evaluated, and then copies are made and allocated in the temporary array, as the extract_list dictates. All further sections (from right to left through the list of sections) are evaluated and allocated in the same fashion.

The first (rightmost) scalar expression determines the type of an array element; all further sections must conform to that type.

Before any section is allocated for the array, the entire array is set to NIL. This means that all elements of the array that are not contained within a section will remain NIL.
It is possible that more than one section references the same array element. The element will have as its final value the value created by the last (leftmost) section.

6.5.3 Accessing array parameters.

There are three kinds of parameters for an array which can change dynamically: upper bounds, lower bounds, and extents. Depending on the implementation, two of these will usually be used to compute the third; hence the user should not attempt to compute these parameters himself but should rely on the built-in facilities.

Built-in facilities to access array parameters are as follows:

INTEGER extent_number EXTENT array RETURNS INTEGER
   this 'operator' returns the number of elements along the specified extent for any array.

INTEGER extent_number U_BOUND array RETURNS INTEGER
   this 'operator' returns the currently active upper bound for the specified extent for any array.

INTEGER extent_number L_BOUND array RETURNS INTEGER
   this 'operator' returns the currently active lower bound for the specified extent for any array.

There is a slight problem when these facilities are used during an operation which changes the currently active bounds for an array. Such operations are passing a parameter, or reshaping an array. The difficulty is resolved as follows:

Changes of the active bounds of an array are accomplished from first to last (i.e., rightmost to leftmost specified) bound.

The EXTENT, L_BOUND, and U_BOUND 'operators' always refer to the current status of these bounds; they can never imply access to some older bounds, e.g., to the bounds of an array before passing.

Prior to any reshaping operation, the currently active bounds are transferred to the new set of bounds; any changes, and any reference with EXTENT, L_BOUND, and U_BOUND, are then made against this new set of bounds.
Careful coding thus allows a certain amount of access to older bounds by specifying a bound facility during the reshaping of the same bound for which the facility is invoked.

The SUBSCRIPT_RANGE condition is raised if reference is made in the bound facilities to some non-existent bound. In that case, the facility would always return a value of NIL.

7. Statements.

In this chapter we describe the executable statements of CLEOPATRA. These statements are used to write routine blocks. For compilation efficiency CLEOPATRA statements should be relatively easy to recognize; we therefore expect all keywords (in capital letters in the productions in this chapter) to be system reserved words.

A few larger examples of CLEOPATRA/360 code appear as appendices D and E.

7.1 Simple statements.

There are six simple statements in CLEOPATRA. These simple statements may not be labeled. They can be classified into three groups: the arithmetic statement to execute some arithmetic expression, statements which conditionally or unconditionally change the flow of execution, and statements which dynamically allocate and release data items in the heap.

7.1.1 The arithmetic statement.

(7.1) stmt ::= expression | NIL

This is the general purpose statement in CLEOPATRA. Expressions are discussed in chapter 6. Observe that a multitude of algorithms can be expressed in an arithmetic statement, since, e.g., a procedure_call is an expression.

NIL alone indicates a no-operation statement; it is an overhead-free special version of the general arithmetic statement.

7.1.2 The RETURN and EXIT statements.

RETURN and EXIT effect unconditional forward transfers of control to the limits of an enclosing syntactic bracket.
The RETURN statement is used to return control from an operator or procedure call, and it therefore must supply a value, the return value of the operator or procedure call.

(7.2) stmt ::= RETURN expression

The specified expression becomes the value of the operator or procedure call from which control is being returned. (There can never be confusion as to which procedure body is being exited, since procedures cannot be physically nested.) The type of the expression must match the ref_type in the RETURNS phrase in the procedure or operator header. (See section 6.3 on type recognition. Note that all returned data items as a whole must be aligned.) Every routine_block must contain at least one RETURN statement.

The EXIT statement is used to exit from a compound statement, and it must indicate or imply which compound is to be left.

(7.3) stmt ::= EXIT [ label ]

An EXIT statement is only legal inside a compound statement. An EXIT statement with no label terminates execution of the innermost compound statement containing it. An EXIT statement with a label terminates execution of the compound statement bearing the same label (within which it must appear). (For a discussion of labels, see section 7.2.)

7.1.3 The simple conditional statement (IF).

(7.4) stmt ::= IF expression THEN statement [ [ ; ] ELSE statement ]

The simple conditional statement provides for conditional execution of one or two statements. (Statements are discussed in section 7.3.) If the value of the specified expression is not NIL, the statement in the THEN phrase will be executed. If the value of the expression is NIL, the statement in the ELSE phrase will be executed (if present). The expression must be of a basic type as discussed in section 4.1. The interpretation of NIL as FALSE corresponds to the logical operators in CLEOPATRA/360 (appendix B, section 7).
There is the usual 'dangling ELSE' problem, to be resolved as follows: Any ELSE phrase applies to the nearest IF THEN phrase or ACTION phrase (see 7.2.3) on its left which does not already have an ELSE phrase. (It is expected that printouts of CLEOPATRA programs will be constructed by a paragrapher. This component of the compiler will indicate by indentation to which IF THEN phrase or ACTION phrase the ELSE phrase in question has been attached.)

7.1.4 The ALLOCATE and RELEASE statements.

These statements manage the heap, i.e., the non-automatic storage area. ALLOCATE creates an object, an instance of a deferred data item, and returns a pointer to the instance; RELEASE removes an instance of an object, and destroys all existing copies of its pointer value. Compare the discussion in section 4.2.

(7.5) stmt ::= ALLOCATE identifier [ parameters ] FOR pointer [ INIT expression ]

The identifier must have appeared at the outermost level of a data_group in a declaration with the DEFER phrase included. This serves to attach the necessary type information for the data item to be created to the ALLOCATE statement.

The actual parameters, if specified, must be generation time parameters for this type. Other sources for these generation time parameters are the DEFER phrase, and the defaults for the type, in that order.

The FOR phrase specifies the pointer (as a storable reference of type POINTER(ref_type)) which is to reference the data item to be allocated. This pointer must be able to point to the ref_type of the data item. The ALLOCATE statement can be interpreted as being an assignment of a pointer value to this pointer. Should, as a result of this assignment, the last reference to some other data item previously pointed to by this pointer be lost, a call to release that data item is implied during the execution of the FOR phrase.
The INIT phrase, if specified, constitutes a call on an operator INIT which has as its left argument (BY ADDRESS) the data item, and as its right argument the expression in the INIT phrase. The types must match as usual. Other sources for initialization are the declaration for the data item, and the defaults for the type, in that order.

The FOR and INIT phrases may be written in either order. The ALLOCATE statement is executed left to right for its phrases; within each phrase, execution is the usual right to left. I.e., first a call to the generation time routines for the type is issued with the properly evaluated actual parameters. Next, either the FOR or the INIT phrase is executed. It is thus possible to use the old value of the pointer for initialization, before the new value of the pointer is assigned, e.g., in list management.

(7.6) stmt ::= RELEASE expression

The expression must evaluate to a pointer value. The data item pointed to (if it exists) is removed from the heap, and all pointers to it (including but not limited to the pointer specified in the RELEASE statement) are set to NIL, i.e., to point to no object. It is thus impossible to reference a data item which no longer exists.

7.2 Compound statements.

There are three compound statements in CLEOPATRA. All of these may optionally be labeled. They serve three functions: a group of statements is connected so that it can be used like a simple statement, a group of statements is marked for repetitive execution, and a selection is made as to which members of a group of statements are to be executed.

(7.7) lstmt ::= cstmt | label : cstmt label

(7.8) label ::= identifier

The scope of labels is determined as follows: The name of a label may coincide with that of other identifiers. (Labels are only used surrounding a cstmt and after the reserved word EXIT; they are hence easily distinguished.)
The scope of a label extends only throughout the lstmt to which it applies, and labels may be redefined within the lstmt. It is thus possible to remove the possibility of a complete EXIT from a nest of lstmt.

7.2.1 Bracketing statements.

(7.9) cstmt ::= BEGIN statement [ ; statement ]• [ ; ] END

This construction serves to collect any number of statements into a group. It also designates the scope of an EXIT statement within the group (if any). CLEOPATRA uses BEGIN/END only in order to group statements. BEGIN/END in CLEOPATRA do not result in any actions concerning automatic storage as in PL/I.

7.2.2 Repetitive execution.

(7.10) cstmt ::= [ for_phrase ] ITERATE [ while_phrase ] statement [ ; statement ]• [ ; ] [ when_phrase ] END

The group of statements is executed repeatedly (from left to right), until some exit is initiated. The EXIT statement can be used to exit the group of statements to be repeated, the RETURN statement can be used to exit the routine_block containing the group, and an interrupt may result in execution of the group being suspended.

The FOR, WHILE, and WHEN phrases provide, alone or in any combination, further control over the repetitive execution, as follows:

(7.11) for_phrase ::= FOR identifier [ FROM expression ] [ ; ] [ { DOWNTO | UPTO } expression ] [ ; ] [ STEP expression ] [ ; ]

The identifier must refer to a non-deferred, ungrouped ALIGNED INTEGER with no extents residing in the local data_block of the current configuration. For the duration of the cstmt this identifier is no longer a storable reference (section 6.2); in particular, it cannot be passed BY ADDRESS to any operator or procedure (including assignment operators), and it cannot be used as an index for any nested ITERATE statement.

The expressions must be of type INTEGER and are executed once prior to the initiation of the loop in the order in which the phrases are written. This order may be selected at will.
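As a rough model of how the FOR, WHILE, and WHEN phrases of this section control an ITERATE loop, consider the following Python sketch. It is not CLEOPATRA code. The function name `iterate`, its parameter names, the loop body (which merely records the index), and the `max_iterations` guard against loops that never terminate are all assumptions made for illustration. The sketch assumes that the FOR limit is checked first, then the WHILE condition, then the body is run, then the WHEN condition is checked, and only then is the index stepped.

```python
from typing import Callable, List, Optional, Tuple


def iterate(
    index: int,
    frm: Optional[int] = None,
    upto: Optional[int] = None,
    downto: Optional[int] = None,
    step: Optional[int] = None,
    while_cond: Optional[Callable[[int], bool]] = None,
    when_cond: Optional[Callable[[int], bool]] = None,
    max_iterations: int = 1000,
) -> Tuple[List[int], int]:
    """Run a modeled FOR ... ITERATE loop.

    Returns the index values seen by the loop body and the value of the
    index after the loop has been left.
    """
    if upto is not None and downto is not None:
        raise ValueError("UPTO and DOWNTO cannot both be given")
    if upto is None and downto is None and step is None:
        raise ValueError("at least one of UPTO, DOWNTO, STEP is required")

    if frm is not None:            # FROM phrase: initialize the index
        index = frm
    if step is None:               # default step follows the direction
        step = 1 if upto is not None else -1

    seen: List[int] = []
    for _ in range(max_iterations):
        # Termination depends only on the direction qualifier,
        # never on the sign of the step.
        if upto is not None and index > upto:
            break
        if downto is not None and index < downto:
            break
        if while_cond is not None and not while_cond(index):
            break
        seen.append(index)         # the loop body
        if when_cond is not None and when_cond(index):
            break                  # leave before the index is stepped
        index += step
    return seen, index
```

For instance, `iterate(0, frm=1, upto=5, step=3)` visits 1 and 4 and leaves the index at 7 in this model, because the limit is tested only after the index has been stepped past it.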
The qualifiers DOWNTO and UPTO indicate the expected direction of the integer sequence controlling the loop. Checks on termination are based solely on these qualifiers, not on the execution-time value of the STEP expression (if any). Consult the examples for a description of the control function by means of equivalent code.

If the STEP phrase is omitted, 1 is assumed if the UPTO phrase is used, and -1 is assumed if the DOWNTO phrase is used. It is not legal to omit all three, UPTO, DOWNTO, and STEP; at least one of these phrases must be specified if a for_phrase is used.

(7.12) while_phrase ::= WHILE expression ;

The expression must be of a basic type as described in section 4.1. The iteration is performed only if the value of the expression is not NIL. This corresponds to the interpretation of NIL of a basic type as FALSE for the logical operators in CLEOPATRA/360 (section B.7). The while_phrase is evaluated and checked once directly prior to each iteration.

(7.13) when_phrase ::= WHEN expression [ ; ]

The expression must be of a basic type as discussed in section 4.1. Only if the value of the expression is NIL is the iteration continued. The when_phrase is evaluated and checked once directly after each complete iteration.

If all three explicit loop control options are used, they perform their checks in syntactic order: The for_phrase is checked first and may exit the iteration; next the while_phrase is evaluated and may exit the iteration; finally, after the iteration is performed, the when_phrase is evaluated and may exit the iteration.

E.7.2.2 Examples.

We describe CLEOPATRA/360 code equivalent to the following ITERATE statement:

   FOR index FROM init_expression
             UPTO end_expression
             STEP step_expression
   ITERATE WHILE while_expression ;
    WHEN when_expression END

is equivalent to:

    index := init_expression ;        [1]
    end_t := end_expression ;         [2]
    inc_t := step_expression          [3]
    loop: ITERATE                     [4]
        IF index > end_t              [5]
            THEN EXIT loop ;
        IF ¬ while_expression         [6]
            THEN EXIT loop ;
                                      [7]
        IF when_expression            [8]
            THEN EXIT loop ;
        index := index + inc_t        [9]
    END loop                          [10]

Comments:

[1] is present if the FROM phrase is present.
[2] is present if one of the UPTO or DOWNTO phrases is present.
[3] is replaced by 'inc_t := 1' or 'inc_t := -1' if the STEP phrase is absent and if the UPTO or DOWNTO phrases were selected, respectively. The order in which [1] through [3] appear is the same as the order in which their corresponding phrases were written.
[4..10] form the iterative loop bracket.
[5] is omitted if no UPTO/DOWNTO phrase was specified. If the DOWNTO phrase is used instead of the UPTO phrase, '<' replaces '>'.
[6] is omitted if no WHILE phrase is specified.
[7] indicates the original loop.
[8] is omitted if no WHEN phrase is specified.
[9] is omitted if no FOR phrase is specified.

Note that the for_phrase imposed control [5] precedes the while_phrase control [6], which is followed by the loop [7], and then by the when_phrase control [8] and by the for_phrase incrementation [9].

As a further example, assume that DISPLAY is a fictitious output command, and that we employ the following loop body:

    ITERATE DISPLAY i END DISPLAY '/' i

We show some controlling phrases and their results:

    controlling phrases               results
    FOR i FROM 1 UPTO 5               1 2 3 4 5 / 6
    i := 1; FOR i DOWNTO 10           / -1
    FOR i FROM 1 UPTO 5 STEP 3        1 4 / 7
    FOR i FROM 1 DOWNTO STEP 3        1 4 7
    same, and WHEN i == 1             1 4 7 10 / 10
    same, and WHILE i < 10            1 4 7 / 10

7.2.3 Selective execution.

CLEOPATRA provides as its main control structure a generalized free form decision table as follows:

(7.14) cstmt ::= DECISION decision [ ; decision ]• [ ; ] ACTION action [ ; action ]• [ ; ] [ ELSE statement [ ; ] ] END

(7.15) decision ::= switch : expression
(7.16) ::= switch { , switch }• [ list_layout ] : list_index

(7.17) switch ::= identifier

(7.18) list_layout ::= { LEFT | RIGHT | VECTOR } ( [ integer : ] integer [ , [ integer : ] integer ]• )

(7.19) list_index ::= expression [ , expression ]•

Each switch labeling a decision or appearing in a list must be a unique identifier, and its scope is local to the decision table. These switches are set to be true or false as follows: switches in production (7.15) are true if the expression attached to them is not NIL, and false otherwise. This expression must be of a basic type as discussed in section 4.1. The interpretation of NIL as FALSE corresponds to the logical operators in CLEOPATRA/360 (appendix B, section 7).

The list of switches in production (7.16) is interpreted as an array according to the list_layout (if none is present, the switches are numbered from left to right, starting at 1). The list_index, which must consist of as many INTEGER expressions as the array has extents defined, will select one switch out of this array, and will set it to be true. All other switches on the list are set to false. It is possible that no switch on the list is selected; i.e., there is no SUBSCRIPT_RANGE condition for a switch selection. Consult section 5.3.5 for more details on storage schemes for arrays.

(7.20) action ::= switch_expression : statement

(7.21) switch_expression ::= [ ¬ ] switch [ switch_operator [ ¬ ] switch ]•

(7.22) switch_operator ::= & | | | || | ¬& | ¬| | ¬|| | == | ¬= | AND | OR | NAND | NOR

All decision phrases are executed, left to right. The status of the switches is then frozen, i.e., they cannot be changed or accessed from within the ACTION phrase. Next the action phrases are selected for execution as follows: each phrase is preceded by an expression connecting switches with the usual logical operators ¬, & (AND), | (OR), ||, ¬& (NAND), ¬| (NOR), ¬||. We additionally provide the operators == (identical to ¬||) and ¬= (identical to ||).
Each switch expression may only contain switches from its own cstmt as operands, and is evaluated from right to left (all operators have equal precedence!). If the switch expression is true, the statement in the action phrase will be executed.

The switch operators correspond to the logical operators of CLEOPATRA/360 defined in appendix B, section 7. However, whereas the English names AND, OR, etc. for logical operators indicate that certain parts of expressions may not be executed, there is no such distinct interpretation for switch operators. Switches are evaluated in the DECISION phrase, and then frozen. Switch expressions per se then are side-effect free. Duplicate names for certain switch operators are provided in the interest of symmetry only.

All selected action phrases are executed, left to right. If no phrase has a true switch expression, the ELSE phrase, if specified, will be executed. Consult the remarks in section 7.1.3 concerning the 'dangling ELSE' problem.

The decision table, like all the other compound statements, establishes the scope of an enclosed EXIT statement. If one of the action phrases is an unlabeled EXIT statement, it will terminate execution of the decision table. If one of the action phrases is a labeled EXIT statement, it will terminate execution of the compound with the same label, which may be the decision table or some enclosing compound. If one of the action phrases is a compound containing an unlabeled EXIT statement, this EXIT statement will at most terminate the action phrase, but not the decision table. If such an enclosed EXIT statement is labeled, it will terminate the compound with the same label; this compound may be the decision table.

The decision table was selected as main control structure in CLEOPATRA since it provides the greatest possible flexibility while still maintaining the virtues of goto-less programming. Production (7.
16) is primarily intended to facilitate the writing of a case statement (case statements in general are a special case of decision tables).

It is, of course, expected that eventually this control structure will be translated into optimized code. To this extent it should be advantageous that once all decision phrases have been executed, all action phrases can be marked for execution together; the evaluation of all switch expressions intentionally is left side-effect free.

7.3 Statement lists.

Section 7.1 discussed all simple statements, referred to as stmt. Section 7.2 discussed all (labeled) groupings of statements, referred to as lstmt.

(7.23) statement ::= stmt | lstmt

We would like to somewhat relax the rules governing the placement of semicolons: "statement" appears in productions (5.16) [PROCEDURE routine_block], (5.23) [OPERATOR routine_block], (5.28) [CONVERSION routine_block], (7.4) [IF stmt], (7.9) [BEGIN lstmt], (7.10) [ITERATE lstmt], (7.14) [DECISION lstmt], and (7.20) [action]. In these productions we have thus far the rule that all statements must be separated from each other by semicolons, but that keywords following them need not be preceded by a semicolon. This rule is safe as long as the keywords in question are reserved.

It can be further relaxed. The reserved keywords BEGIN, END, WHILE, ... all initiate, punctuate, and terminate statements. We need not introduce another level of punctuation unless omitting it creates ambiguity or near-ambiguity. We only require that two statements be separated from each other by one or more statement boundaries. A semicolon is a statement boundary; additionally each lstmt is (as a definition) surrounded by statement boundaries. We thus can juxtapose two lstmt, or a stmt and a lstmt, without an intervening semicolon. As is already stated in the productions, statement boundaries exist before the keyword END and after the phrase END, or END label.
There is one exception to this rule. In production (1.6) we define a comment to be delimited by the keyword COMMENT and by the first following semicolon. Such a comment is not a statement. It takes the place of a blank character and its trailing semicolon may not be omitted, even if the comment directly precedes a lstmt.

8. Interrupts.

Information contained in this section is not intended to be binding on an implementation of CLEOPATRA. We recognize that the handling of exceptional conditions provides the interface between the language implementor and the designer of the underlying operating system; we do not wish to preempt certain system designs at language design time.

This section discusses interrupts, i.e., the creation, reporting, and handling of exceptional conditions, as an extension of certain concepts of the language discussed earlier. It is possible, and attempts are encouraged, to design different interrupt mechanisms along the same lines before an actual implementation of CLEOPATRA is undertaken.

The discussion in this chapter is not necessarily oriented towards any particular target machine. We simply propose the structure of an interrupt mechanism. In appendix F we will discuss what interrupts would most likely be inherent in an implementation of CLEOPATRA for the IBM System/360.

8.1 Linkage considerations.

During the execution of a CLEOPATRA program, there will be certain exceptional "conditions" raised. If the user so desires, he can establish an "interrupt", i.e., a configuration prepared to be activated in the event of a certain exceptional condition. We allow the user to selectively connect interrupts (the handling mechanisms) with conditions. Details of the connection are discussed in section 8.2.

(8.1) configuration ::= interrupt

(8.2) configuration_name ::= interrupt_link

After algorithm and type_pack, we propose interrupt as a third configuration; compare chapter 3.
Constituent blocks of an interrupt are connected by the common interrupt_link, an identifier, in a manner similar to the connection of the blocks of an OPERATOR algorithm.

(8.3) interrupt ::= [ structure_block ] [ global_data_block ] [ data_block ] interrupt_block

Like any other configuration, an interrupt may employ other configurations, which it must define in its structure_block, and it may possess data items local to itself, or global to its configuration tree. In the interrupt_block the actual algorithm for the recovery from some exceptional condition is described.

Certain exceptional conditions (e.g., SUBSCRIPT_RANGE) are predefined in the language; see appendix F. Other conditions can be defined in structure blocks. Structure blocks must also detail the interrupts that they provide. We want to enable a type_pack to provide interrupt handlers, as well as to report exceptional conditions to its users. Consequently we propose conditions and interrupts as global_link_items; compare section 5.1.

(8.4) global_link_item ::= CONDITION condition_name [ ALIAS identifier ] [ ir_ref_types ]

(8.5) ::= interrupt_link : INTERRUPT interrupt_name [ ALIAS identifier ] [ ir_ref_types ]

(8.6) condition_name ::= identifier

(8.7) interrupt_link ::= identifier

We interpret the creation of an exceptional condition as a call on an interrupt (if such an interrupt is activated; see section 8.2). As in any other configuration call, the interrupt may be provided with certain parameters. In most cases, the interrupt will take corrective action (if possible). To simplify access for correction purposes, we permit that parameters be passed to the interrupt BY ADDRESS.

(8.8) ir_ref_types ::= [ ref_type [ BY ADDRESS ] ] [ ref_type_list ]

We would like to keep the mechanics of naming interrupts simple. Therefore we propose that an interrupt
may have a principal argument (similar to a unary operator; see section 5.2.3), and that the interrupt be recognized not only by its own name, but also by the ref_type of its principal argument if any.

The definition of a condition must indicate which interrupts may be activated for it. It therefore must state the parameters for an interrupt which can be connected to it. For the same reasons as above, conditions, too, are recognized not only by their name, but also by the ref_type of the principal argument for their interrupt if any.

In order to simplify the naming conventions, we propose that interrupt_name and condition_name each belong to its own identifier class; see section 3.3.1. This will not complicate recognition, since the context for each name is extremely limited. We thus can use the same name for a condition and for the interrupt to be connected to it.

8.2 Creation and activation.

Interrupts should be connected to conditions at execution time. It must be possible to selectively activate and deactivate the connection between a certain condition and a certain interrupt. This is the reason why we have provided separate structure block entries for conditions and interrupts. In the present section we will discuss how the connection is made, and how conditions may be raised. It is expected that there will be a certain number of conditions which may never be raised, i.e., which are only caused by the external hardware; see appendix F for details concerning CLEOPATRA/360.

(8.9) stmt ::= ON condition_name { interrupt_name | NIL | BUILT IN } [ ref_type | ALL ]

This statement connects (activates) an interrupt for an exceptional condition. The ref_type contributes to the recognition of the full condition and interrupt name as discussed in section 8.1. If ALL is specified instead, all conditions and interrupts with that same identifier in their name as specified in the statement, and which are known at this point in the configuration tree, will be connected.
The attempt at connecting ALL is bounded by the availability of interrupts, not of conditions.

It is possible to deactivate interrupts in one of two ways. System action is requested if BUILT IN is specified instead of an interrupt. NIL can be used to request that no action be taken on a particular condition. An implementation will define what actions are BUILT IN, and which conditions, if any, may not be connected to NIL.

(8.10) stmt ::= SIGNAL condition_name [ expression ] [ parameters ]

This statement raises an exceptional condition. It may be executed irrespective of whether or not an interrupt was connected to the condition. The parameters are supplied to the interrupt in the usual fashion. The list of parameters is evaluated right to left, prior to the evaluation of the principal argument. It is not legal to omit a principal argument, if the structure block entry specified one. Other parameters may be omitted as usual.

In both statements discussed in this section the usual calling conventions must be observed. Interrupts can only be connected to conditions with matching ref_types, and conditions may only be signaled with parameters which match the ref_types indicated in the structure block entry. It is possible to employ the COPY option (section 6.4) when raising a condition.

8.3 Interrupt blocks.

An interrupt_block describes the action to be taken when an exceptional condition is raised. The routine has access to its parameters, and in general to the information in its own local and global data blocks, and in the global data blocks of all predecessor configurations, and to SHAREd data as usual.

(8.11) interrupt_block ::= interrupt_link : INTERRUPT interrupt_name [ ref_type identifier ] [ name_list ] { ; statement }• [ ; ] END interrupt_link [ ; ]

(8.12) interrupt_name ::= identifier

Formal parameter names are matched to their declarations as usual.
The formal parameter declarations must match the structure block entry as usual.

Interrupts certainly cannot return a value. Instead, special statements are provided to terminate the processing of an interrupt. One such statement must appear within an interrupt_block; control may not reach the END phrase of the block.

(8.13) stmt ::= RESUME

This statement corresponds to the RETURN statement; see section 7.1.2. It directs execution to continue just beyond the point where the exceptional condition was raised. It may not be legal for all conditions to RESUME. Additionally, certain corrective action may be mandatory for certain conditions.

As alternate means to terminate an interrupt, we propose that there exist conditions which by the implementation are guaranteed not to RESUME and which thus effect a termination when signaled. Such statements might be SIGNAL HALT indicating an unconditional termination of the entire program tree (or well-defined parts thereof), and SIGNAL RETURN expression (system_supplied_value) indicating a return from a configuration nested a certain number of levels deep through the calling sequence, with the given value (which then, probably, would not be typed). Especially the latter of these statements will be extremely implementation dependent.

8.4 The ASSERT condition.

Following Floyd we propose a condition aimed at dynamically checking program correctness. Basically, the user can specify certain relations which, possibly depending on instructions to the compiler, will be compiled into code to signal an exceptional condition, the ASSERT condition, should the relations be found to be false.

Introducing the ASSERT condition is the only explicit aid in debugging which we give through the design of CLEOPATRA. Other conditions could and should be incorporated in an implementation, such as the CHECK condition of PL/I and the FLOW condition of PL/C.
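The connection of interrupts to conditions (section 8.2) and the raising of the ASSERT condition can be pictured by a small Python model. This is a sketch under stated assumptions, not a proposal for an implementation: the class Conditions, its method names, and the log list are hypothetical; recognition by ref_type is not modelled; a condition that was never connected is assumed to receive BUILT IN action; and an ordinary return from the handler stands in for RESUME.

```python
BUILT_IN = object()   # marker for "ON condition BUILT IN" (system action)


class Conditions:
    """Hypothetical table connecting condition names to interrupts."""

    def __init__(self):
        self.connected = {}   # condition name -> handler, None (NIL), or BUILT_IN
        self.log = []         # stands in for the system's report to the user

    def on(self, condition, interrupt):
        """ON condition { interrupt | NIL | BUILT IN }"""
        self.connected[condition] = interrupt

    def signal(self, condition, *parameters):
        """SIGNAL condition [ parameters ]; legal whether or not connected."""
        interrupt = self.connected.get(condition, BUILT_IN)  # assumed default
        if interrupt is None:
            return                      # NIL: no action is taken
        if interrupt is BUILT_IN:
            self.log.append("system action for " + condition)
            return
        interrupt(*parameters)          # returning models RESUME

    def assert_(self, relation):
        """Raise the ASSERT condition if the relation is found to be false."""
        if not relation:
            self.signal("ASSERT")


conditions = Conditions()
failures = []
conditions.on("ASSERT", lambda: failures.append("assertion failed"))
conditions.assert_(1 + 1 == 2)    # relation holds: nothing is signaled
conditions.assert_(1 + 1 == 3)    # relation is false: the interrupt runs
```

Connecting None or BUILT_IN in place of the handler deactivates the interrupt in the two ways described in section 8.2.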
(8.14) stmt ::= ASSERT expression

We propose that this statement be equivalent to

    IF ¬ expression THEN SIGNAL ASSERT

i.e., that the ASSERT condition be raised if the expression is NIL. This expression must be of a basic type as discussed in section 4.1. The interpretation of NIL as FALSE corresponds to the logical operators in CLEOPATRA/360 in appendix B, section 7.

BUILT IN action for ASSERT probably should be a report to the user. This must be decided by an implementation, and it depends on the availability of a medium of communication. We additionally propose that the CLEOPATRA compiler may be instructed to compile a NIL statement (section 7.1.1) instead of an ASSERT statement. Such an instruction could be given with the environment request (section 3.4).

9. Privileged operations.

Privileged users, usually system programmers concerned with the implementation of functions of the nucleus of an operating system such as memory allocation and protection, and basic input-output access methods, must be allowed a very controlled access to certain features of the host machine's hardware which the average user should not be aware of. Such access necessarily will not be completely controlled, e.g., in the sense of section 4.2.1, since it may for example deal with controlling functions assumed to underlie CLEOPATRA such as the maintenance of POINTERS.

In this section we intend to make our second major proposal for extensions to the general purpose part of CLEOPATRA as described in sections 1 through 7 of this report. This final implementation-dependent extension concerns itself with access to highly protected parts of the hardware. We must be specific as to what hardware we intend to address in this discussion. Therefore, this section will be specific to an implementation of CLEOPATRA/360.
Above the system nucleus, whose implementation tools are described in this section, we assume that well designed operating systems of a similar nature would assume similar functions to be provided by this nucleus. In this sense this current section serves as an introduction to design tools whose effects eventually should be available on most current machines for which CLEOPATRA could be intended.

For those readers unfamiliar with the intricacies of the System/360, section 9.1 will provide a quick review of the fine points considered in this section. Section 9.2 introduces a few more "privileged" data types shaped after these hardware features, together with a proposal for operations to manipulate these data types. Section 9.3 similarly discusses a few more privileged statements, most notably statements to control the data transmission channels of the System/360 directly and most primitively. Section 9.4 finally suggests the mechanism to protect CLEOPATRA and the system from the not-so-privileged user.

9.1 Basics about System/360.

Operations of the central processor are controlled and recorded in a 64 bit Program Status Word (PSW). The structure of a PSW was described in section E.5.3.4. Most important entries of the PSW are the System and Program Mask, and a mask controlling Machine Check interrupt reporting. The PSW provides fetch and/or store protection through the Storage Key, and indicates the next instruction address. While modifying access to the System and Program Masks is provided, all other status changes must be accomplished by the introduction of a new PSW.

Protection is provided for storage blocks of 2048 bytes each (beginning at boundaries divisible by 2048) by associating a 4 bit protection key with each storage block. The key may be chosen at random for each individual storage block. In general there will be more storage blocks in a given installation than distinct keys.
There is store protection, merely by providing a non-zero key, and store/fetch protection. The latter is enforced by a fifth bit in the key set to one. Instructions are provided to interrogate and change the key of any given storage block.

There are five interrupts recognized and reported by the hardware: External (recognizing so-called external signals), Input-Output (signaled by a channel), Program-Check (recognized by the hardware and further differentiated by the interrupt code recorded in the PSW), Machine-Check (accompanied by a model-dependent in-core readout), and Supervisor-Call. An Input-Output interrupt normally will store information into the Channel Status Word in a fixed location in low core. The Supervisor-Call interrupt is caused by an instruction, which also will record a user-supplied interrupt code in the PSW. When an interrupt occurs, the PSW currently in effect is stored into a fixed location in low core (one such location for each of the five interrupts) and is thus available for inspection, and a new PSW is introduced from a fixed, interrupt-specific location in low core.

Most instructions discussed in this section may only be issued in supervisor state. Supervisor state is indicated by the state of a certain bit in the PSW. Introducing a new PSW is the only way to switch to and from supervisor state. Interrupts therefore may cause a switch to supervisor state (depending on their new PSW); additionally, a privileged (supervisor state) instruction allows the introduction of a new PSW.

There is a sixth interrupt, employed during the Initial Program Loading (IPL) of the machine. It is caused by operator intervention, and it does not store an old PSW. Its new PSW is introduced during the IPL operation from some external device. The address of this device is recorded in the new PSW just before it is introduced and causes execution of the first user program to commence.
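The exchange of Program Status Words on an interrupt, and its role in entering supervisor state, can be illustrated by the following Python model. It is a deliberately reduced sketch: the names PSW, CPU, and CLASSES are hypothetical, the PSW carries only three of its fields, the fixed locations in low core are represented by dictionaries rather than by their actual addresses, and a rejected privileged instruction is shown as a Python exception instead of the interrupt the hardware would raise.

```python
from dataclasses import dataclass, replace

# The five interrupt classes named in the text (the IPL interrupt is left out).
CLASSES = ("external", "input_output", "program_check",
           "machine_check", "supervisor_call")


@dataclass(frozen=True)
class PSW:
    supervisor: bool          # state bit: True means supervisor state
    address: int              # next instruction address
    interrupt_code: int = 0


class CPU:
    def __init__(self, current, new_psws):
        self.current = current
        self.old = {}                 # one save slot per interrupt class
        self.new = dict(new_psws)     # one new PSW per interrupt class

    def interrupt(self, cls, code=0):
        """Store the current PSW (with the code) and introduce the new one."""
        self.old[cls] = replace(self.current, interrupt_code=code)
        self.current = self.new[cls]

    def load_psw(self, psw):
        """Privileged introduction of a new PSW."""
        if not self.current.supervisor:
            raise PermissionError("privileged instruction in problem state")
        self.current = psw


user = PSW(supervisor=False, address=0x5000)
nucleus = PSW(supervisor=True, address=0x0200)
cpu = CPU(user, {cls: nucleus for cls in CLASSES})

cpu.interrupt("supervisor_call", code=7)     # switches to supervisor state
saved = cpu.old["supervisor_call"]           # old PSW is open to inspection
cpu.load_psw(saved)                          # return to the interrupted program
```

The only ways in which this model changes the supervisor bit are the two named in the text: an interrupt, and the privileged introduction of a new PSW.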
Input-output is performed by channels. The seven channels are controlled by specific instructions issued to the central processing unit (CPU) such as START INPUT OUTPUT, TEST CHANNEL, etc. The channels report diagnostic information by means of the Condition Code (CC), which consists of two bits in the PSW, and, more elaborately, through the Channel Status Word (CSW) in a fixed location in low core. Channels and devices connected to them can be individually addressed by the controlling instructions. START INPUT OUTPUT, the most important of these, directs the channel and device to commence execution of a channel program. Such a channel program consists of one or more Channel Command Words (CCW) which can be located anywhere in memory. Channel programs need not be presented in contiguous memory; a TRANSFER IN CHANNEL instruction is provided among the possible CCW. When START INPUT OUTPUT is issued, the beginning of the channel program is indicated by the address of the first CCW located in the Channel Address Word (CAW) in a fixed location in low memory, which must have been initialized prior to the START instruction. The channel may be instructed by a CCW to cause an Input-Output interruption concurrent with the execution of the CCW by the channel. Additionally, Input-Output interruptions are caused for various reasons, such as Device End, the execution of a HALT INPUT OUTPUT instruction, etc.

One more fixed storage location in low core deserves special mention. There is a continuously decremented timer available, which, when turning to negative units, may cause an External interrupt. The timer can be set initially under program control, and it can be interrogated without undue loss of accuracy. It is usually not directly related to the time of day.

The reader is certainly invited to read the comprehensive discussion in the IBM publication "IBM System/360 Principles of Operation" (Order No. GA22-6821).
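The storage protection scheme outlined earlier in this section can be modelled in a few lines of Python. The sketch is illustrative only. The class Storage and its method names are hypothetical. The rule used to decide whether an access is permitted (the key in the PSW is zero or equals the key of the storage block) is not spelled out in this report; it is assumed here from the general description of the System/360 and should be checked against the manual cited above.

```python
BLOCK = 2048   # size of a storage block in bytes


class Storage:
    """Hypothetical model of per-block protection keys."""

    def __init__(self, size):
        blocks = size // BLOCK
        self.key = [0] * blocks          # 4 bit key of each block
        self.fetch = [False] * blocks    # fifth bit: fetch protection

    def set_key(self, address, key, fetch_protect=False):
        """Associate a key with the block to which the address belongs."""
        if not 0 <= key <= 15:
            raise ValueError("a key has four bits")
        block = address // BLOCK
        self.key[block] = key
        self.fetch[block] = fetch_protect

    def _matches(self, address, psw_key):
        # Assumed matching rule: PSW key zero, or equal to the block's key.
        return psw_key == 0 or psw_key == self.key[address // BLOCK]

    def may_store(self, address, psw_key):
        return self._matches(address, psw_key)

    def may_fetch(self, address, psw_key):
        # Fetching is restricted only if the fifth bit of the key is set.
        return (not self.fetch[address // BLOCK]) or self._matches(address, psw_key)


storage = Storage(8192)                   # four blocks
storage.set_key(2048, 5)                  # block 1: store protection only
can_read = storage.may_fetch(3000, 3)     # fetch is still open to key 3
can_write = storage.may_store(3000, 3)    # store is refused to key 3
```

Since there are only sixteen distinct keys, several blocks will in general share a key, as remarked in the text.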
9.2 Privileged data types.

From the discussion in section 9.1 we can identify at least four different data types, for which we can and must provide special instructions within the CLEOPATRA coder. We need convenient ways to analyze and synthesize Program Status Words (PSW), we need a way to analyze the contents of the timer cell, we need a way to conveniently create and manipulate storage keys, and, finally, we must be able to access locations in memory by their explicit addresses.

Whether or not we should provide special data types to participate in the manipulation of the channels is questionable. Channels essentially are controlled by a sequence of flags; quite conceivably we could introduce channel programs as arrays of BITs. This, however, contradicts our spirit of the levels of abstraction (section 3). We shall therefore pursue the question of channel control somewhat further in section 9.3.

9.2.1 Addresses.

CLEOPATRA users familiar with the System/360 realize that probably the most efficient choice for the underlying representation of addresses is a LONG_INTEGER. We propose, however, not to introduce a new data type ADDRESS, but instead to merely provide a few operators on LONG_INTEGERs which interpret these as addresses. This decision avoids having to provide for and recognize countless arithmetic operations on ADDRESSes and on combinations of LONG_INTEGERs used as offsets and ADDRESSes used as bases. We invite, however, a complete confusion of the concepts of LONG_INTEGER and ADDRESS. Experience and education of CLEOPATRA privileged programmers will show whether this leads to difficulties.

What operations characterize ADDRESSes? Basically we anticipate that a few system_supplied_values will be introduced, such as MEMORY_SIZE (of the hosting machine), NEXT_STATEMENT allowing the design of the address part of a PSW about to be introduced, etc.
Furthermore we shall have to be able to inquire at what location in core a given value is stored. This could be provided by an operator as follows:

    OPERATOR @ type value RETURNS LONG_INTEGER ;

This operator returns the address at which 'value' (which can be of any 'type') is located in memory. The operator could also be provided as a binary operator with assignment of the resulting address; it most probably need only be made available for the basic types as described in section 4.1, and in the current section.

One more very delicate operation is required. We need to be able to specify a memory location by its address, and to format it into some (basic) type of value NIL. We should make the result available as a storable reference (section 6.2). As a safety precaution, a check should be performed whereby the required amount of memory determined from the type about to be created is compared to the anticipated amount of memory supplied as an additional parameter.

It is not immediately clear which is the most convenient notation. A CONVERSION of ADDRESS to the required type seems to be most useful, if we additionally instruct CLEOPATRA to regard such a CONVERSION as storable. Using the type of the new data item as a variable (essentially) is not very much in the spirit of the strongly typed CLEOPATRA language, although unavoidable. Other alternatives therefore are a PROCEDURE based on two LONG_INTEGER parameters, which accepts anything as being stored into it, or a binary operator between a type and a LONG_INTEGER, or a statement.

We can hardly make a choice of implementation which fits into the existing framework of the language. Probably the most flexible extension of the framework is provided by the following pseudo-system_supplied_value (compare section 2.3):

(9.1) system_supplied_value ::= type AT ( expression , expression )
The result is considered storable; both expressions must be of type LONG_INTEGER. We would use this system_supplied_value as follows:

    type AT (address, size)

and the interpretation is that at the 'address' with an anticipated 'size' a new data item of the requested 'type' (but this, again, may be restricted to the basic types in section 4.1, and in the current section) is created, initialized to a suitable NIL value, and made available as a storable reference.

This feature will be needed, e.g., for the implementation of a POINTER management mechanism, etc.

9.2.2 Keys.

We are faced with a similar problem as in section 9.2.1. Keys should be represented as a new data type, with new operators, and with an underlying representation of (approximately) a HALF_BYTE and a BIT. Again, this is neither efficient nor terribly compact. Instead, we propose to treat keys as an extension of BYTEs. For hardware reasons, the key is located in the left (high-order) half of the BYTE, and the fetch-protection bit directly follows the key.

We thus have the usual arithmetic operations for keys available. To these we add another set of comparisons which can be used to compare but the key part of the BYTE:

    OP BYTE a .(BIT m_a) compare (BIT m_b) . BYTE b RETURNS BIT ;

'a', whose fetch-protection bit is ANDed with 'm_a', is numerically compared to 'b' with its protection bit masked by 'm_b'. 'compare' can be chosen from the set

    EQ NE GT NG GE LT NL LE

with the usual meanings.

We provide a special key to key assignment which separately manipulates the protection bit:

    OP BYTE a BY ADDRESS : (BIT fetch) KEY BYTE b RETURNS BYTE ;

'a' is made a proper key, i.e., the low-order three bits are set to zero, the protection bit is taken from 'fetch', and the high-order four bits are copied from 'b'. The implementor may decide that it is convenient to treat an omitted 'fetch' bit not as zero (as we propose) but in this case to copy the fetch bit from 'b', also.

More operators can certainly be provided.
In particular, we would like to see unary operators STORE and FETCH to cause the respective setting of the protection bit (and to zero the low-order bits). Key exchange assignment could be designed. Masking operations on keys may be useful.

Keys are intended for protection purposes. We provide these as pseudo assignment operators between the data types LONG_INTEGER (for ADDRESS) and BYTE (for KEY). The mnemonic name for these assignments suggests the KEY context:

    OP L_INT address SETKEY ALIAS SSK BYTE key RET BYTE ;
    OP BYTE key BY ADR : READKEY ALIAS ISK L_INT address RETURNS BYTE ;

SSK associates the indicated key with the storage block to which the given address belongs. The key (with low-order zero bits) is returned. ISK inspects the key of the storage block to which the given address belongs, assigns it to 'key', and returns it.

Like many other operations described in section 9, ISK and SSK may raise error conditions. E.g., the available memory on the hosting machine may be exceeded by the designated address. It is felt that this is not the point to speculate what new conditions (if any) this proposed feature may introduce. Such error conditions depend highly on the implementation upon which an operating system designer and a CLEOPATRA implementor may decide.

9.2.3 Program Status Words.

A Program Status Word (PSW) is a new data type. As underlying representation we essentially propose the layout in section E.5.3.4, possibly with a few changes. We assume that such a PSW is represented by the type_data_block of a built-in type_pack, and that all the data items in the type_data_block are SHAREd. Additionally for the convenient synthesis of a new PSW we propose that pseudo assignment operators be provided, such as:

    PSW : : SSM BYTE              set the System Mask
    PSW : : KEY BYTE              set the Protection Key
    PSW : : A BIT                 set the ASCII mode
    PSW : : M BIT                 set the Machine Check Mask
    PSW : : W BIT                 set the Wait State
    PSW : : P BIT                 set the Problem State
    PSW : : IC INTEGER            set the Interrupt Code
    PSW : : SPM BYTE              set the Program Mask
    PSW : : ADR L_INT             set the Instruction Address
    PSW : : (INT) = BIT           set the numbered bit
    PSW : : (INT) = BIT 1 EXT     set the numbered bit sequence

The converse assignment operations should probably also be provided. Additionally, we need assignment and exchange assignment among PSWs, equal/unequal comparison of PSWs, and a certain amount of masking operations on an array of BITs.

It should be noted that this section only discusses operations on a data item which has the format of a PSW. In section 9.3.1 we shall discuss how such a data item might be introduced as a new current PSW, and how we can take actual influence on certain fields within the current PSW.

9.2.4 Time.

It is one of the less convenient features of the System/360 that there is no real-time clock available within the system. Access to the timer cell (which is merely a LONG_INTEGER in a fixed location in low core, which is continuously being decremented, and which can cause an External interrupt) can and should be provided in a fashion very similar to an exchange assignment:

    OP TIME: LONG_INTEGER BY ADDRESS RETURNS LONG_INTEGER ;

The current value of the argument becomes the new value of the timer cell, and the current value of the timer cell becomes the new value of the argument. No cycle of a real-time counter can be lost. (Details of the implementation are suggested in the IBM "Principles of Operation" manual.)

9.3 Privileged statements.

Privileged statements allow the privileged user control over the current Program Status Word (PSW). The most uncontrolled operation possible, namely the introduction of a new PSW, is allowed. It should be fairly obvious that the casual user is somewhat discouraged from the use of this feature.
Other privileged statements are provided for the control of the input-output channels. These statements, too, allow indiscriminate access to memory.

9.3.1 The current Program Status Word.

We propose three statements of increasing probability to destroy the system. They are:

    (9.2) stmt ::= SPM expression

The expression must be of type BYTE (an implementation may certainly decide differently), and the high-order four bits are made the current Program Mask. Consult section E.5.3.4 for a discussion of the bits in the Program Mask. It should go without saying that a change of the Program Mask may interfere with the recognition of the built-in exceptional conditions on basic arithmetic.

    (9.3) stmt ::= SSM expression

The expression must be of type BYTE and is made the current System Mask. Consult section E.5.3.4 for a discussion of the bits in the System Mask. A change in the System Mask may interfere with the proper execution of input-output, or of timing.

    (9.4) stmt ::= RESUME expression

The expression must be of type PSW and is made the new current PSW. This is a significant extension of the RESUME statement discussed in section 8.3. The RESUME statement with no argument merely reintroduces the old PSW which was stored (and then preserved by CLEOPATRA) when an exceptional condition was raised. If an explicit new PSW is supplied for a RESUME statement, a complete change of status, and a completely undisciplined transfer of execution control, is effected.

9.3.2 Input-Output.

Basically we propose a new type of compound statement to create efficient code for the interaction with a channel and the analysis of the interaction. The cstmt takes the form of a special decision table:

    (9.5) cstmt ::= i_o_control
                    CASE switch, switch, switch, switch
                    OF { switch : statement ; }*
                    END

The i_o_control causes one of the four switches to be set to TRUE. Then all statements whose switches are TRUE will be executed in order.
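The control flow of production (9.5) can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the names `run_cstmt`, `switch_names`, and `body` are hypothetical, the i_o_control is reduced to a function returning an outcome number 0 to 3, and each body entry is guarded by exactly one switch. The example switch names follow the four condition code settings of the System/360 START I/O instruction; that the four switches correspond to these settings is an assumption, not something the report states.

```python
from typing import Callable, List, Tuple

def run_cstmt(i_o_control: Callable[[], int],
              switch_names: Tuple[str, str, str, str],
              body: List[Tuple[str, Callable[[], None]]]) -> None:
    """Model of (9.5): set exactly one of four switches, then run the body."""
    outcome = i_o_control()              # hypothetical: outcome number 0..3
    if not 0 <= outcome <= 3:
        raise ValueError("i_o_control must select one of the four switches")
    # All switches start FALSE; the i_o_control sets exactly one to TRUE.
    switches = {name: False for name in switch_names}
    switches[switch_names[outcome]] = True
    # Every statement whose switch is TRUE is executed, in textual order.
    for name, statement in body:
        if switches[name]:
            statement()

# Usage: outcome 1 selects the second switch, so both of its statements run.
log: List[str] = []
run_cstmt(
    lambda: 1,
    ("started", "status_stored", "busy", "not_operational"),
    [("started", lambda: log.append("wait for interrupt")),
     ("status_stored", lambda: log.append("analyze status")),
     ("busy", lambda: log.append("retry later")),
     ("status_stored", lambda: log.append("report"))],
)
```

Because the same switch may guard several entries, the body behaves like a filtered statement list rather than a conventional CASE with exactly one selected arm.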
The cstmt should be taken to be very similar to the decision table described in section 7.2.3. Possible i_o_controls correspond to the START INPUT OUTPUT, etc. statements in the System/360 machine language. They will provide an indication of the type of CPU instruction, a device or channel address (probably as an INTEGER), and a storable reference to a suitable recipient for the STATUS information contained in the CSW (probably a built-in data_group) which may be stored as a result of the operation. Each i_o_control will cause the switches to be set, and further analysis to be initiated. This analysis must take into account (and the status of the switches will so indicate) whether an actual STATUS was stored, or whether the STATUS merely was set to NIL by the i_o_control. It seems quite plausible that STATUS could be made local and non-storable within the cstmt in a fashion similar to the index of an ITERATE loop (section 7.2.2).

We give the most important example, namely the START INPUT OUTPUT instruction. We deliberately do not detail implementation aspects of the form in which device address and STATUS are to be recorded.

    (9.6) i_o_control ::= START device [ KEY key ]
                          STATUS status
                          PROGRAM channel_program

The key is a protection key which participates in the fetching or storing of the data about to be transmitted. Operating system designer and implementor will have to choose a suitable default (NIL, or the key from the old PSW of the Supervisor Call interruption which presumably is executing the program).

We propose that further investigations detail the precise format of a channel_program. Essentially we would anticipate an in-line assembler format, with free fields but possibly source record oriented, with labels, branches, mnemonic instructions, etc. Expressions involving CLEOPATRA data items (which may have been passed to the algorithm executing the program) should be allowed for the various flags, etc.
Additionally, such a language should allow transfers in channel into an array of channel command words, and one might even consider that the user should provide a suitable array into which the channel_program would be assembled.

9.4 Protection.

Operations described in section 9 are not available to the general public. This can be enforced in two ways. First, it should take special control instructions to invoke the compiler so as to be able to have these operations recognized. Second, most of the operations in this section are privileged, i.e., they can only be issued in the supervisor state. A casual user should not be permitted to compile code for the basic five interrupts, i.e., he should not be enabled to switch to supervisor state. This latter consideration is further discussed in appendix F.

A. Keywords and their aliases.

In this appendix we list in alphabetical order all keywords of the language, their aliases, and the section of the report where the keyword first appears in a production. For ease of recognition we would like to assume that most of these keywords would be reserved for an actual implementation of the language. An implementation may choose to use a different concrete representation, especially if only an upper-case character set is supported.

Certain other words may attain reserved word status, namely the names of all operators defined on the basic data types (appendix B) and exceptional condition names (appendix F). Compare the remarks concerning the BUILT IN attribute in section 5.1.1.

Words appearing in section 9 only have not been included, due to their privileged status.
    keyword        alias            section  comment

    ACTION         ACT              7.2.3    statement keyword
    ADDRESS        ADR              5.1.1    parameter passing mode
    ALIAS                           4.3.2
    ALIGNED        ALGD             4.3.2
    ALL                             8.2      condition activation
    ALLOCATE       ALL              7.1.4    statement keyword
    AND                             7.2.3    switch_operator
    ARRAY                           6.5.2    temporary array keyword
    ASSERT                          8.4      statement keyword

    B.                              2.2.1    denotes binary_string
    BEGIN          BEG              7.2.1    statement keyword
    BIT                             4.1      basic_type
    BUILT                           5.1.1
    BY                              5.1.1    context: BY ADDRESS
    BYTE                            4.1      basic_type

    C.                              2.2.5    denotes literal_value
    CHARACTER      CHAR             4.1      basic_type
    COMMENT        COM              1.5
    COMPACT        CMPT             4.3.2
    COMPILE                         3.4
    CONDITION      COND             8.1
    CONSTANT                        5.3.1    option
    CONVERSION     CONV             5.1.1
    COPY                            6.1      parameter option

    D.                              2.2.4    denotes decimal
    DATA                            5.3.1
    DECIMAL        DEC              4.1      basic_type
    DECISION       DECN             7.2.3    statement keyword
    DEFER                           5.3.1    option
    DOWNTO         DOWN             7.2.2    statement keyword

    E                               2.3      system_supplied_constant
    ELSE                            7.1.3    statement keyword
    END                             5.1.1
    EXIT                            7.1.2    statement keyword
    EXTENTS        EXT, EXTENT      4.3.2

    F.                              2.2.2    denotes long_integer
    FALSE                           2.3      system_supplied_constant
    FIRST                           2.3      system_supplied_constant
    FOR                             7.1.4    statement keyword
    FROM                            5.1.1    context: CONVERSION FROM

    GLOBAL         GBL              5.1.2

    HALT                            8.3      condition

    IF                              7.1.3    statement keyword
    IN                              5.1.1    context: BUILT IN
    INIT                            5.3.6
    INTEGER        INT              4.1      basic_type
    INTERRUPT      IR               8.1
    INTO                            3.4      context: COMPILE INTO
    ITERATE        ITER, LOOP, DO   7.2.2    statement keyword

    L.                              2.2.3    denotes long_real
    LARGE                           2.3      system_supplied_value
    LAST                            2.3      system_supplied_constant
    LEFT           LE               5.3.5    storage scheme
    LONG_INTEGER   L_INT            4.1      basic_type
    LONG_REAL      L_REAL           4.1      basic_type

    NAND                            7.2.3    switch_operator
    NIL                             2.3      system_supplied_value
    NOR                             7.2.3    switch_operator

    O.                              2.2.1    denotes octal_string
    ON                              8.2      statement keyword
    OPERATOR       OP               5.1.1
    OR                              7.2.3    switch_operator

    P                               2.2.3    denotes exponential_part
    PI                              2.3      system_supplied_constant
    POINTER        PTR              4.1      basic_type
    PROCEDURE      PROC             5.1.1

    Q.                              2.2.2    denotes byte

    REAL                            4.1      basic_type
    RELEASE        REL              7.1.4    statement keyword
    RESUME         RES              8.3      statement keyword
    RETURN         RET              7.1.2    statement keyword
    RETURNS        RET              5.1.1    context: structure block
    RIGHT          RI               5.3.5    storage scheme

    S.                              2.2.2    denotes bit
    SHARE                           5.3.3    option
    SIGNAL         SIG              8.2      statement keyword
    SMALL                           2.3      system_supplied_value
    STEP                            7.2.2    statement keyword
    STRUCTURE      STRUC            5.1.1

    THEN                            7.1.3    statement keyword
    TO                              5.1.1    context: CONVERSION TO
    TRUE                            2.3      system_supplied_constant
    TYPE                            5.1.1

    UPTO           UP               7.2.2    statement keyword

    VECTOR         VEC              5.3.5    storage scheme

    WHEN                            7.2.2    statement keyword
    WHILE                           7.2.2    statement keyword

    X.                              2.2.1    denotes hexadecimal_string

B. Operators for the built-in types.

We present here a list of all operators which are built into the coder of the CLEOPATRA/360 compiler, or which are provided as a convenience. The order in which the operators are presented in this appendix is largely functional; operands essentially follow the order given in section 4.1.

We use a pseudo-CLEOPATRA notation, employing a combination of structure block entries and routine headings to convey all the pertinent semantic information. Note that all operators in this appendix are prepared to handle ALIGNED and COMPACT arguments and parameters, and that they will always return ALIGNED values.

B.1 Summary.

We precede the detailed discussion of the functioning of the operators by a table presenting an overview and a summary of all the available operators, together with their returned types and possible exceptional conditions. Secondary parameters for the operators are not discussed in this section.

B.1.1 Exceptional conditions.
The list of exceptional conditions resulting from an operator invocation presented in this section is provided as an example. An implementor may decide to provide more or fewer distinguishable execution exceptions. The execution-time exceptional conditions resulting from operators, their potential invocations, and suggested system attempts to remedy the exception are as follows:

    [a] COMPLEX
        the result of a computation is not within the real number field.
        remedy: employ the absolute value function in a suitable fashion.

    [b] CONVERSION
        an attempt is made to convert a literal value into a constant, where the literal value does not conform to the syntax of a numerical constant as described in section 2.2.
        remedy: return a suitable instance of NIL instead.

    [c] OVERFLOW
        the result of a numerical computation exceeds the allowable (absolute) value for its target type (see section 4.1).
        remedy: return an appropriately signed suitable instance of LARGE instead.

    [d] SIGNIFICANCE_LOSS
        as a result of real number arithmetic, all significant digits are lost.
        remedy: return a suitable instance of NIL instead.

    [e] SIZE
        an attempt is made, during assignment or conversion, to place a value into a typed variable which cannot be accommodated there (see section 4.1).
        remedy: return an appropriately signed suitable instance of LARGE instead.

    [f] STRING_RANGE
        an attempt is made to access a CHARACTER value outside of its current length, or before its first character.
        remedy: return the entire value or CHARACTER NIL, respectively.

    [g] UNDERFLOW
        the result of a real number computation is not equal to, but numerically indistinguishable from zero.
        remedy: return an appropriately signed suitable instance of SMALL instead.

    [h] ZERO_DIVIDE
        an attempt is made in a numerical computation to divide by zero.
        remedy: return an appropriately signed suitable instance of LARGE instead.

B.1.2 Grouping of operators.

The operator table lists mostly groups of operators. We now present these groups and operators, together with the exceptional conditions that they may raise (the conditions are indicated by the bracketed lower-case letters preceding each condition as defined in section B.1.1):

    C     CONVERSIONS [b,e]

    U     unary arithmetic operators. The group consists of:
          ABS [c]   CEIL [c,d,g]   FLOOR [c,d,g]   MAX   MIN   NORM [c]
          PROD [c,g]   ROOT [a,g,h]   ROUND [c,d,g]   SIGN   SUM [c,d,g]   TRUNC [c,d,g]

    CU    unary CHARACTER operators. The group consists of:
          LENGTH   MAX   REPEAT [e]   REVERSE

    A     simple assignment, denoted by = [e,f]. This operator throughout this appendix has INIT as an ALIAS name.

    EA    exchange assignment, denoted by <=> [e,f]

    CM    comparison operators. The group consists of:
          <   >   ==   <=   >=   ¬=   ¬<   ¬>

    CP    comparison of pointer values, denoted by == and ¬=

    IA    arithmetic on integer-like values. The group consists of:
          + [c]   - [c]   * [c]   ** [c]   ++ [c]   / [c,h]   // [c,h]   MOD [c,h]

    BA    arithmetic with bits. The group consists of:
          + [c,g]   - [c,g]   *

    SS    string selection, denoted by <- and -> [f]

    IE    integer exponents for real-like values. The group consists of:
          ** [c,g]   ++ [c,g]

    RA    arithmetic on real-like values. The group consists of:
          + [c,d,g]   - [c,d,g]   * [c,g]   ++ [c,g]   / [c,g,h]   ** [a,c,g]

    CI    character interrogation. The group consists of:
          "->   INDEX   <-"   ?->   VERIFY   <-?   -->   <--

    CC    character concatenation, denoted by " [e]

    SEQ SUB
          unary operators converting CHARACTER into BYTE arrays, and vice versa. (See section B.8.2.)

Additionally there exist logical operators between any two basic types. This group consists of:

    &  AND    |  OR    ||    ¬&  NAND    ¬|  NOR    ¬||

The unary negation ¬ also is defined on all basic types. These last two groups of operators are not shown in the operator table.

B.1.3 The operator table.
A table entry has the form

    op # cond

where "op" denotes an operator or a group of operators as described in section B.1.2, "#" denotes the returned type of a call on the operator by number, and "cond" denotes possible exceptional conditions from section B.1.1. Optional secondary parameters for the operators are not shown. A few unary operators (MAX, MIN, NORM, PROD, SUM, SUBSTITUTE) take array arguments and return scalar values, and the operator SEQUENCE returns an array value. For details, consult the operator definitions.

The returned type for conversions depends on the conversion invoked. Entries for conversions in the table therefore have a list of returned types. The unary operator ROOT always returns a real number; unary arithmetic operator entries in the table therefore are followed by a list of returned types.

[Operator table: printed sideways on a full page in the original; not legible in this copy.]

B.1.4 Returned types.

Returned types may quickly be determined for most built-in operators by the following rules:

    Unary arithmetic operators always return the type of their argument. One exception to this rule is the operator ROOT which will return the stronger type (see below) of REAL and the argument type.

    Comparison and logical operators always return BIT.

    Arithmetic operators always return the stronger type (see below) of their two arguments.

    Assignment operators always return the original value of their right-hand argument typed as is.

    All returned types are aligned.

The basic types are ordered for strength so as to minimize the chance of OVERFLOW and SIZE exceptions resulting from arithmetic involving mixed type operands. The strength of the basic types decreases from top to bottom in the following list:

    LONG_REAL
    REAL
    DECIMAL'n       n decreasing from 16 to 1
    LONG_INTEGER
    INTEGER
    BYTE
    BIT

This concept does not eliminate OVERFLOW and SIZE exceptions completely in the context of DECIMAL, of course.

B.2 Conversion operators.

CONVERSION as a special kind of unary operator was introduced in section 5.2.5. A list of CONVERSION operators for the built-in data types is contained in the operator table in section B.1.3. This list also indicates the exceptional conditions which CONVERSION operators may raise.

In the present section we offer a list of special considerations to be employed when using these built-in CONVERSION operators.

B.2.1 Converting to DECIMAL'size.

When converting to DECIMAL, a size must be supplied as an INTEGER between 1 and 16, appended to the operator name, DECIMAL, by a quote. A default of 8 applies if the size is omitted.
Conversions between DECIMALs of different sizes are usually implied.

B.2.2 Converting real numbers.

A conversion from a real number to another number is performed by rounding, not by truncation. Truncation can be accomplished with the TRUNC operator described in section B.6. For an example on truncation consult section E.B.3.

B.2.3 Converting from CHARACTER.

The character string must contain a literal value which is a recognizable constant as described in section 2.2. This constant is evaluated and converted to the appropriate result. Note that the type of the constant (under the rules of section 2.2) may differ from the type requested by the CONVERSION. Potentially two conversions will have to be applied.

A CONVERSION condition may be raised if no recognizable constant is found. A SIZE condition may be raised if the specified constant exceeds its own type or the type of the target. Compare section B.1.1 for suggested remedies.

B.2.4 Converting to CHARACTER.

When converting a number into a CHARACTER, a literal value is constructed according to the syntax of a constant as described in section 2.2. There will be no embedded, heading, or trailing blanks in the literal value.

    CONVERSION TO CHAR FROM (base) . non_real ;
    CONVERSION TO CHAR FROM (base, exp_base) . real ;

The basic_constant underlying the constant is expressed in a base as supplied by the positional INTEGER parameter 'base' which must equal 2, 8, 10, or 16, and which is set to 10 if omitted or incorrectly specified. The basic_constant will contain as many digits as are available. The length of the resulting literal value (which includes qualifiers where appropriate) becomes the current and maximum anticipated length of the resulting CHARACTER.

If the conversion is applied to real arguments, different bases may be specified as 'base' (an INTEGER) for the fractional part, and 'exp_base' (an INTEGER) for the exponential part.
An explicitly written INTEGER NIL for 'exp_base' requests that no exponential part be produced. Otherwise the conversion would construct a basic_constant and an exponential part. If 'exp_base' is omitted, the same base is used to express fractional part and exponential part.

E.B.2.4 Examples.

    C.1         .==.  CHAR. 1
    C.O.1       .==.  CHAR(8). 1
    C.D.X.-FF   .==.  CHAR(X.10). D.-255
    C.1P1       .==.  CHAR. 10.0
    C.10.0      .==.  CHAR(,0). 10.0

B.3 Assignment operations.

There are two types of assignment, simple assignment and exchange assignment:

    OP type_a receive := type_b value RETURNS type_b ;
    OPERATOR type_a :<=>: type_b RETURNS type_b ;

The first assignment operator is simple right to left assignment, designed to be embedded into an expression. It returns the unchanged value on its right, regardless of any exceptions caused while this value is possibly converted and stored into the storable reference on the left. This operator has INIT as an ALIAS name, so that initializations of the basic types can be performed.

The second assignment operator exchanges the values on its left and right, and returns the original unchanged value on its right, regardless of any exceptions which may result from a conversion. (Exchange assignment can be performed between any DEC'n and DEC'm arguments.)

The operator table in section B.1.3 discusses which assignments are supplied for the basic types, and what exceptions they may cause. In order to simplify expressions involving the built-in types, many of the built-in simple assignment operators operate on mixed type arguments, thus simulating a conversion. As a rule, however, only those mixed mode assignments are built-in which require little executional overhead for the 'implied' conversions.

E.B.3 Examples.

The rule that assignments return the original value of their right hand argument unchanged is designed to encourage the use of assignment operators at the highest level of an arithmetic statement.
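The return rule can be modelled outside CLEOPATRA. The following Python sketch is illustrative only: the class `Cell` and the helper `to_real` are hypothetical names, a Python float stands in for LONG_REAL, and IEEE single precision stands in for REAL (the System/360 formats differ). It shows that in a chain of assignments every receiver is handed the same unconverted right-hand value.

```python
import struct

def to_real(x: float) -> float:
    """Round to single precision (stand-in for an L_REAL to REAL conversion)."""
    return struct.unpack("f", struct.pack("f", x))[0]

class Cell:
    """A storable reference that converts on store."""
    def __init__(self, convert):
        self.convert = convert
        self.value = None

    def assign(self, value):
        self.value = self.convert(value)   # implied conversion on store
        return value                       # original right-hand value, unchanged

x = 0.1                       # plays the role of an L_REAL value
a = Cell(float)               # L_REAL receiver: no conversion needed
i, j = Cell(to_real), Cell(to_real)

# Model of  A := I := J := X ;  evaluated right to left.
a.assign(i.assign(j.assign(x)))

# Model of  A := I := J := REAL X ;  with one explicit conversion.
a2 = Cell(float)
i2, j2 = Cell(to_real), Cell(to_real)
a2.assign(i2.assign(j2.assign(to_real(x))))
```

In the first chain `a.value` equals `x` exactly, while `i.value` and `j.value` hold the rounded value; in the second chain all three receivers hold the rounded value.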
There are some uses of a sequence of mixed mode assignments which are inefficient and may produce 'surprising' results for the novice. Consider the following example:

    L_REAL A, X ;
    REAL I, J ;
    A := I := J := X ;

This example actually requires two conversions from L_REAL to REAL, and no conversion from REAL to L_REAL. When the statement is concluded, the values of A and X are equal. One can obtain a rounded result as:

    A := I := J := REAL X ;

This example requires one explicit L_REAL to REAL conversion (written with the REAL operator), and one implicit REAL to L_REAL conversion (at the third (leftmost) assignment). The LONG_REAL to REAL conversion is rounded; truncation could be performed as

    A := I := J := REAL TRUNC(6). X ;

Observe that reversing the order of the unary operators would render the TRUNC operator essentially ineffective.

B.4 Comparison operations.

    OPERATOR type_a a compare_with type_b b RETURNS BIT ;

where 'type_a' and 'type_b' may be chosen from one of the following three sets:

    a. INTEGER, LONG_INTEGER, BYTE, DECIMAL, BIT
    b. REAL, LONG_REAL
    c. CHARACTER

and where the comparison operator 'compare_with' may be chosen from the following set:

    <   >   ==   <=   >=   ¬=   ¬<   ¬>

Again, we provide only those 'implied' conversions which can be realized with a reasonable amount of computational overhead; further comparison operators could be created with the aid of some of the conversion operators described in section B.2.

Comparison operators for the type CHARACTER can have an argument list with keyworded parameters on either side of the operator. These parameters serve to establish a collating sequence to be used in the comparison, and to determine the padding character.

The default collating sequence (on either side of the operator) is implementation dependent, and in the case of CLEOPATRA/360 is the EBCDIC character sequence.
The keyword parameter for the collating sequence and its interpretation is as follows:

    SEQUENCE ' CHARACTER seq

The argument ('a' or 'b') to which this keyword parameter applies is translated according to the contents of 'seq' as follows: a character C in the argument 'a' or 'b' is mapped into a collating sequence number CSN which is the position (from 1 on up) of C in 'seq'. If C should not appear in 'seq', CSN is set to 0. In terms of the operator INDEX (see section B.8.4), CSN is determined as:

    CSN := C INDEX seq ;

Both arguments, 'a' and 'b', are translated according to the defined collating sequence for each, and then the comparison is performed on the collating sequence number strings.

The default padding character (with which the shorter of the two arguments is extended on the right before the comparison is made) in CLEOPATRA/360 is defined to be C._ (a blank character). The keyword parameter for the padding character and its interpretation is as follows:

    PAD ' CHARACTER pad

Assuming that the argument 'a' has a shorter length than 'b', and that PAD was specified on the left hand side of the comparison operator, then 'a' will be extended for the comparison with the first character of 'pad'. If pad .==. CHARACTER NIL, the default padding applies. Padding is temporary and does not raise any exceptional conditions.

The order of evaluation for the keyword arguments is right to left in the order in which they are written. (See section 5.2.2 on the order of parameter evaluation.) The order of application of the keyword arguments is padding before sequence; at most one of the two possible padding operations is applied in each case.

E.B.4 Examples.

    1 == D.1                                               is TRUE
    1.0 == L.1.0                                           is TRUE
    S.1                                                    is TRUE
    C.a .<. C.A                                            is TRUE
    C.a .<=. CHAR NIL                                      is FALSE
    C.a .<= (PAD' C.0). CHAR NIL                           is TRUE
    C.ab .(SEQUENCE' C.ab) == (SEQUENCE' C.12). C.12       is TRUE
    C.a .(SEQUENCE' FIRST) <=. CHAR NIL                    is TRUE

B.5 Binary arithmetic operations.

This section does not discuss binary arithmetic operations between BIT and numerical type arguments; see section B.7 for those. We have essentially three groups of operations; we employ considerations concerning computational overhead and SIZE protected return types as outlined earlier.

E.B.5 Examples.

In this section we will give examples for the definitions of the various binary arithmetic operators.

Addition, denoted by +, subtraction, denoted by -, and multiplication, denoted by *, are defined in the usual sense. We have

    3 == 1 + 2
    4.0 == 5.5 - 1.5
    D.6 == D.-2 * D.-3

We would like to recall the discussion in section 2.2 concerning the recognition of the minus_symbol versus the use of - as a unary or binary operator.

For integer-like values we have three types of division. Rounded division is denoted by /, truncated division is denoted by //, and remainder computation is denoted by MOD. For real-like values we have the usual division, denoted by /.

    3 == 5/2
    2 == 5//2
    1 == 5 MOD 2
    2.5 == 5.0 / 2.0

We have two kinds of exponentiation. The usual exponentiation (which fails with the COMPLEX condition for negative values raised to a non-integral power) is denoted by **. Additionally we have "sign-preserving" exponentiation, denoted by ++:

    (a ++ b) == (SIGN a) * (ABS a) ** b

so that

    8.0 == 4.0 ** 3.0/2.0
    -4.0 ** 3.0/2.0           will raise the COMPLEX condition
    -8.0 == -4.0 ++ 3.0/2.0

We adopt the following rules concerning attempts to raise NIL to a power:

    DECISION
        a_zero: a == NIL ;
        b_zero: b == NIL
    ACTION
        a_zero: RETURN NIL ;
        b_zero: RETURN 1 ;    ! to be typed correctly
        ELSE    RETURN a power b
    END

B.5.1 Complete integer arithmetic.

A complete set of arithmetic operations is provided between all integer-like basic types.
The operators are in the following set:

    +      addition
    -      subtraction
    *      multiplication
    /      rounded division
    //     truncated division
    MOD    remainder computation
    **     exponentiation
    ++     sign-preserving exponentiation (see E.B.5)

This set of arithmetic operators is applied as follows:

    OPERATOR type_a a op type_b b RETURNS return_type ;

where 'return_type' can be obtained from the operator table in section B.1.3. The type of each argument may be chosen from the following set:

    INTEGER, LONG_INTEGER, BYTE, and DECIMAL.

Consult section B.1.2 for a discussion as to what exceptional conditions may result from these operators. Section B.1.1 discusses the remedies that will be employed for each condition.

B.5.2 Complete real arithmetic.

A complete set of arithmetic operations is provided for real numbers as follows. Operators may be chosen from the following set:

    +      addition
    -      subtraction
    *      multiplication
    /      division
    **     exponentiation
    ++     sign-preserving exponentiation

Exponentiation is performed correctly, i.e., negative numbers may be raised to integral powers. The preceding arithmetic operators may be applied as follows:

    OPERATOR type_a a op type_b b RETURNS return_type ;

where 'return_type' can be obtained from the operator table in section B.1.3. The type of each argument may be chosen from the following set:

    REAL, and LONG_REAL

Exceptional conditions are discussed in section B.1.2; remedies are supplied as outlined in section B.1.1.

B.5.3 Fast mixed exponentiation.

Depending on context, raising a real number to an integral power by repeated multiplication may require less computation than a conversion and real exponentiation.
Consequently, the following mixed mode operators are provided:

    OP real_type a op integer_type b RETURNS real_type ;

where 'real_type' is REAL or LONG_REAL, and 'integer_type' is chosen from the set:

    INTEGER, LONG_INTEGER, BYTE, or DECIMAL'n

and where 'op' is either ** (for exponentiation) or ++ (for sign-preserving exponentiation). For exceptional conditions and their remedies see sections B.1.2 and B.1.1. The operator is implemented through repeated multiplication.

B.6 Unary arithmetic operations.

CLEOPATRA is not exactly intended to be a mathematical programming language. We therefore do not include many mathematical functions among the unary arithmetic operations. Mathematical functions as operators applying to REAL type arguments and returning REAL results might be added as demand dictates.

On the basic numerical types, INTEGER, LONG_INTEGER, BYTE, REAL, LONG_REAL, and DECIMAL'n, we provide the following unary operators:

    name    kind of argument   action

    -       scalar             unary minus
    ABS     scalar             absolute value
    CEIL    scalar             smallest integer not preceding the argument
    FLOOR   scalar             largest integer not exceeding the argument
    MAX     array              largest element in array
    MIN     array              smallest element in array
    NORM    array              largest element (in absolute value) in array
    PROD    array              product of all elements
    ROOT    scalar             (square) root of the argument
    ROUND   scalar             rounding to the nearest integer
    SIGN    scalar             sign of the argument [-1, 0, or 1]
    SUM     array              sum of all elements
    TRUNC   scalar             largest integer not exceeding in absolute value the absolute value of the argument

All these operators, unless noted otherwise below, return the type of their arguments. Operators on arrays return a scalar of the type of an array element. The ROOT operator will return REAL (or LONG_REAL if applied to a LONG_REAL argument). Exceptional conditions for all these operators are noted in section B.1.2.
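Returning briefly to the fast mixed exponentiation of section B.5.3, the repeated multiplication and the sign-preserving variant can be sketched in Python. This is a minimal model, not the CLEOPATRA/360 coder: the function names are hypothetical, the NIL rules follow the decision table in section E.B.5 (with NIL modelled as zero), the treatment of a negative exponent as a reciprocal is an assumption of this sketch, and exceptional conditions such as OVERFLOW are not modelled.

```python
def sign(x: float) -> int:
    """SIGN: -1, 0, or 1."""
    return (x > 0) - (x < 0)

def power(a: float, b: int) -> float:
    """a ** b for an integral exponent b, by repeated multiplication."""
    if a == 0:              # a_zero: RETURN NIL (takes precedence over b_zero)
        return 0.0
    if b == 0:              # b_zero: RETURN 1
        return 1.0
    result = 1.0
    for _ in range(abs(b)):
        result *= a
    # Assumption of this sketch: a negative exponent yields the reciprocal.
    return result if b > 0 else 1.0 / result

def sign_power(a: float, b: int) -> float:
    """a ++ b, defined as (SIGN a) * (ABS a) ** b."""
    return sign(a) * power(abs(a), b)
```

With an even exponent the two operators differ: `power(-2.0, 2)` gives 4.0, whereas `sign_power(-2.0, 2)` keeps the sign of the base and gives -4.0.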
The following operators may have one positional INTEGER parameter: CEIL, FLOOR, ROOT, ROUND, and TRUNC. In the case of ROOT, the integer indicates which root is to be taken; the default is 2 (i.e., the square root). The sign is treated correctly for all roots. A negative positional parameter will return appropriate reciprocals of the roots. In the case of the other operators allowed to have a positional INTEGER parameter, this parameter indicates that the operation is to be performed on a decimally shifted argument.

E.B.6 Examples.

We present first a recursive definition of the CEIL, FLOOR, ROUND, and TRUNC operators with a positional INTEGER argument:

    (CEIL(n). x) == (CEIL. x * 10 ** n) * 10 ** - n

Or for numerical examples:

    (CEIL. 12.5) == 13.0
    (CEIL(3). 1.2345) == 1.2350
    (CEIL(-1). 1.23) == 10.0

As a second example, we present the essential part of the definition of the ROOT operator:

    OPERATOR ROOT (n) . x ;
      DECISION
        zero_root: n == 0 ;
        zero_arg:  x == NIL ;
        odd_root:  n MOD 2 ;
        pos_arg:   x >= NIL
      ACTION
        zero_arg:  RETURN NIL ;
        odd_root:  RETURN x ++ 1.0 / REAL n ;
        zero_root: ! raise the ZERODIVIDE condition
                   RETURN LARGE * SIGN x ;
        pos_arg:   RETURN x ** 1.0 / REAL n ;
        ELSE       ! raise the COMPLEX condition
                   RETURN (ABS x) ** 1.0 / REAL n
      END
    END operator

B.7 BIT operations.

CLEOPATRA/360 provides the most useful arithmetic operations involving mixed modes between BIT type arguments and numerical type arguments.
The arithmetic operators are

    +      addition
    -      subtraction
    *      multiplication

All these operators are defined between the type BIT and a numerical 'type' from the set INTEGER, LONG_INTEGER, BYTE, REAL, LONG_REAL, and DECIMAL'n, and are defined as

    OPERATOR BIT a op type b RETURNS type ;
    OPERATOR type a op BIT b RETURNS type ;

Furthermore, 6 of the possible 16 logical operations involving two arguments are provided in the following operators (see below for the special usage of the alternate English names):

    &      AND     logical and
    |      OR      logical or
    ||             logical exclusive or
    ¬&     NAND    negation of the logical and
    ¬|     NOR     negation of the logical or
    ¬||            negation of the logical exclusive or
                   (i.e., logical equivalence)

We also provide the unary logical operator

    ¬      logical complement (not).

These logical operators are defined between any two basic types. Non-BIT basic types are preprocessed before a logical operation is performed: an instance of NIL of the basic type is compared to the presented argument for the logical operator. If the two are equal, S.0 is used as argument for the logical operator (i.e., NIL corresponds to FALSE), and S.1 is used otherwise. Like all operators, logical operators are processed right to left, i.e., their arguments are evaluated, right argument first, and then the operator is applied. The result of a logical operation is returned as BIT.

In order to write more efficient conditionals, or in order to avoid undesired evaluations, a special option is provided by the CLEOPATRA/360 coder: if the alternate English names are used for the logical operators on basic types, processing (still right to left) is preempted as soon as it becomes apparent what the result of a logical operation will be. For example,

    (a OR b) AND c

will preprocess c. Should c correspond to S.0, no further processing is done, and S.0 is returned for the entire AND phrase. Should c correspond to S.1, b will be preprocessed.
If it corresponds to S.1, further processing is abandoned, and S.1 is returned for the OR (and thus for the AND) phrase. This option is not available for the exclusive or operators; those require evaluation of both arguments in any case.

B.8 CHARACTER operations.

Previous sections have dealt with conversions between the basic types and the type CHARACTER (B.2), with comparison operations (B.4), and with BIT related operations (B.7). In the present section we will discuss operations which deal with CHARACTER values for their 'verbal' content. CLEOPATRA/360 as defined here intends to provide a basic set of manipulations; it is hoped that a SNOBOL-like package of operations (hopefully involving BLOCKS [Gimpel CACM June 1972, Vol. 15, No. 6, P. 438-447.]) will be created for future applications.

B.8.1 Code conversions and assignment.

Comparison of CHARACTER values lets a user provide his own collating sequence (see section B.4). Similarly, in the present section we discuss operators which effectively allow the changing of the encoding of a character string. For convenience, CLEOPATRA/360 provides this facility in two forms, as an assignment and as a conversion:

    OPERATOR CHAR a : (pos) = (SEQUENCE' seq, SUBSTITUTE' sub). CHAR b
             RETURNS CHAR ;
    CONVERSION TO CHAR FROM (SEQUENCE' seq, SUBSTITUTE' sub) . CHAR b ;

As a default in CLEOPATRA/360, 'seq' and 'sub' both specify the value NIL. The returned value for the CONVERSION has current and maximum anticipated length equal to the current length of 'b', and is obtained from 'b' as follows: For each character C in 'b', a collating sequence number CSN is obtained which indicates where C is found in 'seq' (which must be of type CHARACTER). Should C not occur in 'seq', CSN is set to NIL. If CSN is NIL, or if it exceeds the length of 'sub' (which must be of type CHARACTER), the character C in 'b' is returned unchanged.
If CSN differs from NIL, the CSN'th character in 'sub' (counting from 1 on the left on up) is returned in place of C.

For assignment, the returned value has maximal and current length equal to the current length of 'b', and contains the original value of 'b' unchanged. (The latter condition is imposed to encourage the use of but one assignment operator at the top level of an arithmetic statement.) If the INTEGER argument 'pos' is omitted, or equal to NIL, the result of the conversion (if any) becomes the new value of 'a'. The maximum anticipated length of 'a' does not change, of course. A SIZE exception results if the current length of 'b' exceeds the maximum anticipated length of 'a'.

If 'pos' is specified, the converted value is placed into 'a' beginning at the 'pos'th character, counting from 1 on the left on up. 'pos' must point into the current length of 'a' if no STRING_RANGE condition is to be raised. The resulting current length of 'a' is equal to the original length of 'a', LENGTH a, if pos + LENGTH b does not exceed LENGTH a, and equal to pos - 1 + LENGTH b otherwise. A SIZE exception results if the current length of 'a' then exceeds its maximum anticipated length.

If a STRING_RANGE condition is raised, 'a' remains unchanged. If a SIZE condition is raised, 'a' is changed and right-truncated. If no optional parameters are supplied, the assignment operator above performs simple right to left assignment.

Let us compare code conversions at assignment and at comparison time:

    a := (SEQUENCE' seq, SUBSTITUTE' sub) . b ;
    a . (SEQUENCE' sub) == (SEQUENCE' seq) . b

In spite of a compatible use of the parameters 'seq' and 'sub', the comparison need not produce a TRUE result for the following reasons:

1) if, in a comparison, two distinct characters receive collating sequence numbers of NIL (0) (since they do not appear in the sequencing strings) they are judged to be equal.
In an assignment, those characters remain unchanged, hence are different from each other once the assignment is completed.

2) the substitution string 'sub' may contain duplicate characters, causing the assignment-substitution of equal characters in the result for unequal characters in the initial value. In a comparison sequencing string, only the first occurrence of a duplicate character produces a collating sequence number.

We conclude from these observations that in the preceding code sequence the comparison would be TRUE, provided 'sub' contains no duplicate characters, and all characters in 'b' also appear in 'seq'.

CLEOPATRA also provides an exchange assignment with double code conversion feature and replacement as follows:

    OPERATOR CHAR a : (pos_a, SEQUENCE' seq_a, SUBSTITUTE' sub_a)
             <=> (SEQUENCE' seq_b, SUBSTITUTE' sub_b, pos_b) : CHAR b
             RETURNS CHAR ;

The values of 'a' and 'b' are converted (if necessary), using (e.g., for 'a') 'sub_a' and 'seq_a', and are then assigned to 'b' and 'a' respectively. A SIZE condition may result if one of the resulting values exceeds its receiving maximal length. A STRING_RANGE condition may result if the positions for replacement violate the respective current lengths. The returned value is the original value of 'b'.

E.B.8.1 Examples.

    Assume  a .==. C.abcdef
    Issue   a :(3)=. C.xyz
    Obtain  a .==. C.abxyzf
    Issue   a := (SEQUENCE' C.1234 , SUBSTITUTE' C.abac ). C.xy1234z
    Obtain  a .==. C.xyabacz

B.8.2 Concatenation and splitting.

Two CHARACTER values may be combined into one as follows:

    OPERATOR CHAR a " CHAR b RETURNS CHAR ;

The resulting CHARACTER value consists of the value of 'a' followed (on the right) by the value of 'b'. Current and maximal length are equal to the sum of the current lengths of 'a' and 'b'. A SIZE exception may result if the maximally allowable length for a CHARACTER value is exceeded; the result is then truncated on the right.
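The code conversion and positional assignment rules of section B.8.1 can be checked against the examples of E.B.8.1 with a small model. The following Python sketch is an illustration only, not CLEOPATRA code: the names `convert` and `assign_at` are hypothetical, the empty string stands in for NIL, taking the first occurrence of a character in 'seq' is an assumption, and maximum anticipated lengths (and hence the SIZE condition) are not modelled.

```python
def convert(b: str, seq: str = "", sub: str = "") -> str:
    """Model of the code conversion; the empty string stands in for NIL."""
    out = []
    for c in b:
        # Collating sequence number CSN: 1-based position of c in seq,
        # with 0 playing the role of NIL when c does not occur in seq.
        csn = seq.find(c) + 1
        if csn == 0 or csn > len(sub):
            out.append(c)             # returned unchanged
        else:
            out.append(sub[csn - 1])  # CSN'th character of sub
    return "".join(out)


def assign_at(a: str, pos: int, b: str) -> str:
    """Model of placing b into a beginning at the pos'th character."""
    if not 1 <= pos <= len(a):
        # Stand-in for the STRING_RANGE condition.
        raise IndexError("STRING_RANGE")
    # Resulting length is LENGTH a, or pos - 1 + LENGTH b if that is larger.
    return a[:pos - 1] + b + a[pos - 1 + len(b):]
```

With these definitions, `convert("xy1234z", "1234", "abac")` and `assign_at("abcdef", 3, "xyz")` correspond to the two assignments issued in E.B.8.1.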
CLEOPATRA/360 does not provide a direct way to convert a CHARACTER value into an array of CHARACTER values consisting of the single character constituents. Instead, we allow the equivalent of a conversion of a CHARACTER value into an array of BYTE values, and vice versa. This facility also provides for a code conversion. It was felt that such a conversion facility would prove to be more applicable than a CHARACTER representation change.

    OP SEQUENCE (seq) . CHAR a RETURNS BYTE ALIGNED 1 EXT ;
    OP SUBSTITUTE (sub) . BYTE ALIGNED_COMPACT 1 EXT b RETURNS CHAR ;

The CHARACTER arguments 'seq' and 'sub' take the place of the arguments of like name in section B.8.1; as a default they both specify NIL, thus providing access to the internal character encoding of the hardware.

E.B.8.2 Examples.

    C.abcde .==. C.ab " C.c " C. " C.de
    Q.2 == (SEQUENCE (C.abc ). C.ba )(1)
    Q.1 == (SEQUENCE (C.abc ). C.ba )(2)

Assuming that 'byte' is an array containing the values Q.1 through Q.5 in order, we obtain

    C.aabca .==. SUBSTITUTE (C.aabca ). byte

B.8.3 Selection.

CLEOPATRA/360 provides two binary operators which allow the retrieval of smaller parts of a CHARACTER value. These operators can then be combined to achieve a PL/I SUBSTR-like facility.

    OPERATOR INTEGER a -> CHARACTER b RETURNS CHARACTER ;

The returned value consists of characters 1 through 'a' of 'b'. If 'a' exceeds LENGTH b, the entire string is returned. If 'a' is nonpositive, NIL is returned. In the latter two cases, a STRING_RANGE condition is raised.

    OPERATOR CHARACTER a <- INTEGER b RETURNS CHARACTER ;

The returned value consists of characters 'b' through LENGTH a. If 'b' is nonpositive, the entire string is returned. If 'b' exceeds LENGTH a, NIL is returned. In the latter two cases, a STRING_RANGE condition is raised.

E.B.8.3 Examples.

As an example for the selection of substrings, consider:

    C.23456 .==. 5 -> C.1234567 <- 2
    C.2345 .==. (5 -> C.1234567 ) <- 2

In general

    a -> string <- b

selects a substring of length 'a' starting at character 'b' in 'string', and

    (a -> string) <- b

selects a substring starting at 'b' and extending to character 'a' in 'string'.

B.8.4 Scanning.

CLEOPATRA provides the means for detecting the presence or absence of certain characters in a CHARACTER value:

    OPERATOR CHARACTER a "-> CHARACTER b RETURNS INTEGER ;
    OPERATOR CHARACTER a <-" CHARACTER b RETURNS INTEGER ;

The first of these two operators has INDEX as an ALIAS name. It returns NIL or the position (from 1 on up) where the string 'a' first occurs in 'b'. The second operator returns 1 + LENGTH a or the position (from LENGTH a on down) where 'b' last occurs in 'a'. The string to be quoted is next to the double quote symbol ("); the arrow reminds the user of the direction of the search. If the string to be quoted is NIL, NIL or 1 + LENGTH a will be returned, respectively.

CHARACTER values can be checked whether their constituents belong to a certain class:

    OPERATOR CHARACTER a ?-> CHARACTER b RETURNS INTEGER ;
    OPERATOR CHARACTER a <-? CHARACTER b RETURNS INTEGER ;

The first of these operators has VERIFY as an ALIAS name. It returns 1 + LENGTH b or the position (from 1 on up) of the first character in 'b' which does not occur in 'a'. The second operator returns NIL or the last position (from LENGTH a on down) in which a character not in 'b' occurs in 'a'.

    OPERATOR CHARACTER a ¬-> CHARACTER b RETURNS INTEGER ;
    OPERATOR CHARACTER a <-¬ CHARACTER b RETURNS INTEGER ;

These operators function exactly like their ?- counterparts, except that the classification strings contain all characters which may not occur in the strings to be tested.

E.B.8.4 Examples.

    3 == C.ab "-> C.a-ab...ab
    8 == C.a-ab...ab <-" C.ab
    0 == C. "-> C.abcde
    6 == C.abcde <-" C.
    2 == C.ab ?-> C.a-b
    2 == C.ab <-? C.
    1 == C.ab ¬-> C.a-b
    0 == C.ab <-¬ C.

B.8.5 Unary operators.

For ease in manipulating CHARACTER values, CLEOPATRA/360 provides the following four unary operators:

    OPERATOR LENGTH CHARACTER a RETURNS INTEGER ;

This operator returns the number of characters which presently are contained in 'a', i.e., the current length of 'a'.

    OPERATOR MAX CHARACTER a RETURNS INTEGER ;

This operator returns the maximum number of characters which the string 'a' may ever contain, i.e., the maximum anticipated length of 'a'.

    OPERATOR REVERSE CHARACTER a RETURNS CHARACTER ;

This operator returns the characters of 'a' in reversed order. For ease in constructing a CHARACTER value, we provide the operator

    OPERATOR REPEAT (n) . CHARACTER a RETURNS CHARACTER ;

This operator returns a string of 'n' juxtaposed copies of 'a'. 'n' must be specified as an INTEGER and has a default of INTEGER NIL. If the resulting literal value exceeds the maximally allowable length for the type CHARACTER (see section 4.1), the SIZE condition is raised at execution time, and the result is truncated on the right.

E.B.8.5 Examples.

    5 == MAX CHAR(5) NIL
    3 == LENGTH C.abc
    0 == LENGTH CHAR NIL
    C.abcde .==. REVERSE C.edcba
    C. .==. REVERSE C.
    CHAR(4) LARGE .==. REPEAT(4). LAST
    C. .==. REPEAT. C.a

B.9 POINTER operations.

References based on a pointer are described in section 5.3. We allow the following operations for pointer manipulation:

    OPERATOR PTR (ref_type) a := PTR (ref_type) b
             RETURNS PTR (ref_type) ;
    OPERATOR PTR (ref_type) a :<=>: PTR (ref_type) b
             RETURNS PTR (ref_type) ;

The first operator above has INIT as an alias name; it is thus possible to initialize pointers. The second operator above performs an exchange of its arguments. The returned pointer value is equal to the original value of 'b' in both cases.
Note that it is intentional that the identifier • ref_type' appears three times in each operator heading - POINTERS can never change the type of the objects to which they may point. OP PTR (ref_type) a op PTR (ref_type) b RETURNS BIT ; where •op 1 is a comparison operator from the following set: These comparison operators detect whether both pointers are pointing to the same object, or whether a pointer is pointing to no object (i.e., is equal to POINTER NIL). Both pointers must point to objects of the same ref_type. - 154 - Appendix C: Summary of productions, C. Summary of productions. He will use a slightly extended form of BNF to specify the syntax of CLEOPATRA. The following symbols will be used: ::= separates left and right hand sides of a production. lower_case designates a non-terminal symbol; an attempt was made to make the names meaningful. UPPERCASE and special characters (except as noted below) designate terminal symbols; we assume a concrete representation where these terminal symbols represent themselves. | indicates a choice. { } are used to combine a number of choices. [ ] the enclosed entity may appear or may be omitted. [ ]• the enclosed antity may appear or more times. ( }• the enclosed entity must appear one or more times. The brackets [ ] are also considered a part of the CLEOPATRA character set; they do not appear in productions, however, since they are considered to be pairwise eguivalent to parentheses ( ) . The choice symbol | is also considered a part of the CLEOPATRA character set. If it is to denote itself as a terminal symbol, it will be underlined. - 155 - Appendix C: Summary of productions. (1.1) (1-3) letter ::=A|B|C|D|E |K|L|HJN|0|.P |?|H'|X|T|Z|a I 9 I ft i i I j I k I 1 ■|r|s|t|u|T|« F J G | H | I | J Q I R | S | T | b | c | d J e | f m | n | o | p | q x | y | z delimiting_character ::= ( | ) | . 1 | blank_character 1 ! I special_character ::=a>|#|$|%|6|*|-| ♦1 = 1" l?l/lllll*l (1. .*) (1. .5) (1. .6) (1. 
.7) digit ::= 0|1|2|3|4|5|6|7|8|9 control_character ::= backspace | ! | end_of_source_recor3 comment ::= COMMENT any sequence of characters with the exception of a semicolon ; ::= ! any sequence of characters with the exception of an end of the source record end of source record (2. 1) (2. 2) (2. 3) (2. .*) (2. 5) (2. ,6) (2. -7) (2. 8) (2. 9) identifier ::= letter [ letter | digit ]• operator ::= { special_character }• J identifier constant :;= integer | long_integer J byte | real J long_real | decimal | bit | literal_value decimal_digit : := digit hexadecimal_digit ::= digit | A | B | C J D | E | octal_digit : binary_digit minus_symbol = 0|1!2|314|5J6|7 := | 1 decimal_string ::= [ minus_symbol ] { decimal_digit } • - 156 - Appendix C: Summary of productions. (2.10) hexadeciaal_string ::= X, [ minus_symbol ] { hexadecimal_digit }• (2.11) octai_string ::= 0. [ minus_symbol ] { octal_digit } • (2.12) binary_string ::= B. [ ainus_symbol ] { binary_digit } • (2.13) basic_constant ::= decimal_string | hexadecimal_string | octal_string | binary_string (2.14) integer ::= basic_constant (2.15) long_integer ::= F. integer (2.16) byte ::= Q. integer (2.17) bit ::= S. integer (2.18) decimal_real_string ::= decimal_string . { decimal_digit }• (2.19) hexadecimal_real_string ::= hexadecimal_string . ( hexadecimal_digit } • (2.20) octal_real_string ::= octal_string . { octal_digit } • (2.21) binary_real_string ::= binary_string . { binary_digit } • (2.22) basic_real ::= deciaal_real_string | hexadeciaal_real_string \ octal_real_string | binary_real_string (2.23) exponential_part ::= P basic_constant (2.24) real ::= basic_real [ exponential_part ] (2.25) iz~ basic_constant exponentiai_part (2.26) long_real ::= L. real (2.27) decimal ::= D. basic_constant - 157 - Appendix C: Summary of productions. (2.28) literal_value ::= C. 
any sequence of characters up to and not including the first following blank_character (2.29) system_supplied_value ::= type ( LARGE | NIL | SHALL } (2.30) system supplied constant ::= £ | FALSE | FIRST | LAST | PI~| TRUE (3.1) configuration ::= type_pack | algorithm (3.2) type_pack : := global_structure_block [ structure_block ] [ data_block ] type_data_block (3.3) algorithm ::= [ structure_block ] [ global_data_block ] [ data_block ] routine_block (3.4) compilation ::= [ environment_request ] { structure_block | global_structure_block | data_blDck | global_data_block | type_data_block | routine_block } • (3.5) environment_request ::= COMPILE INTO conf iguration_reference (3.6) conf iguration_ref erence ::= conf iguration_name [ . conf iguration_name ]• (4.1) basic_ref_typa ::= INTEGER | LONG_INTEGER | BYTE I REAL | L0NG_REAL \ DECIMAL~[ • size ] | BIT | CHARACTER | POINTER ( ref_type ) (4.2) basic_type ::= INTEGER | LONG_INTEGER | BYTE | REAL | LONG_REAL | DECIMAL [ • size ] | BIT J CHARACTER [ ( expression ) ] J POINTER ( ref_type ) (4.3) size ::= integer - 158 - Appendix C: Summary of productions. (4.4) ref_type ::= ( basic_ref_type | type_name | { ALIGNED | COMPACT"} ( ref type ( , ref type }• ) } [ { ALIGNED | "COMPACT } integer EXTENTS ] [ ALIAS identifier ] (4.5) type ::= { basic_type | type_name [ parameters ] | { ALIGNED | COMPACT }"( type { , type }• ) } [ array ] [ ALIAS identifier ] (5.1) structure_block ::= STRUCTURE conf iguration_name { ~; link_item }• [ ; ] END conf iguration_name [ ; ] (5.2) configurat ion_name ::= procedure_name | type_name | operator_link (5.3) link_item ::= TYPE type_name [ ALIAS identifier ] [ ref_ > type_list ] \ global_link_item (5.4) global_link_item ::= PROCEDURE procedure_name [ ALIAS identifier ] [ ref _type_list . 
] RETURNS ref_type (5.5) ::= operator_linlc : OPERATOR [ lef t_ref _types ] operator [ ALIAS identifier ] right_ref_types RETURNS ref_type (5.6) ::= operator_link : CONVERSION TO ref_type FROM right_ref_types (5.7) global_link_item ::= PROCEDURE procedure name [ ALIAS identifier ] BUILT IN (5.8) ::= OPERATOR [ ref_type [ BY ADDRESS : | . ] ] operator [ ALIAS identifier ] { [ . ] ref_type | : ref_type BY ADDRESS } BUILT IN (5.9) ::= CONVERSION TO ref_type FROM { [ . ] ref_type 1 : ref_type BY ADDRESS } BUILT IN (5.10) ref_type list ::= ( type formal [ , type formal - 159 - Appendix C: Summary of productions. (5.1 1) type_formal ::= [ identifier • ] ref_type [ BY ADDRESS ] (5.12) . left_ref_types ::= ref_type £ BY ADDRESS : [ ref _type_list ] \ • ref_type_list ] (5.13) right_ref_types ::= ref_type f : ref_type BY ADDRESS | ref_type_Iist { . ref_type | : ref_type BY ADDRESS } (5.14) global_sti:ucture_block ::= GLOBAL STRUCTURE type_nane ( ; global_link_item } • [ ; ] END type_name [ ; ] (5.15) type_narae ::= identifier (5.16) routine_block ::= PROCEDURE procedure_name [ name_list . ] [ ; statement }• [ ; ] END procedure__name [ ; ] procedure_name ::= identifier name_list ::= ( formal [ , formal ]• ) formal ::= identifier [ • ] procedure call ::= procedure_name [ parameters . ] parameters ::= [ ( actual [ , actual ]• ) ] actual ::= [ [ identifier • ] expression ] routine_block ::= operator_link : OPERATOR [ left_names ] operator right_names { ; statement } • [ ; ] END operator_link [ ; ] operator_link : := identifier left_names ::= ref_type identifier [ { : | . } name_list | : ] right_names ::= [ : ] ref_type identifier | name_list ( : I . } ref_type identifier opera to recall ::= [ expression [ {•!•}[ parameters ] ] ] operator [ [ parameters ] { : I . } ] expression - 160 - <5. ,17) (5. .18) (5. ,19) (5. ,2 0) (5. ,21) (5. ,22) (5. ,23) (5- ,24) (5. ,25) (5, .26) (5. ,27) Appendix C: Summary of productions. 
(5-28) routine_block ::= operator_link : CONVERSION TO ref_type FROM right_names ( ; statement } • [ ; J END operator_link [ ; ] (5-29) data_bloc)c ::= DATA conf iguration_name ( ; [ CONSTANT | DEFER ] data_group }• [ ; ] END conf iguration_name [ ; ] (5.30) global_data_block ::= GLOBAL DATA conf iguration_name { ; [ CONSTANT | DEFER ] data group }• [ ; ] END configuration name c ; 3 (5.31) type_data_block ::= GLOBAL DATA type_name [ name_list] ( ; [ CONSTANT | DEFER | SHARE [ CONSTANT ] ] data_group }• [ ; ] END type_name [ ; J (5.32) data_group ::= { basic_type J type_name [ parameters ] } [ array ] item £ , item ]• (5.33) data_group ::= { ALIGNED | COMPACT } ( data_group [ , data_group ]• ) [ [ array ] item [ , item ]• ] (5.34) array ::= [ ALIGNED J COMPACT ] { LEFT | RIGHT | VECTOR } ( bound [ , bound ]• ) (5.35) bound ::= [ { * I expression } : ] { * 1 expression } (5.36) item ::= identifier [ INIT expression ] (6.1) expression ::= constant | system_supplied_value | system_supplied_constant | reference | procedure_call | operator_call | COPY expression | expression (6.2) array_reference ::= [ shape_list ] array_expression [ extract_list ] | [ shape_iist ] ( array_ref erence ) [ extract_list ] (6.3) array_expression ::= identifier | array_temporary I procedure_call | operator_call - 161 - Appendix C: Summary of productions. 
(6.4) extract_list ::= ( extract [ , extract ]• ) (6.5) extract ::= expression I { * | expression } : ( * | expression } (6.6) shape_list ::= ( shape [ , shape ]• ) (6.7) shape ::= [ { * I expression } : ] { * 1 expression } (6.8) array_temporary ::= ARRAY ( new_shape [ , new_shape ]• ) ( section [ , section ]• ) (6.9) new_shape ::= [ expression : ] expression (6.10) section ::= extract_list expression (7.1) stmt ' (7.2) stmt • • (7.3) stmt • (7.4) stmt (7.5) (7.6) (7.7) (7.8) (7.9) (7.10) stmt stmt lstmt label = expression | NIL = RETURN expression = EXIT [ label ] := IF expression THEN statement [ [ ; ] ELSE statement ] ::= ALLOCATE identifier [ parameters ] FOR pointer [ INIT expression ] = RELEASE expression := cstmt | label : cstmt label := identifier cstmt ::= BEGIN statement [ ; statement ]• [ ; ] END cstmt ::= [ fDr_phrase ] ITERATE [ while_phrase ] statement [ ; statement ]• [ ; ] [ when_phrase ] END - 162 - Appendix C: Summary of productions. (7.11) for_phrase ::= FOR identifier [ FROM expression ] [ ; ] [ { DOWNTO | UPTO j expression ] [ ; ] [ STEP expression ] [ ; ] (7.12) while_phrase ::= UHILE expression ; (7.13) when_phrase ::= HHEN expression [ ; ] (7. 14) cstmt ::= DECISION decision [ ; decision )• [ ; ] ACTION action [ ; action ]• [ ; ] [ ELSE statement [ ; ] ] END (7.15) decision ::= switch : expression (7.16) ::= switch { , switch }• [ list_layout ] : list_index (7.17) switch ::= identifier (7.18) list_layout ::= { LEFT | RIGHT 1 VECTOR } ( [ integer : ] integer [ # [ integer : ] integer ]• ) (7.19) list_index ::= expression [ , expression ]• (7.20) action ::= switch_expression : statement (7.21) switch_expression ::= [ -» ] switch [ switch_operator [ -• J switch ]• (7.22) switch_operator ::■ S | J. 1 JJ. 
I -»S I -»1 I ~*H I ==|-=| AND J OR | NAND J NOR (7.23) statement ::= stmt | lstmt (8.1) configuration : := interrupt (8.2) conf iguration_narae ::= interrupt^ link (8.3) interrupt ::= [ structure_block ] [ global_data_bloc:k J [ data_block ] interrupt_block (8.4) global_link_item ::= CONDITION condition_naae [ ALIAS identifier ] [ ir_ref_types ] - 163 - Appendix C: Summary of productions. (8.5) (8.6) (8.7) (8.8) (8.9) (8.10) (8.11) (8.12) (8.13) (8.14) ::= interrupt_link : INTERRUPT interrupt_name [ ALIAS identifier ] [ ir_ref_types ] condition_name interrupt__link ir_ref _types := identifier := identifier = [ ref_type [ BY ADDRESS ] ] [ ref_type_list ] stmt ::= ON condition_name ( interrupt_name | NIL | BUILT IN } [ ref_type | ALL ] stmt ::= SIGNAL conditiDn^nane [ expression ] [ paraneters ] interrupt_block ::= interrupt_link : INTERRUPT interrupt_name [ ref_type identifier ] [ name_list ] { ; statement } • [ ; ] END interrupt_link [ ; ] interrupt_name ::= identifier stmt ::= RESUME stmt ::= ASSERT expression (9.1) (9-2) stmt : (9.3) stmt : (9- 1 *) stmt : (9.5) cstmt (9.6) system_supplied_value ::= type AT ( expression , expression ) = SPM expression = SSM expression = RESUME expression ::= i_o_control CASE switch, switch, switch, switch OF [ switch : statement ; } • END i_o_control ::= START device [ KEY key ] STATUS status PROGRAM channel_program - 16U - Appendix D: Coding examples. D. Coding examples. STRUCTURB appendix__D ; COMMENT In this appendix we present a few more or less complete examples of CLEOPATRA/360 code. ie employ the operators for the basic types as described in appendix B ; D_1: OPERATOR INTEGER search_insert(INT BY ADR). INTEGER 1 EXT RETURNS INTEGER ; D_2: OPERATOR bubble_sort: INTEGER 1 EXTENT BY ADDRESS RETURNS INTEGER 1 EXTENT ; D_3: OPERATOR sort: INTEGER 1 EXTENT BY ADDRESS RETURNS INTEGER 1 EXTENT ; D_U: OPERATOR INTEGER binary_search INTEGER 1 EXTENT RETURNS INTEGER END appendix_D - 165 - Appendix D: Coding examples. 
DATA D_1 ; INTEGER itea ; ! the search argument INTEGER VECTOR {*) table ; ! the table to be searched ! we assume it cannot overflow INTEGER end ; ! used area is table (1 ;end) ! end may be zero, it is extended if necessary INTEGER X ! the return value: a subscript END D 1 D_1: OPERATOR INTEGER item search_insert (end) . INTEGER 1 EXT table ; COMMENT This is a problem proposed by Knuth and Floyd. It intends to display a two-exit loop. The code is therefore written as if it were in-line in some larger context. table_search: BEGIN FOR x FROM 1 ! the new lower bound of table UPTO end I the limit of the used area ITERATE IF table (x) == item THEN EXIT table_search END table (end:=x) := item ! insert and extend the active area END table_search RETURN x END D 1 - 166 - Appendix D: Coding examples. DATA D_2 ; INTEGER VECTOR (*) table ; ! area to be sorted, 2 or more entries assumed INTEGER end, ! index of the outer loop x, ! index of the inner loop flag ! if not NIL: no switch was aade END D 2 D_2: OPERATOR bubble_sort: INTEGER 1 EXTENT table ; COMMENT This is a problem proposed by Sulf in the Project Rosetta Stone for comparisons of implementation languages ; FOR end FROM 1 UNBOUND table DOMNTO 1~ ! end delimits the scope of the sort ITERATE x := flag := 1 ! no switch thus far ITERATE ! pass over sort scope and switch any ! two entries which are not in order IF table(x := x+1) < table(x) THEN BEGIN table(x) :<=>: table(x - 1) ; flag := END HHEN x == end END WHEN flag END RETURN table END D 2 - 167 - Appendix D: Coding examples. DATA D 3 _-* » INTEGER VECTOR (*) table ; ! area to be sorted, say have one entry INTEGER end INIT 1 UNBOUND table, x INIT 1 END D 3 D_3: OPERATOR sort: INTEGER 1 EXTENT table ; COMMENT This is a version of a bubble-sort proposed by Wooliey as an exanple of the use of "LINUS: A structured language for instructional use" (SIGCSE Bulletin, Feb. 1974, P. 
127) ; COMMENT This solution takes in the order of n**3 comparisons, while example D_2 takes in the order of n**2, in the worst case ; ITERATE WHILE x < end ; IF table(x := x*1) < table(x) THEN BEGIN table (x) :<=>: table (x - 1) ; x := 1 END END RETURN table END D 3 - 168 - Appendix D: Coding examples. DATA D_4 ; INTEGER item ; ! the search argument INTEGER VECTOR (*:*) table ; ! the area to be searched ! we assume it to be ordered INTEGER a INIT 1 L_BOUND table, b INIT 1 UNBOUND table, ! throughout the search item must be in J table between a and b. x ! used as a pointer into tab^e END D_4 D 4: OPERATOR INTEGER item binary_search INTEGER 1 EXTENT table ; COMMENT This is a problem proposed by Gries in "What should we teach in an introductory programming course" (SIGCSE Bulletin, Feb. 1974, P. 85) ; ITERATE WHILE a <= b ; x := FLOOR, (a ♦ b)/2 DECISION less, egual, greater VECT0R(-1;1) : SIGN item - table (x) ACTION less: b := x - 1 ; equal: RETURN x ; ! item == table (x) greater: a := x ♦ 1 END END ! if we reach this point, item is not in table END D 4 - 169 - Appendix E: The beginnings of a type_pack. E. The beginnings of a type_pack. GLOBAL DATA complex ; SHARE HEAL R, ! the real part I I the imaginary part END complex ; GLOBAL STRUCTURE complex ; COMMENT This example is provided to illustrate the concept of a type_pack. We would like to attach a mathematical disclaimer: "Use at your own risk". The type_pack certainly is very incomplete; add: OPERATOR complex ♦ complex RETURNS complex ; mul: OPERATOR complex * complex RETURNS complex ; pow : OPERATOR complex ** REAL RETURNS complex ; rts: OPERATOR ROOT (INTEGER) . complex RETURNS complex ; abs: OPERATOR ABS complex RETURNS REAL ; cnj: 0? 
CNJ ALIAS CONJUGATE complex RETURNS complex ;
    con: OP complex BY ADR := REAL RETURNS REAL
    END complex ;

    STRUCTURE complex ;
    COMMENT We need the conversion to and from angle-measure
            for easy exponentiation ;
    ang: CONV TO REAL 1 EXT ALIAS a_complex FROM complex ;
    car: CONV TO complex FROM a_complex
    END complex

    DATA ang ;
      complex x
    END ang

    ang: CONVERSION TO a_complex FROM complex x ;
      RETURN ARRAY(2) (
        (1) inverse_tangent I.x / R.x,
            ! a suitable function is assumed
            ! the angle in the complex plane
        (2) ROOT. (R.x ** 2) + I.x ** 2)
            ! the radius vector in the complex plane
    END ang

    DATA car ;
      REAL (2) polar ;
      complex x
    END car

    car: CONVERSION TO complex FROM a_complex polar ;
      R.x := polar (2) * cosine polar (1) ;
      I.x := polar (2) * sine polar (1) ;
      RETURN x
    END car

    DATA add ;
      complex x, y, z
    END add

    add: OPERATOR complex x + complex y ;
      R.z := R.x + R.y ;
      I.z := I.x + I.y ;
      RETURN z
    END add

    DATA mul ;
      complex x, y, z
    END mul

    mul: OPERATOR complex x * complex y ;
      R.z := (R.x * R.y) - I.x * I.y ;
      I.z := (R.x * I.y) + I.x * R.y ;
      RETURN z
    END mul

    mul: OPERATOR complex x * complex y ;
      COMMENT This is certainly less efficient, we merely offer
              this second solution as another example ;
      RETURN complex ARRAY (2) (
        (1) ((a_complex x) (1)) + (a_complex y) (1),
        (2) ((a_complex x) (2)) * (a_complex y) (2) )
    END mul

    DATA pow ;
      complex x ;
      REAL y
    END pow

    pow: OPERATOR complex x ** REAL y ;
      RETURN complex ARRAY (2) (
        (1) y * (a_complex x) (1),
        (2) (a_complex x) (2) ** y)
    END pow

    DATA rts ;
      INTEGER n INIT 2 ; ! as a default, compute square roots
      complex x
    END rts

    rts: OPERATOR ROOT (n) . complex x ;
      RETURN x ** 1.0 / REAL n
    END rts
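The polar-form route taken by the 'ang', 'car', 'pow', and 'rts' routines can be modelled numerically. The following Python sketch is an illustration only and carries the same mathematical disclaimer as the type_pack itself: the function names are hypothetical, and `math.atan2` is an assumed stand-in for the report's 'inverse_tangent I.x / R.x', from which it differs by resolving the quadrant and accepting a zero real part.

```python
import math


def to_polar(re: float, im: float) -> tuple[float, float]:
    """Model of 'ang': return (angle, radius) in the complex plane."""
    # Assumption: atan2 replaces inverse_tangent(im / re).
    return math.atan2(im, re), math.hypot(re, im)


def from_polar(angle: float, radius: float) -> tuple[float, float]:
    """Model of 'car': return (real part, imaginary part)."""
    return radius * math.cos(angle), radius * math.sin(angle)


def power(re: float, im: float, y: float) -> tuple[float, float]:
    """Model of 'pow': multiply the angle by y, raise the radius to y."""
    angle, radius = to_polar(re, im)
    return from_polar(y * angle, radius ** y)


def root(re: float, im: float, n: int = 2) -> tuple[float, float]:
    """Model of 'rts': the n'th root as the power 1.0 / n (default 2)."""
    return power(re, im, 1.0 / n)
```

As in the CLEOPATRA version, only one of the n possible roots is produced, namely the one obtained from the angle that the conversion happens to return.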
DATA abs ;
    complex x
END abs

abs: OPERATOR ABS complex x ;
    RETURN ABS (a_complex x) (2)
END abs

DATA cnj ;
    complex x, y
END cnj

cnj: OPERATOR CONJUGATE complex x ;
    COMMENT This example also indicates essentially, how a 'complex := complex' assignment operator would be implemented ;
    R.y := R.x ;
    I.y := - I.x ;
    RETURN y
END cnj

DATA con ;
    complex x ;
    REAL y
END con

con: OPERATOR complex x := REAL y ;
    R.x := y ;
    I.x := REAL NIL ;
    RETURN y
END con

F. Conditions.

We discussed the structure of a possible interrupt mechanism in section 8. Section 9 further proposed certain privileged operations available for critical segments of the code of an operating system for the IBM System/360 family of computers. Built-in basic data types were discussed in section 4.1, and in appendix B we suggested some operators for them. Appendix B also listed the exceptional conditions which these operators may raise.

In the present appendix, we would like to further discuss an interrupt scheme for CLEOPATRA/360. The System/360 architecture was briefly discussed in section 9.1. It should be clear that appendix B listed many more distinguishable conditions than the hardware will actually differentiate. In this appendix we will therefore make the further distinction between hardware interrupts and software conditions. (Hardware interrupts should not be confused with CLEOPATRA interrupt configurations.) We will suggest calling conventions for both types of conditions.

F.1 Hardware interrupts.

As we described in section 9.1, there are six different exceptional conditions reported as interrupts by the hardware: EXTERNAL (including the interrupt caused by the timer cell), INPUT_OUTPUT, MACHINE_CHECK, PROGRAM_CHECK, SUPERVISOR_CALL, and IPL (Initial Program Loading).
The first three of these can be disabled, i.e., by setting proper bits in the Program Status Word (PSW) we can instruct the hardware to disregard or stack these conditions. If we code correctly, PROGRAM_CHECK cannot occur, and if we do not issue the SUPERVISOR CALL (SVC) instruction, the SUPERVISOR_CALL condition cannot be raised. An operating system cannot overrule the IPL condition, but a judicious operator will not raise it during normal operations. Further considerations in this section do not completely apply to the IPL condition. We will discuss this condition separately in section F.2.

Hardware conditions should not be resolved by the average user. We propose that only the privileged CLEOPATRA programmer should be permitted to issue ON statements connecting interrupt handlers to the six hardware conditions described here. We consider the nucleus of an operating system as consisting of (resident) handlers for the hardware interrupts; additionally we suggest that the privileged operations sketched in section 9 can be used only in the innermost (primitive) parts of an operating system.

Hardware conditions, with the exception of SUPERVISOR_CALL, cannot be signaled by the user. The SUPERVISOR_CALL condition and instruction will, of course, play a major role in the implementation of the SIGNAL statement, and they attain a special status. Certain additional interrupts which the operating system needs, such as the obtain/free mechanism suggested in section 4.2, or the HALT condition suggested in section 8.3, or a general ERROR condition (e.g., when execution control erroneously reaches an END phrase), would certainly be predefined SUPERVISOR_CALLs.

In order to keep the operating system nucleus design simple, we should not permit hardware interrupts to be raised recursively, quite contrary to software conditions.
Within the limits as implied by the first paragraph of this section, this is feasible: we simply need to define the new PSWs for the hardware conditions so that when they are introduced, all interrupts (as far as possible) become disabled. It is subsequently the user's responsibility to reenable the interrupts.

When a hardware condition is raised, the hardware will store an old PSW which indicates the state of the world prior to the interrupt, and which allows execution to resume as if the condition had not been raised. Additionally, the INPUT_OUTPUT interrupt is usually presented with a Channel Status Word (CSW).

Interrupt handlers can have parameters provided by the raising of the condition, in the more common case by the SIGNAL statement. We suggest that the hardware conditions be defined to supply the old PSW (and possibly the CSW for an INPUT_OUTPUT interrupt, and the scanout area for a MACHINE_CHECK) as parameters. Since we assume that all interrupts are disabled following the raising of a hardware condition, we can provide the old PSW BY ADDRESS, and thus give the interrupt handler a possibility to change the old PSW in its fixed location in low core prior to issuing the simple RESUME statement (which in a hardware condition interrupt handler should merely introduce this old PSW as a new PSW).

If the treatment of the interrupt is more involved, the interrupt handler should stack the old PSW, and then should issue a RESUME statement with a new PSW which enables all interrupts, and which uses NEXT_STATEMENT as instruction address (see section 9.2). If such an operation is very frequent, we might consider introducing a special ENABLE statement. Stacking the old PSW is also essential if the sole purpose of the hardware interrupt handler is to further report a condition to the user of the nucleus services.
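The two ways of leaving a hardware interrupt handler described above can be illustrated with a toy model. The following sketch is Python, not CLEOPATRA and not System/360 code. The names PSW, Nucleus, raise_hardware_condition, resume_simple, and resume_enabled are hypothetical, and the PSW is reduced, as a simplifying assumption, to an instruction address plus a single flag standing for all maskable interrupts.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PSW:
    address: int    # instruction address
    enabled: bool   # True if (maskable) interrupts are enabled


class Nucleus:
    """Toy model of old/new PSW handling for one hardware condition."""

    def __init__(self, handler_address):
        self.current = PSW(address=0, enabled=True)
        self.old_psw = None   # stands in for the fixed low-core location
        # The predefined new PSW enters the handler with interrupts disabled.
        self.new_psw = PSW(address=handler_address, enabled=False)
        self.stack = []       # stacked old PSWs for the more involved case

    def raise_hardware_condition(self):
        # The hardware stores the old PSW and introduces the new PSW.
        self.old_psw = self.current
        self.current = self.new_psw

    def resume_simple(self):
        # Simple RESUME: the (possibly modified) old PSW becomes current.
        self.current = self.old_psw

    def resume_enabled(self, next_statement):
        # Involved case: stack the old PSW, then continue inside the
        # handler with a new PSW that enables all interrupts.
        self.stack.append(self.old_psw)
        self.current = PSW(address=next_statement, enabled=True)
```

In this model a handler that receives the old PSW BY ADDRESS corresponds to code that may overwrite old_psw, for example with replace(nucleus.old_psw, address=44), before calling resume_simple.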
When a hardware interrupt handler merely reports a condition to the user in this way, it would access information concerning all the program trees, and it would from there determine the condition and the interrupt handler, if any. Next, the old PSW must be associated with that level of the program tree, and a new PSW must be introduced directing control to the user-supplied interrupt handler (i.e., the user's condition must be raised).

We implement multiprogramming and process management, including inter-process communication, by suitable SUPERVISOR_CALLs. The assumption that the initial part of a hardware interrupt handler cannot be interrupted permits us to implement Dijkstra's P- and V-primitives or equivalent as SUPERVISOR_CALL conditions. The dispatcher (part of the SUPERVISOR_CALL handler) accomplishes process routing by introducing new PSWs resuming execution within the suspended processes. Another part of the dispatcher is a handler for the EXTERNAL timer interrupt, implementing time-sharing among processes.

Let us summarize: we propose that the six hardware conditions are predefined, and that they all have principal arguments of type PSW. For all but the IPL condition, this PSW is the old PSW belonging to the condition, and it is received BY ADDRESS. Additionally, the MACHINE_CHECK condition receives the scanout area, probably as a BYTE array, depending on the host machine model. The INPUT_OUTPUT condition handler receives a CSW.

We also provide the hardware interrupt handlers with access to the contents of the general and floating-point registers at the time the interrupt was raised. The access is BY ADDRESS since the RESUME statement will reload the registers before resuming; nevertheless, a copy of the register contents must be made in core memory, since the interrupt handler itself will have to use these registers. This is the only instance where we should provide explicit access to the contents of the registers.

F.2 Initial Program Loading.
The Initial Program Loading (IPL) interrupt is caused by operator intervention. Prior to invoking the IPL interrupt handler as defined in CLEOPATRA code, a certain amount of input-output activity takes place. This activity is essentially under control of the operator. It is possible that the Channel Command Words (CCW) which control this initial activity must be written 'by hand', rather than in CLEOPATRA, but then it is not expected that very many distinct IPL loading operations be designed.

The IPL condition interrupt handler obtains control when the input-output activity terminates. Or rather, if the information supplied to this initial activity is correct, it should cause the IPL handler to start execution. The IPL handler is provided with the new PSW as it was recorded in a fixed location in low core prior to its introduction as a current PSW. It is thus possible for the IPL handler to detect what device it was being loaded from. For details, consult the relevant IBM System Reference Library publications.

The IPL handler can now be coded to perform 'normal' computations. It should normally establish handlers for the hardware conditions (using ON statements), and it would most likely then introduce a new PSW which puts the system into WAIT state. At this point then, most likely, the operator would be expected to cause an EXTERNAL or INPUT_OUTPUT interrupt, and the relevant handler would be invoked.

The state of the world as the CLEOPATRA compiler knows it prior to the introduction of any source code into its tables would reflect the IPL situation of the machine. I.e., an environment request to compile into IPL (section 3.4) would assume only that a structure block entry for the IPL interrupt exists, and that this interrupt configuration is about to be resolved. The corresponding assumed program is that the IPL condition is signaled. Details will be decided by an implementation.

F.3 Software conditions.
In addition to the eight conditions raised by the built-in operators as described in section B.1.1, we have mentioned the REFERENCE and the SUBSCRIPT_RANGE condition. All of these can be raised at execution time as a result of certain illegal user actions. We might have to add the HALT condition and the RETURN_FROM condition, both defined in section 8.3, and a general ERROR condition. For debugging, the ASSERT condition described in section 8.4, and PL/I-like CHECK and FINISH and PL/C-like FLOW conditions may be added to the list.

All of these conditions can be signaled by the user, in addition to being caused by an action involving illegal data. The user may also provide interrupts to handle all but the HALT condition, but the implementation may not permit some of these interrupts to RESUME.

All of these conditions, with the exception of HALT, and possibly FINISH and ERROR, may be raised recursively, and they will be resolved correctly by the RESUME and RETURN_FROM mechanisms. We accomplish this internally in a manner very similar to the hardware interrupt mechanism: the hardware condition interrupt handler in the nucleus will differentiate the user-defined conditions, and will cause the actual signaling, and it will associate a suitable PSW with the RESUME mechanism.

We still need to define the calling conventions for these conditions. Actually, we shall just provide some guidelines which we would expect an implementation to follow. Basically, we expect the implementation to give the interrupt a chance to correct the information that caused the condition to be raised. To this extent we define the conditions so that they include as parameters an area in which the result of the malfunctioning operation is expected. This parameter is provided BY ADDRESS, and is thus
modifiable by the interrupt. The ref_type of this parameter will usually further distinguish the conditions, and we therefore would expect the result area to be the principal argument of the built-in conditions. Further parameters could then detail the circumstances which caused the condition; it should not be too difficult to provide unique calling conventions for each condition.

Let us just consider the OVERFLOW condition. It can be raised for operations involving all the basic types with the exception of CHARACTER and POINTER. Unfortunately, the hardware does not provide quite enough information to sort out the cases. Unless disabled, an overflow will cause the PROGRAM_CHECK condition interrupt to be invoked. This interrupt handler has access to the old PSW and thus to the interruption code. The interruption code distinguishes three kinds of overflow: fixed point, decimal, and floating point. Inspection of the instruction, which can be located through the instruction address and length code in the old PSW, might permit us to further distinguish LONG_REAL and REAL overflow, DECIMAL'n overflow, etc. With the instruction we are also able to construct the result area and the operand areas.

However, the implementation may or may not distinguish BYTE overflow from INTEGER overflow, and from LONG_INTEGER overflow. While the latter can be differentiated by inspection of the instruction (in most cases, at least), the implementation will have to simulate BYTE arithmetic through INTEGER arithmetic, and it may not be possible or very efficient to signal separate conditions.

References.

This report was based on a careful study of the available literature on system implementation languages, general purpose programming language features, extensible languages, and operating system design needs. Some references to the literature or to specific programming languages have been made in the text. A more detailed discussion of the background material will be in the author's forthcoming Ph.D. thesis.
BIBLIOGRAPHIC DATA SHEET

Report No.: UIUCDCS-R-74-646
Title and Subtitle: CLEOPATRA, Comprehensive Language for Elegant Operating System and Translator Design
Report Date: May 1974
Author(s): Axel T. Schreiner
Performing Organization Name and Address: University of Illinois at Urbana-Champaign, Department of Computer Science, Urbana, Illinois 61801
Sponsoring Organization Name and Address: University of Illinois at Urbana-Champaign, Department of Computer Science, Urbana, Illinois 61801

Abstract: CLEOPATRA is a general-purpose and systems implementation language in the style of ALGOL designed for computers similar to the IBM System/360. Among its concepts are extensions to the ALGOL block structure, user-defined data types and data access mechanisms, and user-defined 'generic' operators. The language is goto-free, and has a generalized decision table as its main control structure. An interrupt mechanism is proposed. This report attempts to define an optimal language. It is intended as a user's manual for a prospective implementation.

Key Words and Document Analysis.
Computing Reviews Categories: 4.12, 4.21, 4.22, 4.34, 4.35
Descriptors: ALGOL; Compilation (computers); Compilers; Computer languages, programming, programs, software and systems programs; Executive routines; Machine coding; Machine-oriented languages; Monitor routines; Operating systems (computers); Problem-oriented languages; Procedure-oriented languages; Programming (computers), languages, manuals and techniques; Recursive routines; Symbolic programming
Identifiers/Open-Ended Terms: Systems implementation languages; Systems implementation; Structured programming

Availability Statement: Release unlimited
Security Class (This Report): UNCLASSIFIED
Security Class (This Page): UNCLASSIFIED
No. of Pages: 182