Report No. 306 

 COO-1469-0109 
 
 A GENERALIZED ASSEMBLER 
 
 by 
 
 Christos Georgiou 
 
 January 1969 
 
 
Report No. 306 
 
 A GENERALIZED ASSEMBLER* 
 
 by 
 Christos Georgiou 
 
 January 1969 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 61801 
 
 *Supported in part by contract U. S. AEC AT(11-1)1469 of the 
 Atomic Energy Commission. Submitted in partial fulfillment 
 of the requirements for the degree of Master of Science in Computer 
 Science, January 1969. 
 
 PREFACE 
 
 The ASSEMBLER described in the following pages assembles 
 programs written in assembly language for small one- or two-address 
 machines that provide indirect addressing but have neither index 
 registers nor base-displacement addressing, that have words of up to 
 48 bits (16 octal digits), and that permit double relocation. Macros 
 are not allowed. 
 
 It assembles according to the description of the particular 
 machine that it reads. In this sense it is a "general" assembler. 
 
 The author wishes to thank Professor C. W. Gear for his 
 valued guidance and support during the preparation of this thesis. 
 The technical assistance provided by Professor Gear is especially 
 appreciated. Thanks are also extended to Miss Barbara Hurdle for 
 typing the final manuscript. 
 
 TABLE OF CONTENTS 

 1. INTRODUCTION 
 2. ASSEMBLY 
 3. PASS I 
 4. PASS II 
 5. NAMES AND EXPRESSIONS 
 6. PSEUDO ORDERS 
 7. DESCRIPTION OF THE MACHINE 
 8. DIAGNOSTIC MESSAGES 
 BIBLIOGRAPHY 
 
1. INTRODUCTION 
 
 The following sections describe the ASSEMBLER, a collection 
 of programs written in assembler (F or G) and operating on the IBM 
 System 360/50-75. It assembles programs for the Digital Equipment 
 Corporation's PDP-7, PDP-8, and PDP-9 computers. 

 This collection of programs is general and flexible. It does 
 not depend upon a particular machine; instead it accepts a description 
 of the machine, according to which programs written in assembly language 
 for that machine are decoded and output is produced that can be loaded 
 into the machine. After a program has been assembled it may be punched 
 on cards or paper tape or may be saved in a file. 
 
2. ASSEMBLY 
 
 An assembler transforms the symbolic language source 
 programs into machine language. In the case of PDP-8, every machine 
 instruction occupies exactly one location in its memory. The assembly 
 language program is a sequence of input lines to the assembler which 
 specifies these instructions in a symbolic form. The assembler reads 
 these lines, decodes them, and constructs or assembles the corresponding 
 binary words for the specific computer. 
 
 Symbolic names for the memory locations are defined by their 
 appearance at the beginning, in most assemblers, of an input line. 
 Symbolic names for operation codes appear next, sometimes followed by 
 operands. Then comments may or may not follow. 
 
 The assembler lists a value corresponding to the value of 
 the operator, augmented by the value of the operand. Each such value 
 is associated with an address by means of the program counter (PC). The 
 PC contains a value which is incremented after each word is generated. 
 So normally assembled words are placed in serially ascending locations 
 in memory. Some input lines will not generate words but are instructions 
 to the assembler, for example the pseudo's END, ORG, DC, etc. The 
 symbolic information on each assembly language line is grouped into four 
 fields: the location name or label, the operation code or mnemonic, the 
 address field or operand, and comment fields. These fields are usually 
 delimited by blanks. The ASSEMBLER assumes blanks as delimiters. If 
 the user wishes to use a different symbol, he has to define it in the 
 tables that are used for the translation of the particular field. The 
 fields are described in the description of the particular machine. 
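
 As an illustration only, the division of a line into its four fields 
 can be sketched in Python (not the language of the ASSEMBLER itself). 
 The sketch assumes blanks as delimiters, a label that begins in 
 character 1, and the slash as the comment character. The function name 
 split_fields and the sample lines are hypothetical. 

```python
def split_fields(line, comment_char="/"):
    """Split one source line into (label, mnemonic, address, comment).

    Assumptions of this sketch: fields are delimited by blanks, a label
    (if any) starts in character 1, and a comment may start with
    comment_char.
    """
    text = line.rstrip("\n")
    if text.startswith(comment_char):
        # The whole line is a comment.
        return ("", "", "", text)
    label = ""
    if text and text[0] != " ":
        # Non-blank in character 1: the label runs to the first blank.
        label, _, text = text.partition(" ")
    parts = text.split(None, 2)
    mnemonic = parts[0] if parts else ""
    address = parts[1] if len(parts) > 1 else ""
    comment = parts[2] if len(parts) > 2 else ""
    if address.startswith(comment_char):
        # No address field (e.g. a microinstruction); a comment follows.
        comment = (address + " " + comment).strip()
        address = ""
    return (label, mnemonic, address, comment)


print(split_fields("LOOP TAD COUNT / add the counter"))
print(split_fields("     CLA / clear"))
```

 A description-driven version would take the delimiter and the comment 
 character from the tables supplied by the user rather than from fixed 
 arguments. 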
 
 The location name usually starts at character 1 and is 
 terminated by the first blank. We say usually because even a machine 
 whose lines have the operation code first, the location next, and the 
 address last, or one that uses an asterisk (*) as a delimiter, can be 
 handled provided that the user describes it so to the ASSEMBLER. If 
 the location field is non-empty it may contain a name of up to eight 
 characters, beginning with a letter. Any variable used in the program 
 must be defined by its appearance in the location field. The variables 
 used with some pseudo's, e.g. EQU, must be predefined, that is, defined 
 at some point before the pseudo is processed. For instance 

 A EQU B+1 
 B EQU 10 

 is illegal, whereas 

 B EQU 10 
 A EQU B+1 

 is allowed. As another example 

 A EQU B+10 
 B EQU 2*A-30 

 does not define B as 10 and A as 20: the assembler does not solve 
 simultaneous equations, and the first line is rejected because B is 
 not yet defined. 

 The operation code field, or mnemonic field, is the expression 
 starting with the first non-blank character after the location field 
 and ending with the next blank. Any variable appearing in the operation 
 code field must be an operation code. In the particular case of the PDP 
 computers, if the operation code is a microinstruction then the address 
 field or operand field is empty. Otherwise it starts with the first 
 non-blank character after the mnemonic and ends with the next blank. 
 Any variable appearing in the address field must be a label. 
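
 The rule that names used with EQU must be predefined can be sketched 
 as follows. This is a minimal Python illustration, not the ASSEMBLER's 
 own code; the helper names evaluate and define_equ and the exception 
 UndefinedName are hypothetical. Expressions are evaluated strictly from 
 left to right, as described later for the address field. 

```python
import re

TERM = r"(?:[A-Z][A-Z0-9]*|[0-9]+)"
EXPR = re.compile(rf"{TERM}(?:[+\-*]{TERM})*")


class UndefinedName(Exception):
    """Raised when an expression uses a name that is not yet defined."""


def evaluate(expr, names):
    """Evaluate names and numbers joined by + - *, left to right."""
    if not EXPR.fullmatch(expr):
        raise ValueError("malformed expression: " + expr)
    tokens = re.findall(rf"{TERM}|[+\-*]", expr)

    def value(token):
        if token.isdigit():
            return int(token)
        if token not in names:
            raise UndefinedName(token)
        return names[token]

    result = value(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        v = value(operand)
        if op == "+":
            result += v
        elif op == "-":
            result -= v
        else:
            result *= v
    return result


def define_equ(lines):
    """Process (label, expression) EQU lines in the order given."""
    names = {}
    for label, expr in lines:
        names[label] = evaluate(expr, names)
    return names


print(define_equ([("B", "10"), ("A", "B+1")]))
```

 With the two lines in the opposite order, evaluate raises UndefinedName 
 for B, which corresponds to the illegal case above. 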
 
The above three fields may total up to 72 characters. The 
 comment field starts at the end of the address field and may extend 
 up to the 80th character. Comments after the address field may or 
 may not be preceded by the particular comment character, in the case 
 of PDP-8, the character slash (/), in others, asterisk (*). 
 
 The comment has no effect on the binary output of the 
 assembler; it is only copied onto the assembly listing. It reminds the 
 programmer what a particular instruction or group of instructions is 
 meant to do, and it helps other programmers who might want to use the 
 program. The end of 
 the program is sensed by the END pseudo. In the case of Load-and-Go 
 assembly, the last instruction executed will be a transfer to the 
 address evaluated from the END pseudo. When the output of the assembler 
 is input to the loader then this address is placed in a suitable 
 position on a transfer card image. 
 
 The assembler provides two kinds of output: the binary 
 object "deck", and the assembly listing. The former is a list of the 
 machine program in a form acceptable by the loader of the particular 
 computer and the latter helps the programmer to debug his program from 
 certain possible programming errors. 
 
 The ASSEMBLER is a typical two-pass one, where the first 
 pass is used to produce a table of all symbolic addresses used and 
 their address values, whereas the second is used to substitute these 
 values into the original symbolic form to get the binary form. 
 
3. PASS I 
 
 To construct the table each input line of code is read, one 
 at a time. If a name appears in the location field, then it is put 
 into the table. The assembler assumes that the first instruction to 
 be read is placed into location zero, the next into location one, and 
 so on. In order to know what address value to associate with a name in 
 the location field, the location counter keeps an account of the space 
 used and it is incremented after each instruction has been handled. 
 
 Each entry in the table consists of the name, the associated 
 address value, a pointer pointing to the left, one pointing to the 
 right, and information on relocation, whereby left and right we mean 
 alphabetically smaller and bigger entries respectively. Since a check 
 has to be done for double definition of names during Pass I, it is 
 necessary to determine if the name just read in the location field is 
 already present in the table. This involves some sort of table look-up 
 procedure and the binary tree method has been chosen for this purpose. 
 
 The search mechanism consists of comparing the desired name 
 with the entry in the fixed position reserved for the middle entry, as 
 in the binary search. If the desired name matches, there is nothing 
 further to do, otherwise the search continues to the entry pointed to 
 by the appropriate of the two pointers (chain addresses). The table 
 built is a tree with two branches at every node, and nodes are labeled 
 by a table entry. Storing the information this way for a binary search 
 has the advantage that it is no longer necessary that the entry at the 
 start, or root, of the table must be the middle entry. Whichever entry 
 
is used, it is only necessary that all other entries be to the left 
 or right of it, depending on whether they are smaller or larger than it. 
 It has the advantage, also, that names can be added to the table at 
 any time. If a new name is to be entered, a search process is followed 
 which will finally come to the end of a chain, indicated by a zero link 
 address. If the name is found in the search then it obviously should 
 not be re-entered. When the end of a chain is reached, the new entry 
 can be added at that point and the appropriate link established. This 
 method has the additional advantage that it is easy to print the table 
 in alphabetical order when the time comes. It has the disadvantage that 
 if the names entered are in order, then the tree is one-sided. It is 
 then as slow as a sequential search, but takes up more space because 
 of the pointers. 
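
 The table organization described above can be sketched in Python as 
 follows. This is an illustration under assumptions, not the ASSEMBLER's 
 own code: the class and method names are hypothetical, and Python 
 object references stand in for the link addresses (None plays the part 
 of the zero link). 

```python
class Entry:
    """One table entry: name, value, relocation flag, two links."""

    def __init__(self, name, value, relocatable=True):
        self.name = name
        self.value = value
        self.relocatable = relocatable
        self.left = None    # alphabetically smaller entries
        self.right = None   # alphabetically larger entries


class NameTable:
    def __init__(self):
        self.root = None

    def enter(self, name, value, relocatable=True):
        """Add a name; return False if it is already present."""
        new = Entry(name, value, relocatable)
        if self.root is None:
            self.root = new
            return True
        node = self.root
        while True:
            if name == node.name:
                return False            # double definition
            side = "left" if name < node.name else "right"
            child = getattr(node, side)
            if child is None:           # end of chain reached
                setattr(node, side, new)
                return True
            node = child

    def lookup(self, name):
        node = self.root
        while node is not None and node.name != name:
            node = node.left if name < node.name else node.right
        return node

    def in_order(self):
        """Yield the entries in alphabetical order."""
        stack, node = [], self.root
        while stack or node:
            while node:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node
            node = node.right


table = NameTable()
for i, name in enumerate(["LOOP", "COUNT", "START"]):
    table.enter(name, i)
print([e.name for e in table.in_order()])
```

 Entering names that are already in alphabetical order produces the 
 one-sided tree mentioned above, in which every search degenerates to a 
 sequential scan. 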
 
 If the input lines contained only instructions, then 
 the simple mechanism of incrementing the location counter by one for 
 each line would be sufficient. However, there has to be one pseudo 
 instruction in any code, the END pseudo, which tells the assembler that 
 the whole deck has been read so that the next pass can begin. Of course 
 there are many other pseudo's. So in order that Pass I recognize the 
 difference between instructions and pseudo orders, it must examine 
 the mnemonic coding. The mnemonic is a string of characters similar to 
 a symbolic name, so that similar techniques are used to handle mnemonics. 
 In this case, the table of mnemonics has been set up by the user; they 
 are the first cards of the description of the particular machine that 
 are read and stored using the same technique of the binary tree, as with 
 the names. Each entry consists of the mnemonic, the actual code, the 
 
left and right pointers, and a branch address which gives the address 
 of the code used to handle the mnemonic or pseudo order. 
 
 The typical flow of Pass I is the following. The input line 
 of code is read (if it is a comment no particular action is taken, so 
 we will not mention comments) and the mnemonic extracted. The mnemonic 
 code is looked-up in the mnemonic table. If it is not present and 
 macros are not allowed, there is an error; if it is present, a branch 
 is made to the address found in the table. Then the appropriate 
 section of code takes care of the rest of the line. When the input 
 line, which is usually a card image from magnetic tape or disk file, has 
 been read into the program area of memory and a copy has been produced 
 for later use in Pass II, the assembler must extract various fields 
 from it in order to form names, mnemonics, and addresses. The positions 
 and lengths of these fields are described to the ASSEMBLER by the user. 
 The various fields are then handled as follows. 
 
 Location field - This may be a fixed length field. Extract 
 characters one at a time. If the first non-blank character is other 
 than an alphabetic character, it is not normally an allowed name. Otherwise 
 subsequent characters must be alphabetic, numeric, or blank. A blank 
 indicates the end of the name in which case the field should contain no 
 more non-blank characters. In other words, blanks are not allowed in 
 the names. If the user wants a delimiter other than the blank, he has 
 to describe it in the table used for the translation and testing of the 
 characters making the location field. 
 
 Mnemonic field - This may also be a fixed length field, except 
 that it is required that it start in the first column of the field. A 
 similar program to the one above is used except that the first character 
 must be a non-blank character. There is no logical reason to restrict 
 mnemonics to start with an alphabetic character and contain only 
 alphanumeric characters, but they frequently are so restricted, which 
 makes it convenient to use the same reading procedures. 
 
 Address field - This field may differ from instruction to 
 instruction, in many cases containing a number of subfields separated 
 by commas. Within each subfield the address can be expressions 
 involving names and numbers and some of the arithmetic operators such 
 as plus (+), minus (-), and multiply (*). We will come back to names 
 and expressions later. The first step in the process of decomposing 
 such a subfield is to break it into separate elements such as names, 
 operators, numbers, etc. There are many ways to do this. It is, however, 
 faster to perform a lexicographic scan of the field first. By a 
 lexicographic scan we mean an analysis that only concerns itself with 
 each character one at a time, taking into account the immediate neighbor 
 of the character. 
 
 The recognition of the elements of the subfield is performed 
 very simply by scanning from left to right, and noting the following. 
 
 Names start with a letter and contain letters or digits. 
 
 Numbers start with a digit and contain only digits. Starting 
 from the left, the next character is examined. If it is a letter, then 
 a name is recognized. A subscanner for name recognition examines 
 consecutive characters until a non-alphanumeric character is read. This 
 signals the end of the name. Since names are restricted to a maximum 
 length, a check is made for excessive characters. After the string of 
 
characters representing the name has been scanned, control returns to 
 the basic recognizer. The next character is examined and a branch to 
 a basic recognizer for numbers, names, or operators is made. 
 
 The recognition process for names, numbers, etc., involves 
 more than just checking for the existence of the name, number, etc. 
 Something meaningful has to be done with the address. Although the 
 address fields of instructions need not be translated until Pass II, 
 the address fields of some pseudo orders affecting the location counter 
 will have to be converted to numbers in Pass I. This means that the 
 characters in a name will have to be packed together in a form suitable 
 for the table look-up process, that decimal digits will have to be 
 converted into binary integers, and that the calculation indicated will 
 have to be performed between the operands. 
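
 The left-to-right scan of an address subfield can be sketched as 
 follows. This is a minimal Python illustration under assumptions: the 
 function name scan and the token labels are hypothetical, names are 
 limited to eight characters, and the error texts are borrowed from the 
 diagnostic messages listed later in this report. 

```python
import string

LETTERS = string.ascii_uppercase
DIGITS = string.digits
OPERATORS = "+-*"


def scan(subfield, max_name=8):
    """Break an address subfield into (kind, value) elements."""
    tokens = []
    i = 0
    while i < len(subfield):
        ch = subfield[i]
        if ch in LETTERS:
            # Name: a letter followed by letters or digits.
            j = i
            while j < len(subfield) and subfield[j] in LETTERS + DIGITS:
                j += 1
            if j - i > max_name:
                raise ValueError("FIELD EXCEEDS LENGTH")
            tokens.append(("name", subfield[i:j]))
            i = j
        elif ch in DIGITS:
            # Number: digits only, converted to a binary integer.
            j = i
            while j < len(subfield) and subfield[j] in DIGITS:
                j += 1
            if j < len(subfield) and subfield[j] in LETTERS:
                raise ValueError("INVALID CHARACTER IN FIELD")
            tokens.append(("number", int(subfield[i:j])))
            i = j
        elif ch in OPERATORS:
            tokens.append(("operator", ch))
            i += 1
        else:
            raise ValueError("INVALID CHARACTER IN FIELD")
    return tokens


print(scan("2*A-30"))
```

 Each branch looks only at the current character and its immediate 
 neighbor, which is what makes the scan a single pass over the field. 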
 
 This particular kind of action taken for certain pseudo orders 
 is realized by branching to the appropriate address going with every 
 mnemonic or pseudo. In the case that the expression will have to be 
 evaluated in Pass I, normally it has to be well defined, that is, all 
 of the names appearing in this expression must be previously defined 
 in the name table. 
 
4. PASS II 
 
 The purpose of Pass II of the assembler is to convert the 
 source language into binary, using the name table constructed in 
 Pass I to convert the addresses and the mnemonic table to convert the 
 instructions. To do this, a copy of the source program is read, a line 
 at a time, and many of the steps of Pass I repeated. 
 
 The location field is ignored because it was completely 
 handled in Pass I. 
 
 The mnemonic field is examined and a table look-up performed. 
 In this pass, the mnemonic table provides both a branch address for 
 instructions or pseudo orders and in the case of instructions, the 
 binary code. 
 
 The address subfields are converted into binary numbers for 
 packing into the instructions or use in pseudo orders. The code in 
 the Pass I handling of pseudo order addresses is re-used for this 
 process. A location counter is maintained in an identical manner to 
 Pass I. Pass II is also terminated when the END pseudo order is read. 
 The END pseudo can involve an address field which is used to provide a 
 starting address at execution time. In the case of a Load-and-Go 
 assembler, the last instruction executed would be a transfer to the 
 address evaluated from the END pseudo. In our case the output of the 
 ASSEMBLER is input to a loader, this address is placed in a suitable 
 position on a binary card image. 
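
 The division of labor between the two passes can be sketched as 
 follows. This is a deliberately small Python illustration, not the 
 ASSEMBLER itself. It assumes that lines are already split into (label, 
 mnemonic, address) triples, that DC produces exactly one word, that 
 addresses are single names or decimal numbers, and that every address 
 lies in page zero so that it can simply be combined with the operation 
 code. The operation codes are the PDP-8 memory reference codes; the 
 sample program is hypothetical. 

```python
OPCODES = {"AND": 0o0000, "TAD": 0o1000, "ISZ": 0o2000,
           "DCA": 0o3000, "JMS": 0o4000, "JMP": 0o5000}


def value_of(address, names):
    """A decimal number or a name looked up in the name table."""
    return int(address) if address.isdigit() else names[address]


def pass_one(lines):
    """Build the name table by counting locations."""
    names, lc = {}, 0
    for label, mnemonic, _ in lines:
        if label:
            if label in names:
                raise ValueError("NAME DOUBLE-DEFINED")
            names[label] = lc
        if mnemonic == "END":
            break
        lc += 1
    return names


def pass_two(lines, names):
    """Produce the binary words and the starting address."""
    words, start = [], None
    for _, mnemonic, address in lines:
        if mnemonic == "END":
            start = value_of(address, names) if address else None
            break
        if mnemonic == "DC":
            words.append(value_of(address, names))
        else:
            words.append(OPCODES[mnemonic] | value_of(address, names))
    return words, start


program = [
    ("START", "TAD", "A"),
    ("", "DCA", "B"),
    ("", "JMP", "START"),
    ("A", "DC", "5"),
    ("B", "DC", "0"),
    ("", "END", "START"),
]
words, start = pass_two(program, pass_one(program))
print([oct(w) for w in words], start)
```

 The names A and B are used before they are defined, which is why the 
 name table must be complete before any address is converted. 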
 
5. NAMES AND EXPRESSIONS 
 
 We have mentioned names and expressions in the process of 
 decomposing the address field. A name is a symbol which stands for a 
 numeric value. It may stand for a self-defining value, called a 
 constant; or it may stand for a value which is defined elsewhere, a 
 variable. A variable may be an operation code or a pseudo order, in 
 which case it is defined from the mnemonic table read and built in 
 the description of the particular machine or it may be a label, in 
 which case it is defined by its appearance in the location field of 
 some input line. If this line corresponds to a memory location, then 
 the defined value of the label is the address of this location. If 
 the operation field of the line is the pseudo EQU or DC, the defined 
 value of the label is the value of the expression in the operand 
 (address) field. There is a special name which is self-defining. Its 
 value is the current contents of the location counter. This special 
 name is given in the description. It may be the dot (.) as in the 
 case of the PDP series or the asterisk (*) as in the case of the 
 IBM 360, for example. 
 
 The following EBCDIC characters may be used in the formation 
 of names and expressions. 
 
 Alphabetic: Upper case letters A-Z 
 
 Numeric: Digits 0-9 
 
 Operators: + - * (plus, minus, multiply) 
 
 Delimiters: Blanks assumed unless otherwise specified. 
 
 Special character for comment field as specified 
 in the description. 
 
 Names may be up to eight characters long. Variables may contain 
 alphabetic and numeric characters, but they must start with an 
 alphabetic character. Constants contain only digits. An expression 
 is a sequence of names separated by the operators +, -, and *, and 
 delimited by blanks. In the mnemonic field, all variables must be 
 operation codes or pseudo orders. In the address field (operand 
 field) all variables must be labels. The assembler evaluates the 
 expression from left to right by combining the values of the names 
 according to the operators. The most general form of an expression 
 in the address field is 
 
 N*A±B 
 where N is an integer and A, B are names, absolute or relocatable. 
 The assembler produces relocation bits with each address, which tell 
 the loader whether or not relocation is to be applied to that particular 
 address. In addition to the value of the name, the name table contains 
 an entry which provides information indicating whether or not a name 
 is relocatable. Any name appearing in the location field of an 
 instruction is relocatable, as are names in the location fields of 
 certain pseudo orders. The pseudo order which can define a non-relocatable 
 name is the EQU pseudo. 
 
 An absolute (non-relocatable) address can be constructed from 
 any allowable expression involving numbers and absolute valued symbolic 
 addresses. For example, 20*3-4*A+5 is a valid absolute address if A 
 is an absolute name. A relocatable address can be constructed by adding 
 or subtracting any absolute amount to a relocatable name. For example, 
 if B is a relocatable name then B-4 is also relocatable, as is B+20. 
 There are cases where an expression like B-A+1 is needed and we would 
 like to arrange that the difference of two relocatable address expressions 
 is an absolute expression. In this way, the expression A-B+C would be 
 legal unless A and C were relocatable and B absolute. Although the 
 address A+C may be needed by the programmer in some cases where both 
 A and C are relocatable, it is not possible for the loader to handle 
 it with only one relocation bit. With only one bit, the loader can 
 only apply either single relocation or none. Even with single 
 relocation it is possible to allow expressions such as 2*A-B where 
 both A and B are relocatable, since the total relocation is still 
 single. The ASSEMBLER assumes two relocation bits and handles double, 
 single, or no relocation. It restricts the general expression N*A±B to 
 the values of N shown in the following table, which also gives the 
 resulting relocation. 

 Table 1. 
 RELOCATION RESULTING FROM THE EXPRESSION N*A±B 

 A     B     OP    N         Relocation 

 Rel   Rel   +     ≤ 1       Double 
 Rel   Abs   +     1, 2      Single or Double 
 Abs   Rel   +     "any"     Single 
 Abs   Abs   +     "any"     No Relocation 
 Rel   Rel   -     1 to 3    No Relocation, Single or Double 
 Rel   Abs   -     1, 2      Single or Double 
 Abs   Rel   -     "any"     Single 
 Abs   Abs   -     "any"     No Relocation 
 
 The restrictions on the integer N are imposed by the requirement of 
 at most double relocation. The value of N "any" is such that the 
 limitations of the particular computer are not exceeded. 

 When an expression is evaluated, a check is made to find whether or 
 not the value of the expression is within the current block of core 
 memory, referred to as a "page." If it is, then the same-page bit of the 
 assembled instruction is set to one. If this bit is zero, the instruction 
 addresses "page" zero, any location of which can be addressed directly 
 from any page of core memory. All other core memory locations can be 
 addressed indirectly by setting the indirect-bit; the rest of the bits 
 then specify the location in the current "page" or "page" zero which 
 contains the full absolute address of the operand. Indirect addressing 
 is sensed by the presence of a special character, "I" in the case of the 
 PDP-8, preceded and followed by a blank. This special character is given 
 in the description of the particular computer. 
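
 The relocation bookkeeping for the general expression N*A±B can be 
 sketched as follows. The sketch rests on an assumption: that the 
 relocation of the expression is counted arithmetically, N times the 
 relocation of A plus or minus the relocation of B, and must come out 
 as 0, 1, or 2 to fit in two relocation bits. This is one reading of 
 the rule and is not claimed to reproduce every row of Table 1 as 
 printed. The function name relocation is hypothetical. 

```python
def relocation(n, a_relocatable, op, b_relocatable):
    """Return 0, 1 or 2: the relocation count of N*A (+ or -) B.

    Assumption of this sketch: a relocatable name counts 1, an
    absolute name counts 0, and the counts combine like the values.
    """
    if op not in ("+", "-"):
        raise ValueError("operator must be + or -")
    a = 1 if a_relocatable else 0
    b = 1 if b_relocatable else 0
    count = n * a + b if op == "+" else n * a - b
    if not 0 <= count <= 2:
        raise ValueError("relocation %d does not fit in two bits" % count)
    return count


print(relocation(1, True, "+", True))   # A+B, both relocatable
print(relocation(2, True, "-", True))   # 2*A-B, both relocatable
print(relocation(1, True, "-", True))   # A-B, both relocatable
```

 Under this reading A+B is doubly relocatable, 2*A-B is singly 
 relocatable, and A-B is absolute, in agreement with the discussion 
 preceding Table 1. 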
 
6. PSEUDO ORDERS 
 
 Pseudo orders are operation codes which do not represent 
 actual machine instructions, but are simply signals to the assembler 
 to take certain action. The pseudo's provided by the ASSEMBLER, 
 together with their effect, are given below. 
 Data Loading: 
 DC - Define Constant 
 
 Define the optional symbol in the location field to have a value 
 equal to the current contents of the location counter. Then 
 substitute the value of the expression in the address field for 
 the memory location signified by the current contents of the 
 location counter. It is necessary to determine how many words 
 of storage will be occupied by the data given in the pseudo, so 
 that the location counter can be incremented accordingly during 
 Pass I. In scanning the field, the first character determines 
 the type of field following. It may be preceded by a repetition 
 factor. For some characters, an L may follow with a length 
 specification. Finally the data appears inside quotation marks 
 ('). During Pass I the program determines the boundary alignment 
 of each field in the DC in order to calculate the location counter 
 change. During Pass II the location field is ignored, but the 
 address field is converted into binary. At the same time the 
 location counter is increased once for each word produced. 
 
 Location Counter Control: 
 
 ORG - Set the location counter to a specific quantity. 
 
 Sets the location counter to the value specified in the address 
 field, in Pass I and Pass II, so that the next instruction read 
 will be loaded at this location. If a name appears in the location 
 field, then it is put into the name table after the location 
 counter has been changed and given a value equal to the new 
 contents of the location counter. 
 
 BSS - Block Started by Symbol 
 
 Any name in the location field must be entered in the name table 
 before the location counter is incremented. 
 
 BTS - Block Terminated by Symbol 
 
 Any name in the location field must be equated to the location 
 of the last word of the block, that is to one less than the 
 contents of the location counter after it has been incremented. 
 As long as the addresses in the address field are purely numeric, 
 there are no problems. However, if a symbolic address is 
 involved, then a value has to be assigned to it, namely it has 
 to be predefined in order that the numeric value of the address 
 can be calculated. In Pass I, only those names that appeared 
 before the line being currently examined are in the name table 
 with numeric values. Therefore, names must be defined before 
 they are used in any pseudo orders that affect the location 
 counter in a manner dependent on their address field. 
 
 DS - Define Storage 
 
 Define the optional symbol in the location field to have a 
 value equal to the current contents of the location counter. 
 Then add the value of the expression (predefined) in the 
 address field to the contents of the location counter. DS is 
 similar to DC and the same piece of code is used in both passes; 
 the only difference is that DS does not actually produce object code, 
 whereas DC must have the data specified in the address field. 
 Name Table Entry: 
 EQU - Symbolic Equivalence 
 
 Define the name in the location field to have a value equal to 
 that of the expression (predefined) in the address field. 
 Others: 
 END - End Assembly 
 
 Define the optional symbol in the location field to have a 
 value equal to the current contents of the location counter. 
 If the address field is non-empty, then its value will be 
 punched on a binary transfer card as the starting address 
 of the program. 
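
 The Pass I effect of the pseudo orders that control the location 
 counter or the name table can be sketched as follows. This is a 
 Python illustration under assumptions: the class name PassOne and its 
 methods are hypothetical, the address field is taken as an already 
 evaluated integer, and a dictionary of handlers stands in for the 
 branch addresses kept in the mnemonic table. 

```python
class PassOne:
    """Location counter and name table, driven by pseudo orders."""

    def __init__(self):
        self.lc = 0
        self.names = {}
        # Mnemonic -> handler, in the role of the branch address.
        self.handlers = {"ORG": self.org, "BSS": self.bss,
                         "BTS": self.bts, "DS": self.bss,
                         "EQU": self.equ}

    def define(self, label, value):
        if not label:
            return
        if label in self.names:
            raise ValueError("NAME DOUBLE-DEFINED")
        self.names[label] = value

    def org(self, label, value):
        self.lc = value                   # counter changes first
        self.define(label, self.lc)

    def bss(self, label, value):
        self.define(label, self.lc)       # name gets the old counter
        self.lc += value

    def bts(self, label, value):
        self.lc += value
        self.define(label, self.lc - 1)   # last word of the block

    def equ(self, label, value):
        self.define(label, value)         # counter is unchanged

    def handle(self, label, mnemonic, value):
        self.handlers[mnemonic](label, value)


p = PassOne()
p.handle("", "ORG", 0o200)
p.handle("BUF", "BSS", 10)
p.handle("LAST", "BTS", 10)
p.handle("SIZE", "EQU", 10)
print(p.names, p.lc)
```

 Because each handler needs a numeric value at once, any name in the 
 address field of these pseudo orders has to be predefined. 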
 
7. DESCRIPTION OF THE MACHINE 
 
 This is the part of the program which makes the ASSEMBLER 
 work for many types of machines. At the end of the ASSEMBLER and 
 before the END pseudo which signals the end of the process, the user 
 has to include a small deck of cards which define to the program his 
 particular machine. On each card there are three fields punched 
 starting at columns 1, 10, and 16, respectively. The first field is 
 the name used for this particular piece of information through the 
 program. In this case, in order to define another machine, only 
 this small deck of the description will have to be changed, with the 
 names in the program and the description remaining the same. The 
 second field defines the first field as a constant or equivalent to 
 the third field. The second field, also, will remain unchanged for 
 defining another machine. The third field is the actual description 
 and it is the one which changes when the machine changes. For example 
 if the character slash (/) is used to specify "comment" for one user 
 and the character semicolon (;) for a second, then the third field will 
 be C'/' for the first user and C';' for the second, the C standing for 
 character and the actual character following, enclosed in quote marks. 
 Also if the mnemonic starts at column 6 in the program of one user and 
 at column 10 in the program of a second, the third field will be 6 for 
 the first and 10 for the second. Similarly if the character I signifies 
 indirect addressing for the one and the asterisk (*) for the other, the 
 third field will be CL2'I' for the first and CL2'*' for the second, 
 with CL2 signifying that there will be two characters in the quote marks, 
 the first specifying the indirect addressing and the second the 
 character blank. 
 
 The following list contains the use of the names in the 
 description and the form in which the definition is given. 
 
 Alpha: Character specifying comment, given in the form 

 ALPHA DC C'/' 

 Beta: Starting column for mnemonic, given in the form 

 BETA EQU 6 

 Gamma: Length of mnemonic, given in the form 

 GAMMA EQU 8 

 Delta: Starting column for location name, given in the form 

 DELTA EQU 1 

 Epsilon: Length of location name, given in the form 

 EPSILON EQU 9 

 Eta: Starting column for address, given in the form 

 ETA EQU 10 

 Theta: Length of address, given in the form 

 THETA EQU 73-ETA 

 Iota: Character specifying indirect addressing, given in the form 

 IOTA DC CL2'I' 

 Kappa: Character specifying current address, given in the form 

 KAPPA DC C'.' 

 Lamda: Length of operation code in number of bits, given in the form 

 LAMDA EQU 3 

 Mi: Length of operation code in number of bits, given in the form 

 MI EQU 12 

 Pi: Length of operation code in case of microprogramming 
 in number of bits, given in the form 

 PI EQU 12 
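
 How such a description can drive the extraction of fields is sketched 
 below in Python. This is an illustration only: the class 
 MachineDescription, the helper functions, and the particular column 
 values are hypothetical and were chosen so that the three fields do 
 not overlap. Only the names ALPHA through KAPPA are modeled; the 
 bit-length entries are omitted. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineDescription:
    alpha: str = "/"    # comment character
    beta: int = 10      # starting column of the mnemonic
    gamma: int = 6      # length of the mnemonic
    delta: int = 1      # starting column of the location name
    epsilon: int = 9    # length of the location name
    eta: int = 16       # starting column of the address
    iota: str = "I "    # indirect character followed by a blank
    kappa: str = "."    # current-address character

    @property
    def theta(self):
        # Length of the address field, as in THETA EQU 73-ETA.
        return 73 - self.eta


def column_field(card, start, length):
    """Return the field at 1-based column start, blanks stripped."""
    return card[start - 1:start - 1 + length].strip()


def extract(card, m):
    """Return (location, mnemonic, address) for one card image."""
    return (column_field(card, m.delta, m.epsilon),
            column_field(card, m.beta, m.gamma),
            column_field(card, m.eta, m.theta))


def is_comment(card, m):
    return card.lstrip().startswith(m.alpha)


pdp8 = MachineDescription()
card = "LOOP".ljust(9) + "TAD".ljust(6) + "COUNT+1"
print(extract(card, pdp8))
```

 Describing a second machine then amounts to constructing another 
 MachineDescription, for example with alpha=";" for a user who marks 
 comments with a semicolon, while the extraction code stays the same. 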
 
8. DIAGNOSTIC MESSAGES 
 
 When the ASSEMBLER detects an error, or when the user 
 should be notified by means of a warning, it prints diagnostic 
 messages to help the programmer correct the cause of error. If an 
 error is detected in Pass I a flag is set so that the assembly will 
 not continue to Pass II. The diagnostic messages and their meanings 
 are listed below. 
 
 'MNEMONIC DOUBLE-DEFINED'. This message appears, after the 
 mnemonic is printed, when a variable is given more than once as 
 input for building the mnemonic table. It is not a critical error 
 unless the user defines two different operation codes with the same 
 name, so the flag is not set and assembly will continue. It is 
 simply given as a warning. 

 'INVALID MNEMONIC'. The mnemonic contains an invalid 
 character. The flag is set. 
 
 'NAME UNDEFINED'. During the evaluation of an expression 
 in the address field, a name was encountered which was not defined 
 in the program. Note that names in some pseudo orders must be predefined. 
 
 'SHOULD BE MORE ENTRIES IN THE TABLE'. This message is 
 received when the mnemonic look-up procedure takes place and tracing 
 the pointers from the root of the tree down to the branches the smallest 
 entry is found but the search argument is smaller or the largest is found 
 and the search argument is larger; in other words the mnemonic is not 
 found in the mnemonic table built by the programmer. The flag is set 
 in this case. 
 
 'FIELD EXCEEDS LENGTH'. This message is printed in the 
 case that a name exceeds the limits of the location field (more than 
 eight characters), or a number in the address field exceeds the 
 machine limitations; also, if a name in the address field is too 
 long or too large a number is added to the current contents of the 
 location counter. In other words this same message is printed in 
 all cases of violation of length. The flag is set in all cases 
 except when the scanning of the address field takes place 
 only during Pass II. Then the assembly will probably continue only 
 for a while, because the overlong names will not be found in the table 
 and the overflow of the location counter will be caught at a later 
 point in the program. 
 
 'INVALID CHARACTER IN FIELD'. This message is received 
 when an invalid character is found in a field, for instance in the 
 location name, as the first character of a name, or in a number or 
 a name in the address field. The flag is set if the error is 
 discovered during Pass I; otherwise something similar to the case 
 producing the previous message will happen. 

 'NAME DOUBLE-DEFINED'. A name in the location field is used 
 more than once. In the case of an entry stored twice in the mnemonic 
 table this error was not critical, but in this case the flag is set 
 and assembly will not continue to Pass II. 
 
 'OFF-PAGE REFERENCE'. The value of the address field of a 
 memory referencing instruction is neither an address in "page" zero nor 
 an address in the current "page". 
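
 The check behind this message can be sketched for the PDP-8 as 
 follows. The sketch is a Python illustration, not the ASSEMBLER's own 
 code. It uses the PDP-8 memory reference format (three-bit operation 
 code, indirect bit, page bit, and seven-bit address within a page of 
 128 words); the function name encode is hypothetical. 

```python
PAGE_SIZE = 0o200      # 128 words per page
INDIRECT_BIT = 0o400
PAGE_BIT = 0o200


def encode(opcode, target, location, indirect=False):
    """Assemble one memory reference instruction.

    opcode   - operation code already in position, e.g. 0o1000 for TAD
    target   - value of the address field
    location - location counter of the instruction being assembled
    """
    word = opcode
    if indirect:
        word |= INDIRECT_BIT
    if target // PAGE_SIZE == location // PAGE_SIZE:
        word |= PAGE_BIT               # operand is in the current page
    elif target // PAGE_SIZE != 0:
        raise ValueError("OFF-PAGE REFERENCE")
    return word | (target % PAGE_SIZE)


print(oct(encode(0o1000, 0o203, 0o200)))                 # current page
print(oct(encode(0o1000, 0o010, 0o200)))                 # page zero
print(oct(encode(0o5000, 0o010, 0o200, indirect=True)))  # indirect
```

 An operand on any other page has to be reached indirectly, through a 
 word in the current page or in page zero that holds its full address. 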
 
 At the end of the assembly the name table is printed in 
 alphabetical order together with information for cross-reference, 
 namely length, value, and definition references. 
 
BIBLIOGRAPHY 
 
 Gear, C. W. "Machine Language and Systems Programming," CS 201 
 Class Notes, Department of Computer Science, University 
 of Illinois, Urbana, Illinois, September 1967. 

 Powers, M. "PDP-8 Assembler," Memorandum 12, University of Michigan, 
 Ann Arbor, Michigan, November 1967. 

 Small Computer Handbook, Digital Equipment Corporation, Maynard, 
 Massachusetts, 1968.