 
 
Report No. 299

 STRING PROCESSING
 ON A PARALLEL COMPUTER*

 by

 Walter L. Heimerdinger

 January 13, 1969

 ILLIAC IV Document No. 162
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 61801 
 
 *This work was supported in part by the Advanced Research Projects 
 Agency as administered by the Rome Air Development Center under 
 Contract No. US AF 30(602)4144 and submitted in partial fulfillment 
 of the requirements for the degree of Master of Science in Electrical 
 Engineering, February, 1969. 
 
ACKNOWLEDGMENT 
 
 The author wishes to express his gratitude to Professor 
 S. R. Ray for his guidance, encouragement, and abundant patience.
 
 
TABLE OF CONTENTS

 1. INTRODUCTION

 2. THE ILLIAC IV MACHINE
    2.1 The Quadrant
    2.2 The Processing Element
    2.3 The Control Unit
    2.4 Memory Addressing

 3. RECOGNIZER SPECIFICATIONS
    3.1 The Source Language
    3.2 Internal Identifiers
    3.3 Tokens
    3.4 Sections

 4. THE CHARACTER CLASSIFIER
    4.1 The Classification Process
    4.2 The Control String

 5. THE SECTION BUILDER
    5.1 The Section Builder Process

 6. THE SECTION JOINER
    6.1 The Section Joiner Process
    6.2 Character Count Determination
    6.3 Symbol Tail Concatenation

 7. TABLE MAINTENANCE
    7.1 Table Use
    7.2 The Reserved Word Table
    7.3 The Main Token Table

 8. RESULTS
    8.1 Execution Time Data

 REFERENCES

 APPENDIX
 
1. INTRODUCTION 
 
 The name "computer" immediately associates the digital computer 
 with numerical calculations. However, a significant portion of digital 
 computer time is used for non-numerical tasks such as string processing. 
 Probably the most common programs that use string processing are pro- 
 gramming language translators or compilers. Since the string processing 
 operations used in these applications are well known, they are useful 
 for comparing alternative string processing implementations. 
 
 A compiler may conveniently be divided into two major portions, 
 the recognizer and the analyzer. The recognizer scans the input string 
 of source language characters and recognizes these as a series of 
 tokens (identifiers, numbers, etc.). The analyzer then uses informa- 
 tion generated by the recognizer to determine the structure of the 
 source program and to generate the required object code. Since the 
 structure of the program to be translated depends only upon the 
 syntactic units used and their order of appearance, the syntax analyzer 
 does not need the actual name or value of each token. In fact, many 
 translators use the recognizer to rename tokens with internal identi- 
 fiers or pointers before passing them to the analyzer. If the 
 internal identifiers or pointers are coded to indicate the classifi- 
 cation of the token, then the analyzer has all of the information it 
 needs in a compact, well-defined form. In such a system, the 
 recognizer converts the string of sparsely and irregularly distributed 
 source language characters into a new string of compact, standardized 
 token identifiers or pointers. All translators must maintain a 
 table of identifier tokens encountered. Since, in many cases, the
 internal identifiers are merely pointers to the table location occupied
 by the original token, the maintenance of token tables is a natural 
 task for a recognizer. 
 
 A conventional recognizer usually works with only one symbol 
 at a time. Since the required operations are often simple and 
 repetitive, we might ask if a parallel machine capable of operating 
 on several characters simultaneously could be efficiently used to speed 
 up the process. The first large-scale implementation of a parallel
 computer is the Illiac IV computer, which is currently under construction.
 In this machine, the single accumulator of the von Neumann
 computer is replaced by multiple accumulators, each having separate 
 subsidiary registers, arithmetic hardware, and memory modules. The 
 machine was designed for arithmetic problems that use vectors, 
 matrices, or meshes, but it has some features that are useful for 
 string processing. 
 
 The operations required for the translation of a modern
 field-independent programming language such as ALGOL or PL/I are a
 challenge to implement on a parallel machine because the basic source 
 program elements (tokens) such as numbers or identifiers vary in 
 length and separation. Furthermore, the treatment required for a 
 given token depends not only upon the identity of the token, but also 
 upon the context in which it appears. Ordinarily, the string of 
 input symbols is scanned from left to right one character at a time 
 to insure that all of the information needed is available at each 
 step. Thus the interpretation of a group of symbols such as A + B 
 is changed by the insertion of the string delimiter, ", in the input
 string ahead of the group. Despite such obstacles, it will be shown
 that a recognizer can be turned sufficiently "inside out" to obtain 
 reasonable efficiency with a parallel computer such as Illiac IV. 
 
2. THE ILLIAC IV MACHINE 
 
 2.1 The Quadrant 
 
 The Illiac IV system is built around four identical arrays 
 known as quadrants. Each quadrant contains 64 processing units
 (P.U.s) operating under the control of a single control unit (C.U.)
 in the arrangement shown in figure 1.

 The construction of an Illiac IV quadrant can be visualized
 by considering a conventional computer arrangement consisting of an
 arithmetic/logical section for transforming or comparing operands,
 a control section with the instruction decoding and sequencing
 hardware, and a memory section. A quadrant essentially replicates the
 arithmetic/logical section and the memory section of the conventional
 computer 64 times, but retains a single control section. In the
 single quadrant array mode of operation, in which the quadrants act
 independently, the 64 arithmetic/logical units, named processing
 elements or P.E.s, are arranged in a string with numbers from 0 to
 63. Thus, each element has an adjacent higher neighbor and an
 adjacent lower neighbor, with P.E. 0 acting as the higher neighbor
 for P.E. 63. Single words may be transferred between neighboring
 P.E.s, a process referred to as "routing." Additional routing paths
 are provided between each P.E. and the P.E.s located eight positions 
 higher and eight positions lower in the string, so each P.E. can 
 communicate directly with four other P.E.s in the quadrant. 
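
 The routing connections described above can be sketched in a few lines
 of Python. This is only an illustration, not Illiac IV code: the name
 `routing_neighbors` is hypothetical, and the end-around wrap of the
 eight-position paths is an assumption based on the connections labeled
 in figure 1.

```python
NUM_PES = 64  # processing elements in one quadrant


def routing_neighbors(pe, num_pes=NUM_PES):
    """Return the four P.E.s that P.E. `pe` can route to directly.

    Assumption: both the +/-1 and the +/-8 paths wrap end-around.
    """
    return [
        (pe + 1) % num_pes,  # adjacent higher neighbor
        (pe - 1) % num_pes,  # adjacent lower neighbor
        (pe + 8) % num_pes,  # eight positions higher
        (pe - 8) % num_pes,  # eight positions lower
    ]


print(routing_neighbors(0))   # [1, 63, 8, 56]
print(routing_neighbors(63))  # [0, 62, 7, 55]
```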
 
 [Figure 1 - Illiac IV Quadrant. A single control unit (instruction
 decoding/sequencing and address indexing) with an 8-word C.U. buffer
 drives processing units #0 through #63 over a 64-bit common data bus
 plus about 260 control lines. Each processing unit pairs a processing
 element (arithmetic and logical operations) with its own P.E. memory.
 Routing connections join each P.E. to its neighbors (for example,
 P.E. #0 to P.E.s #63, #56, and #8; P.E. #63 to P.E.s #0 and #7). A
 1024-bit bus runs to and from the input/output switch, and the control
 unit has connections to the other control units and the input/output
 controller.]
 
2.2 The Processing Element 
 
 The processing element is a 64 bit unit with the arithmetic 
 equipment of a large-scale floating point computer. As shown in 
 figure 2, the P.E. contains an accumulator (register A), a register 
 to hold the second operand for binary operations or to act as an 
 extension to register A for double length operands (register B) , 
 a temporary storage register (register S), and a register provided 
 with the connections for routing data to or from one of four other 
 P.E.s (register R). In addition to the aforementioned 64 bit registers, 
 the P.E. has a 16-bit address register (register X) and a separate 
 adder mechanism for address arithmetic. Address arithmetic is
 performed modulo 2^16, but the memory accessing hardware uses only the 11
 least significant bits of the P.E. address field.
 
 The P.E.s can be set to treat the 64 bit register contents
 either as 64-bit operands or as pairs of "inner" and "outer" 32-bit
 operands. Thus a single instruction may cause 64 simultaneous 64-bit
 operations or 128 simultaneous 32-bit operations in a quadrant. All
 but a few of the P.E. instructions are applicable to both 64-bit 
 and 32-bit operands. The word size is changed by a control unit 
 instruction which sets all P.E.s of the quadrant to the selected word 
 size. 
 
 There is a small set of P.E. instructions that allow simul- 
 taneous addition or subtraction of the eight 8-bit fields or "bytes" 
 of the 64-bit word. These instructions are unaffected by the word
 size state. They are implemented by breaking the carry propagation 
 in the adder circuitry at eight bit intervals. The same equipment
 is used to test 8-bit fields for equality and inequality relations.
 These tests operate on the eight fields simultaneously and leave
 register A with ones in the least significant bits of byte fields
 meeting the test conditions. The remaining bits of register A are
 cleared. The character manipulation schemes will rely heavily on these
 8-bit operations.

 [Figure 2 - Processing Element (P.E.). Operand select gates feed the
 A register (accumulator), B register (extension register), R register
 (routing register), and S register (temporary storage register),
 together with the arithmetic unit, the logic unit (Boolean operations),
 the barrel switch (shift unit), the mode register, the address adder,
 and the X register (index register). Inputs arrive from neighboring
 P.E.s, from P.E. memory, and from the C.U. over the common data bus;
 outputs are data and addresses to P.E. memory.]
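
 The byte-wise operations described above can be modeled on ordinary
 Python integers. The sketch below is an illustration under stated
 assumptions, not the hardware algorithm: the names `byte_add`,
 `byte_equal`, and `broadcast_byte` are hypothetical, byte 0 is taken to
 be the least significant byte, and the sample text uses ASCII codes
 rather than the six-bit Burroughs codes.

```python
def byte_add(a, b):
    """Add the eight 8-bit fields of two 64-bit words with no carry between fields."""
    result = 0
    for k in range(8):
        shift = 8 * k
        field_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (field_sum & 0xFF) << shift  # carry out of the field is dropped
    return result


def byte_equal(a, b):
    """Leave a one in the least significant bit of every byte field where a == b."""
    result = 0
    for k in range(8):
        shift = 8 * k
        if (a >> shift) & 0xFF == (b >> shift) & 0xFF:
            result |= 1 << shift
    return result


def broadcast_byte(value):
    """Copy one 8-bit value into all eight byte fields of a word."""
    return (value & 0xFF) * 0x0101010101010101


# Find the blanks among eight characters held in one word.
word = int.from_bytes(b"AB  CD E", "big")
blanks = byte_equal(word, broadcast_byte(ord(" ")))
print(blanks.to_bytes(8, "big").hex())  # 0000010100000100
```

 A single test of this kind marks every matching character in a word at
 once, which is why the later routines lean on the 8-bit instructions.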
 
 Each P.E. also has an 8-bit register not commonly found in 
 conventional computers known as the mode register (register D). The 
 eight bits are designated E, E1, F, F1, I, G, J, and H respectively.
 The I and J bits record the results of various P.E. test operations.
 If the P.E. is in the 32-bit mode so that two operands are tested
 simultaneously, then the G and H bits hold the additional test results
 corresponding to the I and J bits, respectively. The F bit indicates
 arithmetic overflow and is supplemented by the F1 bit in the 32-bit
 mode. The E and E1 bits, or enable bits, perform a function unique
 to the array computer. Setting these two bits of a particular P.E.
 to one causes that P.E. to operate in the "enabled" state by fully
 executing any P.E. instructions encountered by its controlling C.U.
 However, one of these instructions may cause the E or E1 bit to be set
 to zero, either directly or by using the value of one of the other
 mode register bits. When this occurs, the P.E. enters a "disabled"
 condition in which registers A, S, and X cannot be changed and memory
 references are ignored, effectively stopping activity in the P.E.
 The E bit controls the action of register sections corresponding to
 the outer operand of the 32-bit word format, while the E1 bit controls
 the inner operand sections. Thus, when E and E1 differ it is
 possible to modify 32 bits of a register while leaving the remaining
 portion of the register unchanged.
 
 Some test instructions use the address adder and will cause 
 the B register of a disabled P.E. to be changed. Register R of a 
 disabled P.E. is also liable to change, since it must be available 
 for routing between other enabled P.E.s. Since P.E.s may be enabled
 according to predetermined patterns or by the results of test operations, 
 one can obtain the effect of program branching by executing several 
 sets of instructions with a different group of P.E.s enabled for each 
 set. This will be used in many places in the character processing 
 routines to follow. Obviously, such a technique must be used with 
 caution since disabled P.E.s represent unused processing power. 
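
 The branching effect obtained through the enable bits can be
 illustrated with a small model in which each list element stands for
 the accumulator of one P.E. The names `masked_apply` and
 `parallel_if_else` are hypothetical, and the model ignores the separate
 E and E1 halves; it is a sketch of the idea, not of the instruction
 set.

```python
def masked_apply(accumulators, enabled, operation):
    """Apply `operation` only in enabled P.E.s; disabled P.E.s keep their value."""
    return [operation(a) if e else a for a, e in zip(accumulators, enabled)]


def parallel_if_else(accumulators, test, then_op, else_op):
    """Emulate a branch: run both instruction sets, each with its own enable pattern."""
    pattern = [test(a) for a in accumulators]          # result of a P.E. test
    result = masked_apply(accumulators, pattern, then_op)
    inverse = [not p for p in pattern]                 # complementary enable pattern
    return masked_apply(result, inverse, else_op)


values = [1, 2, 3, 4]
print(parallel_if_else(values, lambda a: a % 2 == 0,
                       lambda a: a // 2, lambda a: 3 * a + 1))
# [4, 1, 10, 2]
```

 Both instruction sets are issued to the whole array, so the time spent
 is the sum of the two branches even though each P.E. does useful work
 in only one of them.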
 
 Processes such as normalization require that a single shift 
 command provide for differing shift counts in the various P.E.s. The 
 common technique of repeatedly shifting a short distance until 
 reaching the desired shift count was deemed unacceptably slow, so a 
 cascade of shift gates that can produce any shift or rotation from
 0 to 64 bits in two clock times is provided. This shifter is referred
 to as the "barrel switch" in most Illiac IV descriptive literature. 
 The shift count for the barrel switch can be obtained from a unit 
 that detects the position of the first one in the mantissa field of 
 the P.E. accumulator (the leading one detector) or it may be obtained 
 from a P.E. or C.U. register. The shift count may be indexed as if 
 it were a memory address. This feature is extremely useful for 
 masking and alignment operations. 
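
 As a rough illustration of a shifter built from a cascade of gates, the
 sketch below rotates a 64-bit word through six fixed stages of 1, 2, 4,
 8, 16, and 32 bits, selected by the bits of the shift count. The stage
 arrangement is an assumption (the text does not give the actual gate
 layout), all names are hypothetical, and the leading one detector is
 simplified to examine the whole word rather than the mantissa field.

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1


def rotate_stage(word, amount):
    """One fixed-distance rotate-left stage; amount is 1, 2, 4, 8, 16, or 32."""
    return ((word << amount) | (word >> (WIDTH - amount))) & MASK


def barrel_rotate_left(word, count):
    """Rotate left by `count` (0 to 63) by enabling the stages named in its bits."""
    for stage in range(6):
        if (count >> stage) & 1:
            word = rotate_stage(word, 1 << stage)
    return word


def leading_one_count(word):
    """Number of zero bits to the left of the leftmost one (WIDTH if there is none)."""
    return WIDTH - word.bit_length()


def normalize(word):
    """Bring the leftmost one to the top of the word in a single pass."""
    return barrel_rotate_left(word, leading_one_count(word) % WIDTH)


print(hex(normalize(0xF0)))  # 0xf000000000000000
```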
 
 2.3 The Control Unit 
 
 Figure 3 shows the major sections of the control unit. The
 advanced station (ADVAST) portion decodes, interprets, and controls
 the flow of all instructions executed in the quadrant. The ADVAST has
 four 64-bit accumulators designated ACAR 0 through ACAR 3, and a local
 storage area consisting of 64 integrated circuit registers known as
 the ADVAST data buffer (ADB). ADVAST also has a 24-bit address
 arithmetic unit which can use any of the four accumulators as an index
 register divided as shown below:
 
 Field:   Not used   Increment field   Limit field   Index field
 Width:   1 bit      15 bits           24 bits       24 bits
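
 A hedged sketch of this layout follows, together with the sign rule for
 the increment field that the next paragraph describes. The function
 names are hypothetical. Treating the remaining 14 bits of the increment
 field as a magnitude and wrapping the index modulo 2^24 are assumptions;
 the limit field is only unpacked because its use is not described here.

```python
FIELD24 = (1 << 24) - 1


def unpack_acar(word):
    """Split a 64-bit ACAR into (sign, increment magnitude, limit, index).

    Bit 0 is the leftmost bit, so the index field occupies the low 24 bits.
    """
    index = word & FIELD24
    limit = (word >> 24) & FIELD24
    increment_field = (word >> 48) & 0x7FFF   # bits 1 through 15
    sign = (increment_field >> 14) & 1        # bit 1 of the ACAR
    magnitude = increment_field & 0x3FFF      # assumed 14-bit magnitude
    return sign, magnitude, limit, index


def step_index(word):
    """Add or subtract the increment and return the updated ACAR word."""
    sign, magnitude, _limit, index = unpack_acar(word)
    if sign:
        index = (index - magnitude) & FIELD24
    else:
        index = (index + magnitude) & FIELD24
    return (word & ~FIELD24) | index


forward = (3 << 48) | (100 << 24) | 10
backward = (((1 << 14) | 3) << 48) | 10
print(unpack_acar(step_index(forward))[3])   # 13
print(unpack_acar(step_index(backward))[3])  # 7
```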
 
 The first bit of the increment field (bit 1 of the ACAR) acts as a 
 sign for the increment. Indexing instructions that use the increment 
 field to modify the index field will subtract the increment from the 
 index if bit 1 is a one, and will add the increment and index otherwise. 
 Although the arithmetic section is limited to 24 bits, ADVAST can apply 
 any of the standard Boolean operations to 64-bit operands, and can 
 set, clear, complement, or test any of the individual bits in an 
 accumulator. The C.U. can examine the mode bit settings of the P.E.s 
 in its quadrant with an instruction that sets each of the 64 bits in 
 a designated ACAR to agree with the value of the selected mode bit 
 in a corresponding P.E. For example, if the E bit were selected, a 
 one in the last position of the accumulator would indicate that the 
 E bit of P.E. 63 was set to one (and that P.E. 63 was enabled). 
 An E bit pattern with only zeros indicates that all of the P.E.s are
 disabled, while the complementary pattern of ones indicates that all
 P.E.s are enabled. Such patterns may be detected by C.U. test
 instructions that check for all zeros or all ones in the entire 64-bit
 accumulator or in the rightmost 24 bits of it. Unlike P.E. test
 operations, which merely cause status bits to be set, C.U. tests
 control the instruction flow by skipping forward or backward in the
 instruction stream. A pattern in an ACAR may be transferred to the
 P.E. mode registers so that each of the 64 bits in the pattern sets
 the selected mode register bit in a different P.E. A novel feature of
 the C.U. is the leading one detection hardware which can replace a
 64-bit pattern in one of the accumulators with a binary integer that
 represents the position of the leftmost one (or zero) in the pattern.
 This facility is a boon for finding special characters or subfields
 in a long chain of characters.

 [Figure 3 - Illiac IV Control Unit. The advanced station (ADVAST)
 contains the 64-word ADVAST data buffer (ADB), accumulators ACAR 0
 through ACAR 3, a 24-bit adder, and a 64-bit logic unit. The final
 station (FINST) contains the 8-word final queue (FINQ) and the P.E.
 sequencer, which sends P.E. instructions, operands, and control signals
 to the P.E.s. The remaining sections are the instruction look-ahead
 unit (ILA), the memory service unit (MSU), which connects to and from
 the P.E. memories, and the test-maintenance unit (TMU).]
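
 The mode-bit pattern and leading one detection described above can be
 modeled as follows. This is a minimal sketch with hypothetical function
 names; P.E. 0 is mapped to the leftmost accumulator bit, as implied by
 the statement that the last position corresponds to P.E. 63, and the
 value returned for an all-zero pattern (None) is an assumption.

```python
def gather_mode_bit(bits):
    """Pack one mode bit from each P.E. into an accumulator, P.E. 0 leftmost."""
    word = 0
    for bit in bits:
        word = (word << 1) | (bit & 1)
    return word


def scatter_mode_bit(word, num_pes=64):
    """Inverse of gather_mode_bit: one bit of the pattern for each P.E."""
    return [(word >> (num_pes - 1 - pe)) & 1 for pe in range(num_pes)]


def leading_one(word, width=64):
    """Position of the leftmost one, counted from the left; None if no bit is set."""
    if word == 0:
        return None
    return width - word.bit_length()


enable = [0] * 64
enable[5] = enable[9] = 1            # only P.E.s 5 and 9 passed some test
pattern = gather_mode_bit(enable)
print(leading_one(pattern))          # 5
```

 If each P.E. sets a mode bit when its characters contain a delimiter,
 the leading one of the gathered pattern names the first P.E. that holds
 one.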
 
 The second major section of the control unit, the final station 
 or FINST, receives only instructions that require P.E. action. FINST 
 converts the P.E. instructions into sequences of appropriate enable 
 and gate signals and transmits these signals to all P.E.s in the 
 quadrant over a system of about 260 control lines. FINST also sends 
 any operands or addresses needed from the C.U. by "broadcasting" the 
 required data over a 64-bit common data bus (CDB). 
 
 Since ADVAST controls the instruction stream, all instructions 
 pass through it for decoding first. If the operation involves only 
 control unit hardware (a C.U. instruction), the ADVAST completes the 
 operation so that the instruction never reaches FINST. If the 
 instruction is a P.E. instruction, ADVAST decodes it, provides any
 indexing operations necessary at the control unit level, and passes
 the recoded instruction and an operand, if required, to FINST for 
 disposal. Thus some instructions may be entirely processed by ADVAST 
 while others may pause in ADVAST only long enough for decoding before
 being sent to FINST for execution. To avoid situations where either 
 ADVAST or FINST is idle waiting for the other section, the instruction- 
 operand pairs are passed from ADVAST to FINST through an eight word 
 first-in, first-out final queue named FINQ. Occasionally, ADVAST will 
 require results from the previous FINST operation, usually when reading 
 test results from the P.E.s. In such a case ADVAST is halted until 
 the FINQ is empty. Otherwise, FINQ allows for a considerable amount 
 of overlap between C.U. and P.E. instructions. This FINQ overlap 
 ability makes program timing estimation difficult, since the total 
 execution time is rarely the sum of ADVAST (C.U.) and FINST (P.E.) 
 time, although it cannot exceed this sum. 
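
 A toy timing model shows why the overlapped time falls between the
 larger of the two station times and their sum. Everything in it is an
 assumption made for illustration: the function name, the instruction
 tuples, the time units, and the rule that a FINQ entry is released when
 FINST starts executing it. It also omits the case in which ADVAST must
 wait for the queue to drain before reading test results.

```python
def overlapped_time(instructions, queue_size=8):
    """Estimate total time for a list of (kind, advast_time, finst_time) tuples.

    kind is "CU" (handled entirely by ADVAST) or "PE" (decoded by ADVAST,
    then queued for FINST).
    """
    advast_free = 0.0   # time at which ADVAST finishes its current work
    finst_free = 0.0    # time at which FINST finishes its current work
    starts = []         # FINST start time of every P.E. instruction so far
    for kind, advast_time, finst_time in instructions:
        if kind == "PE" and len(starts) >= queue_size:
            # Queue full: wait until an older entry has been taken by FINST.
            advast_free = max(advast_free, starts[-queue_size])
        advast_free += advast_time
        if kind == "PE":
            start = max(advast_free, finst_free)
            starts.append(start)
            finst_free = start + finst_time
    return max(advast_free, finst_free)


program = [("CU", 2, 0), ("PE", 1, 4), ("CU", 3, 0)]
print(overlapped_time(program))  # 7.0, against 6 units of ADVAST plus 4 of FINST
```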
 
 The three other control unit sections are of little interest 
 here. The instruction look-ahead (ILA) tries to maintain a supply 
 of new instructions in a set of 64 integrated circuit instruction 
 storage registers known as the instruction word store. This is done 
 by fetching blocks of eight instruction words (there are two 32-bit 
 instructions per word) from quadrant memory. Access to quadrant 
 memory is controlled by the C.U.'s memory service unit (MSU). The
 remaining control unit section is a test-maintenance unit (TMU).
 
 2.4 Memory Addressing 
 
 Each processing element is provided with a P.E. memory unit 
 (P.E.M.) consisting of 2048 words of 64-bit thin film memory. Thus
 each quadrant contains an aggregate of 131,072 words of memory which
 will be referred to as quadrant memory. P.E. memory addresses usually
 originate in the address field of a P.E. instruction. The control 
 unit extracts this field when the instruction is decoded and may 
 increment or decrement this address by the contents of one of four C.U. 
 accumulator registers for a first level of indexing. The address is 
 then distributed to all P.E.s in the quadrant where it may be added 
 to or subtracted from the contents of each P.E.'s index register 
 (register X) to provide a second level of indexing. (Note that 
 hardware design considerations have dictated that the address be 
 subtracted from the index register contents instead of the more usual 
 subtraction of the index register contents from the incoming address. 
 Some index subtraction operations in the sequel may be confusing if 
 this requirement is not kept in mind.) The 16 bit address produced 
 by the P.E. hardware to access one of the 2048 P.E.M. locations will 
 be known as a P.E.M. address. (The memory hardware, however, uses
 only the least significant 11 bits of this 16 bit field.) Normally, 
 an instruction that references a P.E.M. address will cause 64 memory 
 locations to be simultaneously accessed, one in each P.E.M. of the 
 quadrant. If no indexing is performed in the P.E.s, each location in 
 this block of 64 quadrant memory locations will have the same P.E.M. 
 address. Such a block will be called a slice. One of the 64 words
 or slice elements in the slice will be located in P.E.M. 0. This
 word will be defined as the slice leader of the slice. Since indexing
 in the P.E.s is possible, however, the locations actually accessed 
 all may not be in the same slice. 
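
 The second level of indexing can be sketched as below. The names
 `pem_address` and `fetch` are hypothetical, the memories are plain
 Python lists, and the model covers only what is stated above: 16-bit
 address arithmetic, of which the memory uses the low 11 bits, with the
 address subtracted from the index register in the subtract case.

```python
PEM_WORDS = 2048


def pem_address(address, x=0, subtract=False):
    """Combine the broadcast address with one P.E.'s index register X."""
    raw = (x - address) if subtract else (x + address)
    full = raw & 0xFFFF            # address arithmetic is modulo 2**16
    return full & (PEM_WORDS - 1)  # memory hardware uses the low 11 bits


def fetch(memories, address, index_registers=None, subtract=False):
    """Read one word from every P.E. memory, as a single instruction would."""
    if index_registers is None:
        index_registers = [0] * len(memories)
    return [mem[pem_address(address, x, subtract)]
            for mem, x in zip(memories, index_registers)]


# Label every word with its P.E. number and location so reads are easy to check.
memories = [[pe * 10000 + loc for loc in range(PEM_WORDS)] for pe in range(64)]

whole_slice = fetch(memories, 5)                  # no indexing: one slice
print(whole_slice[0])                             # 5, the slice leader
skewed = fetch(memories, 5, list(range(64)))      # X = P.E. number
print(skewed[3])                                  # 30008: location 8 of P.E. 3
```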
 
 The control units have no memory except for a small buffer, 
 and obtain instructions and operands from the P.E.M.s. To permit the 
 referencing of any location in the four quadrant memories, a quadrant 
 address of 24 bits is created by appending eight bits to the least 
 significant end of the 16-bit P.E. address. The most significant two 
 bits of the 8-bit addition indicate the desired quadrant, while the
 remaining six bits locate a specific P.E.M. in that quadrant. In the
 single quadrant mode, the quadrant specification bits become irrelevant,
 and the address arithmetic is arranged so that, as the addresses are
 incremented, the successor to a given address occurs at the same
 location in the next higher numbered P.E., except that successors to
 addresses in the last P.E. (P.E. 63) are located in the next higher
 numbered location in the first P.E. (P.E. 0). The 24-bit quadrant
 address has the following format: 
 
 Field:   P.E. memory location      Quad number               P.E. number
 Width:   16 bits                   2 bits                    6 bits
 Note:    (The most significant     (This field should not
          five bits are not         change if a single
          currently meaningful.)    quadrant is being used.)
 
 Note that 11 of the first 16 bits of a quadrant address act as a
 slice address, specifying which of the 2048 slices contains the desired
 address, while the last six bits locate the specific slice element
 desired.
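
 The address format can be exercised with a short sketch. The function
 names are hypothetical; the field positions follow the format given
 above, and `successor` models only the single quadrant rule, leaving
 the quadrant bits untouched.

```python
def pack_quadrant_address(location, quadrant, pe):
    """Build a 24-bit quadrant address from its three fields."""
    return ((location & 0xFFFF) << 8) | ((quadrant & 0x3) << 6) | (pe & 0x3F)


def unpack_quadrant_address(address):
    """Return (P.E. memory location, quad number, P.E. number)."""
    return (address >> 8) & 0xFFFF, (address >> 6) & 0x3, address & 0x3F


def slice_address(address):
    """The 11 location bits that select one of the 2048 slices."""
    return (address >> 8) & 0x7FF


def successor(address):
    """Next address in single quadrant mode: next P.E., wrapping to the next location."""
    location, quadrant, pe = unpack_quadrant_address(address)
    if pe == 63:
        location = (location + 1) & 0xFFFF
        pe = 0
    else:
        pe += 1
    return pack_quadrant_address(location, quadrant, pe)


last = pack_quadrant_address(5, 0, 63)
print(unpack_quadrant_address(successor(last)))  # (6, 0, 0)
```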
 
 The array may operate with the four quadrants completely 
 separated or else two or more of the quadrants may execute the same 
 instruction stream and coordinate their routing operations to form a 
 larger array. Only a single quadrant will be used for the problem
 in this study. If the full array of 256 P.E.s were available for
 compilation, each quadrant would probably be assigned to a separate
 compilation task. The majority of the Illiac IV supervisory programs
 reside in a B6500 commercial computer which acts as the controller 
 for the array. The B6500 controls an extremely large disk file 
 system with transfer rates up to 5 x 10° bits/second. Since 
 Input/Output operations are not directly controlled by the array,
 this study will assume that all of the necessary information is 
 available in quadrant memory. 
 
 3. RECOGNIZER SPECIFICATIONS 
 
 3.1 The Source Language 
 
 To see how a parallel processor may be used for program 
 translation, we will study an Illiac IV assembly language coding of
 a recognizer for a dialect of the ALGOL programming language known
 as Burroughs Extended ALGOL. Thus the source material should conform
 to the rules specified in the 1966 edition of the Burroughs Extended
 ALGOL Language Manual, with a few exceptions. The Burroughs Extended
 ALGOL is defined for a set of 63 characters, each composed of six bits.
 Since the smallest word subdivision provided in the Illiac IV instruction
 set is an eight bit "byte," and since newer communication
 standards such as EBCDIC (Extended BCD) and ASCII (American Standard 
 Code for Information Interchange) provide for an eight bit character 
 size, a source string of eight bit characters will be assumed. The 
 first two bits of each character will be assumed to be zeros, and 
 the remaining six bits will be used to define input symbols according 
 to the coding of appendix B-l of the Burroughs Extended ALGOL Language 
 Manual. All constructs allowed by the manual should be properly pro- 
 cessed by the recognizer with the exception of the COMMENT construct 
 and a more stringent limitation on the use of reserved words. 
 
 A conventional compiler usually calls upon the recognizer to 
 process only one token before returning control back to the analyzer. 
 However, if the full capabilities of the Illiac IV array are to be 
 realized, it is apparent that more than one token must be processed 
 during each recognizer cycle. With the eight-bit instructions, each
 Illiac IV quadrant can simultaneously manipulate 512 characters.
 Since the specifications for Burroughs Extended ALGOL provide for
 tokens from 1 to 63 characters long, the recognizer is able to examine 
 several tokens simultaneously. Thus the parallel recognizer should 
 concurrently build and classify several tokens which may not be in 
 the same class. 
 
 ALGOL is built upon four classes of tokens: delimiters, 
 numbers, identifiers, and strings. The Burroughs specifications 
 further require that identifier and string tokens consist of 1 to 63 
 contiguous characters. Two types of delimiter tokens appear, reserved 
 words such as BEGIN, FOR, PROCEDURE, and the set of single character 
 delimiters or "special" characters consisting of character symbols 
 that are neither numeric nor alphabetic symbols. Care must be taken 
 to insure that delimiter symbols which are components of string tokens 
 are not confused with delimiter symbols occurring in the normal
 context. This confusion arising from the string construct is resolved 
 quite naturally by recognizers that scan serially, but it presents a 
 problem for parallel recognizers. 
 
 3.2 Internal Identifiers 
 
 The final goal of the recognizer is the production of 
 internal identifiers in the following format: 
 
      HEADER                            POINTER

 Bit  0      7    8               39    40      63
 
 All internal identifiers begin with an eight-bit header. If the
 internal identifier represents a delimiter token, these eight bits
 are all zero. Otherwise, the eight bits of the header may be
 represented as follows:
 
 ABNNNNNN 
 
 The two bits labelled A and B identify the classification of the token 
 according to the following convention: 
 
   A        B

   0        0        Number token
   1        0        Identifier token
   0 or 1   1        String token
 
 The six bits designated N form a binary integer that gives the 
 number of symbols included in the token. Delimiter tokens belong to 
 a pre-defined set and are not filed in a token table, so their 
 internal identifiers can be simplified. Delimiters that appear in 
 the input block as a single non-alphanumeric symbol can be represented
 by identifier words having the original symbol in the rightmost eight
 bits and zeros elsewhere. Reserved word delimiters such as BEGIN,
 PROCEDURE, FOR, etc. can be replaced by eight bit quantifiers by
 assigning each word a unique integer between 63 and 256. These
 quantifiers can then be used as the last eight bits of an identifier
 word that has zeros elsewhere. Since input symbols are limited to
 six bits, the eight-bit quantifiers can represent all possible
 non-alphanumeric symbols as well as over 190 reserved word delimiter
 tokens without conflict. Only the last eight bits of the delimiter
 token pointer field are used to hold the actual delimiter symbol or
 a pseudo-symbol for reserved word delimiters. If the internal
 identifier replaces a number, string, or identifier token, the pointer 
 field will contain the memory location in a table of the beginning of 
 the stored token. 
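
 The internal identifier format can be sketched as a pair of packing
 routines. All names are hypothetical. The class bits A and B are
 supplied by the caller, and placing a 24-bit pointer in the rightmost
 bits of the word is an assumption drawn from the bit positions 40 to 63
 shown in the format above.

```python
def make_header(a, b, count):
    """Eight-bit header ABNNNNNN: two class bits and a six-bit character count."""
    if not 1 <= count <= 63:
        raise ValueError("a token holds 1 to 63 symbols")
    return ((a & 1) << 7) | ((b & 1) << 6) | count


def table_identifier(a, b, count, pointer):
    """Internal identifier for a number, identifier, or string token."""
    return (make_header(a, b, count) << 56) | (pointer & 0xFFFFFF)


def delimiter_identifier(code):
    """Internal identifier for a delimiter: the code in the rightmost eight bits."""
    if not 0 <= code <= 255:
        raise ValueError("delimiter codes are eight bits")
    return code


def is_delimiter(word):
    """A zero header marks a delimiter."""
    return (word >> 56) == 0


word = table_identifier(1, 0, 10, 0x000123)
print(hex(word >> 56), hex(word & 0xFFFFFF))  # 0x8a 0x123
print(is_delimiter(word), is_delimiter(delimiter_identifier(200)))  # False True
```

 Because every non-delimiter token has a count of at least one, its
 header can never be zero, so the test in `is_delimiter` is unambiguous.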
 
 3.3 Tokens 
 
 Tokens that are not delimiters are arranged to facilitate 
 filing them in a table. Each of these tokens begins with the same 
 eight-bit header that is used in the corresponding internal identifier.
 The two classification bits of the header allow number, string,
 and identifier tokens to be held in the same table. The character 
 count indicates the number of bytes following the header that contain 
 meaningful characters, and it can be used to determine the number of 
 words occupied by the stored token. Obviously, no token may occupy 
 more than eight words. Delimiter tokens contain exactly one input 
 symbol which is placed in the rightmost eight bits of the one word 
 token. The rest of the word, including the header field, consists of 
 zeros so that a delimiter token and its internal identifier are identi- 
 cal. The zeros in the header field distinguish a delimiter token from 
 the other token types. Since the header is an essential part of 
 every token, any subsequent use of the term token will be to refer to 
 a construct having an eight bit token header followed by a symbol 
 tail. The symbol tail is made up of the input symbols of the identi- 
 fier, number, or string followed by enough zeros to reach a word 
 boundary. The symbol tail for a delimiter consists of fifty-six
 zeros followed by the original delimiter character. Thus, tokens will
 appear in one of the following formats:
 
 DELIMITER TOKEN:

   HEADER                                          DELIMITER
   (All zeros)                                     SYMBOL

 NUMBER, IDENTIFIER, OR STRING TOKEN:

   ABNNNNNN    XXXXXXXX        YYYYYYYY          ...    00000000
   HEADER      FIRST SYMBOL    SECOND SYMBOL            ZEROS TO END OF WORD
                                                        (If necessary)

 (A and B represent the two bit classification code. The N's
 form a six bit character count.)
 
 Although the recognizer could be required to convert number tokens to 
 a machine format, this step will be left for the analyzer. Number 
 tokens will be formed exclusively from digit symbols and will be added 
 to the table as if they were identifier or string tokens. Thus, 
 floating point quantities may appear as several number tokens separated 
 by appropriate delimiters. 
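
 The stored token layout can be illustrated by a routine that packs a
 header and a symbol tail into 64-bit words. The names `build_token` and
 `delimiter_token` are hypothetical, the class bits are supplied by the
 caller, and symbols are given as small integers standing for the
 six-bit character codes.

```python
def build_token(a, b, symbols):
    """Return the 64-bit words of a number, identifier, or string token."""
    count = len(symbols)
    if not 1 <= count <= 63:
        raise ValueError("a token holds 1 to 63 symbols")
    header = ((a & 1) << 7) | ((b & 1) << 6) | count
    data = bytes([header]) + bytes(symbols)
    data += bytes(-len(data) % 8)               # zeros to the end of the word
    return [int.from_bytes(data[i:i + 8], "big") for i in range(0, len(data), 8)]


def delimiter_token(symbol):
    """One word: fifty-six zeros followed by the delimiter symbol."""
    return [symbol & 0xFF]


print([hex(w) for w in build_token(1, 0, [1, 2, 3])])  # ['0x8301020300000000']
print(len(build_token(1, 0, [5] * 63)))                # 8
```

 The longest token, one header byte plus 63 symbols, fills exactly 64
 bytes, which is the eight-word limit noted above.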
 
 3.4 Sections 
 
 The recognizer is confronted with a block of 512 characters, 
 many of them blank, that form tokens containing from 1 to 63 symbols 
 each. Each of the 64 P.E.s in a quadrant holds eight of the input 
 characters, thus partitioning the 512 input symbols into
 eight-character sub-blocks. The P.E. boundaries also divide the token
 symbols into sections of eight symbols or less. Consider, for example,
 the following input statement: 
 
 ANS ← IDENTIFIER + 24 TIMES SUM;
 
 This statement could appear in P.E. memory with the following grouping: 
 
 P.E. 0        P.E. 1        P.E. 2        P.E. 3

 ANS ← I       DENTIFIE      R+24 TIM      ES SUM;
 
 
 Note that TIMES is not an identifier token, but a reserved word which
 replaces the delimiter symbol ×. Since reserved words have the same
 structure as identifier tokens, the token building routine treats them
 as identifier tokens which will be recognized as delimiters later in
 the table maintenance process. To minimize interference from the P.E.
 boundaries, the scanner first gathers the characters into sections
 using the section builder, and then assembles the sections into complete
 tokens with the section joiner. Sections created by section builder
 will ultimately be combined to become tokens, so they are constructed
 with the same format as tokens; number, identifier, or string sections
 have an eight-bit header followed by a symbol tail, while delimiter
 sections consist of a single word containing a delimiter symbol
 preceded by fifty-six zeros. Obviously, the symbol tail of a section
 contains at most eight symbols. If all of the symbols of a token
 occur within the same P.E., the section formed from these symbols will
 be a complete token. This is always the case for delimiter tokens
 that are not reserved words, since, as noted before, section builder
 treats reserved words as identifiers and can thus assume that all
 delimiter tokens contain exactly one character. All of the tokens
 in the example except the tokens for the identifier named IDENTIFIER 
 and the reserved word TIMES generate sections that are complete tokens. 
 The token for TIMES is built from two sections; the token for 
 IDENTIFIER requires three sections. 
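
 The section counts quoted for the example can be checked with a small sketch. It is only an illustration in Python: the exact placement of blanks in the 32-character block is an assumption chosen to reproduce the grouping shown earlier, and the helper name sections_per_token is hypothetical. Runs of letters and digits stand in for identifier, number, and reserved word tokens; string tokens are ignored.

```python
PE_WIDTH = 8  # each P.E. holds eight input characters

# Assumed layout of the example statement across four P.E.s.
EXAMPLE = "ANS  \u2190 IDENTIFIER+24 TIMES SUM; "


def sections_per_token(block: str) -> dict[str, int]:
    """Count how many P.E.s (and hence sections) each token spans."""
    result = {}
    i = 0
    while i < len(block):
        if block[i].isalnum():
            j = i
            while j < len(block) and block[j].isalnum():
                j += 1
            first_pe = i // PE_WIDTH
            last_pe = (j - 1) // PE_WIDTH
            result[block[i:j]] = last_pe - first_pe + 1
            i = j
        else:
            i += 1  # blanks and delimiters never span a boundary
    return result
```

 With this layout the sketch reports three sections for IDENTIFIER, two for TIMES, and one for each of ANS, 24, and SUM.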
 
 4. THE CHARACTER CLASSIFIER 
 
 4.1 The Classification Process 
 
 Section builder does not directly examine the input symbols 
 to determine the action to be taken, but is guided by a control string 
 in which there is an eight-bit control byte corresponding to each input 
 character. This control string is generated by a character classifier 
 that begins by placing each of the 512 input symbols into one of the 
 following classes: 
 
 α-symbol: An alphabetic letter.
 β-symbol: A blank symbol that does not belong in a string token.
 γ-symbol: A symbol that is part of a string token. (A symbol
           properly enclosed by quote symbols.)
 δ-symbol: A delimiter symbol, that is, a symbol that does not fit
           into any of the other classifications.
 ι-symbol: A numeric symbol (digit).
 φ-symbol: A null character. (Should not appear in any output
           token.)
 Conventional recognizers apply a series of tests to each individual
 character, using the results of one test to determine the next test
 until the symbol is identified with a specific symbol class. The
 parallel recognizer uses the eight-bit relational test instructions to
 test all of the 512 input characters simultaneously. The only
 characters that change the classification of other characters are the
 quote characters which always signal the presence of string tokens,
 so the classifier checks for these symbols first. If quote symbols
 are present, every odd-numbered quote symbol should be the beginning
 of a string token that will be terminated by the next even-numbered
 quote character. If the previous block of source symbols ended
 with an uncompleted string token, then the roles of the odd and even
 numbered quote symbols are interchanged. If any quote symbols are
 detected, a marker bit is generated for each symbol in the string
 tokens. To do this, the bits that mark quote symbols are shifted
 right one character position in each P.E. and added modulo two. The
 process is repeated eight times. The bit corresponding to the rightmost
 character in each P.E. now indicates if an odd number of
 quote signs occurred in that P.E. These bits are accumulated in an
 ACAR and the process is repeated for 63 shifts. Each one in the ACAR
 now indicates that the markers in the P.E. corresponding to that ACAR
 bit position should be complemented. When this has been done, the
 newly generated bits mark all of the symbols that belong to a string
 token.
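
 The modulo two scheme amounts to a running parity of the quote markers, first inside each P.E. and then across P.E.s. The following Python sketch shows the idea under stated assumptions: the function name string_markers is hypothetical, the carry between P.E.s is computed by a serial loop rather than by 63 parallel shifts through an ACAR, and the opening quote itself receives a marker while the closing quote does not. How the report treats the quote symbols themselves is not stated here.

```python
from itertools import accumulate
from operator import xor

PE_WIDTH = 8  # characters per P.E.


def string_markers(block: str, quote: str = '"', open_at_start: bool = False) -> list[int]:
    """Mark symbols lying inside string tokens by quote parity."""
    quote_bits = [int(ch == quote) for ch in block]
    pes = [quote_bits[i:i + PE_WIDTH] for i in range(0, len(quote_bits), PE_WIDTH)]

    # Step 1: running parity inside each P.E. (shift and add modulo two).
    local = [list(accumulate(pe, xor)) for pe in pes]

    # Step 2: the rightmost bit of each P.E. is the parity of that P.E.;
    # a P.E. is complemented when an odd number of quotes lies to its left.
    markers = []
    carry = int(open_at_start)  # previous block ended inside a string
    for pe in local:
        markers.extend(bit ^ carry for bit in pe)
        carry ^= pe[-1]
    return markers
```

 The result agrees with a simple left-to-right scan that toggles a flag at every quote symbol, which is the serial equivalent of the parallel routine.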
 
 Unfortunately, the routine just described breaks down if a 
 "three-quote" construct is encountered. This construct arises be- 
 cause the Burroughs character set does not include the right and 
 left quote symbols used in the ALGOL report. Thus, while quote 
 characters may be readily inserted into the string tokens described 
 in the ALGOL report, a group of three successive quote symbols is 
 required to represent a single quote character inside of a string 
 token in Burroughs Extended ALGOL. Since three consecutive quotes 
 upset the modulo two scheme used in generating the string symbol 
 markers, such groups must be detected and remedial action must be
 taken first. This is done by detecting pairs of adjacent quote symbols,
 deleting their markers from the set of quote markers, and then marking 
 them as null characters to prevent their appearance in the output 
 tokens. If a single quote symbol follows such a pair, the quote marker 
 corresponding to this symbol is deleted so that the now unmarked quote 
 sign will be ignored by the modulo two routine. Sequences of several 
 contiguous quote symbols must be reduced by repeated use of the quote
 pair remover. An unfortunate by-product of the removal of quote symbol 
 pairs is a shrinkage in the length of the token. As we will see, the 
 possibility that there may be gaps in the symbol tails necessitates 
 the use of more general (and thus more complicated) section builder 
 and section joiner schemes. 
 
 After string components have been properly identified, the
 character classifier uses a series of eight-bit inequality tests to
 separate numeric and alphabetic characters from the delimiter characters.
 Digits may be easily isolated, but the Burroughs coding requires
 several inequality tests to identify the alphabetic characters, as
 they occur in three groups separated by delimiter symbols. This is
 of little concern to a parallel recognizer since each step is being
 applied to 512 characters instead of just one symbol. The results
 of these tests are recorded in a marker string. Each eight-bit byte
 of the marker string corresponds to one of the input symbols and
 indicates the classification of that symbol using the following coding:
 
 
 
     Bit position:   0    1    2    3    4    5    6    7
     Indicator:      -    φ    β    -    γ    ι    δ    α
 
 Bits 0 and 3 are not used. The example input statement and its
 corresponding marker string are given below:
 
     Input symbols (P.E. 0 through P.E. 3)                      Marker
     Letters: A N S, I D E N T I F I E R, T I M E S, S U M        α
     Digits: 2 4                                                  ι
     Delimiters: ← + ;                                            δ
     Blanks                                                       β
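
 The classification step can be sketched as follows. This is an illustration only: it is written in Python, it uses ASCII range tests in place of the several inequality tests that the Burroughs code requires, and it handles one character at a time where the recognizer tests all 512 at once. The function name classify and the choice of a zero byte for the null character are assumptions.

```python
def classify(block: str, in_string: list[bool], null: str = "\0") -> str:
    """Return one class letter per input symbol.

    in_string[i] is the string marker for symbol i, produced beforehand
    by the quote parity routine.
    """
    classes = []
    for ch, marked in zip(block, in_string):
        if ch == null:
            classes.append("φ")  # null character
        elif marked:
            classes.append("γ")  # part of a string token
        elif ch == " ":
            classes.append("β")  # blank outside a string
        elif "0" <= ch <= "9":
            classes.append("ι")  # numeric symbol
        elif "A" <= ch <= "Z":
            classes.append("α")  # alphabetic letter
        else:
            classes.append("δ")  # anything else is a delimiter
    return "".join(classes)
```

 For the eight characters R+24 TIM, none of which lies in a string, the sketch yields the classes α δ ι ι β α α α.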
 
 4.2 The Control String 
 
 The marker string is easily converted into a control string in
 which each eight-bit byte is coded in the following manner:

     Bit position:   0    1    2    3    4    5    6    7
     Indicator:      Σ    Σ    0    0    γ    λ    δ    α

 The α, γ, and δ indicators have the same meaning as they did in
 the marker string. The two new bits of control information have the
 following significance:
 
 λ: Indicates a null character or a blank character that is
    not contained in a string token. (Should not appear in
    any output token.)
 Σ: Section store indicator. Marks either a delimiter symbol
    or a blank symbol that is not contained in a string if
    the preceding symbol is not a delimiter symbol or a
    non-string blank symbol.
 The control string format is designed so that the control bytes may
 be loaded directly into the eight-bit P.E. mode registers to control
 the section assembly process. The section store indicators appear in
 bit positions 0 and 1 since both the E and E1 enable bits must be set
 to one to completely enable a P.E. Bit positions 2 and 3 are the fault
 bits in the mode register. These are always set to zero to avoid
 complications with the interrupt hardware. The example input statement
 would give rise to the following control string:
 
 
 
 (Figure: the example input statement in P.E. 0 through P.E. 3, with
 the control byte for each symbol shown beneath it.)
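
 The two new control indicators follow directly from the class of each symbol and of its predecessor. The sketch below, in Python, is a literal reading of the definitions given above; the function name control_flags is hypothetical, and treating the start of a block as if it followed a blank is an assumption.

```python
def control_flags(classes: str, prev_is_break: bool = True) -> list[tuple[bool, bool]]:
    """Return (section_store, suppress) for each symbol class.

    classes uses one letter per symbol: α β γ δ ι φ.
    A "break" is a delimiter or a non-string blank.
    """
    flags = []
    for c in classes:
        is_break = c in "δβ"
        section_store = is_break and not prev_is_break
        suppress = c in "φβ"  # null, or blank outside a string
        flags.append((section_store, suppress))
        prev_is_break = is_break
    return flags
```

 Only the first break after a token symbol carries the section store indicator, so a run of blanks or a delimiter that follows a blank does not trigger a second store.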
 
 
 5. THE SECTION BUILDER 
 
 5.1 The Section Builder Process 
 
 Section builder processes the eight characters held in each 
 of the 64 P.E.s, beginning with the leftmost character and moving 
 right under the direction of the control string. The input characters 
 are held in the R registers of the P.E.s. The section builder begins 
 constructing a partial section in each P.E. This partial section 
 consists of a symbol tail in register A and a header in register S. 
 The eight-bit header is kept in bit positions 53 through 60 of the
 S register, an offset of three bit positions from the right end of 
 the register. Thus, the character count of the header is incremented 
 by adding eight instead of one. The reason for this will become clear 
 later. Initially, the partial section symbol tail and header are both 
 all zeros. If an alphabetic, numeric, or string character is encountered, 
 this character is extracted from the set of input characters and added 
 to the end of the symbol tail in register A. The character count in 
 register S is also incremented. If the new symbol is an alphabetic 
 character, the first bit of the partial section header, located in 
 bit 53 of the S register, is set to one. Bit 54 is set to one when a 
 string character is added. If a delimiter symbol is encountered in 
 the input, the partial section previously being accumulated (if any) 
 is assembled into a complete section and stored in P.E. memory. The 
 delimiter symbol is then extracted from the set of input symbols, and 
 placed in the rightmost eight bits of register A to form a delimiter 
 section, which is also stored in P.E. memory. Register A and register
 S are then both cleared to begin a new partial section. The occurrence
 of a blank symbol that is not a part of a string token will
 also cause the section builder to assemble and store any unfinished 
 partial sections, but the blank symbol itself will be discarded. 
 The section builder does not need to directly examine the input symbols 
 to determine the action to be taken since its behavior is completely 
 determined by the control string. The first step in processing a new 
 input character is to load the control byte corresponding to that 
 symbol into the P.E.'s mode register. This immediately disables the 
 P.E.s that do not have partial sections to be stored. When the partial 
 section storage operation is complete, the mode register bits are 
 rearranged to enable only P.E.s that are required to extract a non- 
 blank, non-null character from the set of input characters. The 
 section builder continues in this fashion, using the mode bits to 
 insure only the correct P.E.s are active at each step until all of 
 the possible operations on the current character have been completed. 
 It then proceeds to the next character and loads a new control byte 
 into the mode register until all eight of the characters in each P.E. 
 have been assimilated. 
 
 The kernel of the section builder is the process of simultaneously
 extracting the i-th character from the eight-character input
 word in each of the enabled P.E.s. This could be done by masking a
 copy of the input word in each enabled P.E. with a pattern properly 
 positioned in an ACAR. However, since it is important not only to 
 extract a character from the word in which it is imbedded, but also 
 to shift that character to a position aligned with the end of a
 symbol tail, an alternate method which dispenses with the mask in
 favor of a series of shift operations was adopted. As noted before, 
 each P.E. is equipped with a shifter, which is capable of shifting a 
 64 bit operand in either direction, end-off or end-around, in a single 
 clock time. Even with the necessary instruction decoding and other 
 overhead, the shift instructions require only three clock times for 
 any desired shift. Furthermore, the shift count can be indexed 
 separately in each P.E., so that a single instruction may produce a 
 variety of shift lengths in the different P.E.s. The section builder
 uses two end-off shifts to extract the character from the input word 
 and place it in the rightmost eight bits of register A. The word is 
 first shifted left far enough to left justify the selected character. 
 Next, a 56 bit right shift leaves the desired character in the right- 
 most byte of a word that contains zeros elsewhere. Delimiter characters 
 in this position need no further treatment before they are stored as 
 completed tokens. Otherwise, the extracted character is added to 
 the partial section symbol tail which is then shifted left eight bits 
 so that another symbol may be added to the right end. 
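
 The two-shift extraction can be written out for a single 64-bit word. The Python sketch below is illustrative; the name extract_char is hypothetical, characters are numbered from zero at the left, and ASCII codes stand in for the Burroughs codes.

```python
MASK64 = (1 << 64) - 1  # Python integers are unbounded, so mask to 64 bits


def extract_char(word: int, i: int) -> int:
    """Extract character i (0 = leftmost) from an eight-character word."""
    left_justified = (word << (8 * i)) & MASK64  # end-off left shift
    return left_justified >> 56                  # end-off right shift of 56 bits
```

 The left shift discards the characters to the left of the one wanted, and the 56-bit right shift discards those to its right, so no mask pattern is needed. On the machine the left shift count would be indexed separately in each P.E.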
 
 When a symbol tail is to be joined with its companion header 
 to form a completed section, the exact position of the right end of 
 the symbol tail becomes unimportant, and the tail must be shifted so 
 that its leftmost symbol is aligned with the header to be added. A 
 rotation (end-around shift) to the right is used for this. The symbol 
 tails may vary from one to seven characters in length, so the shift 
 distance will vary. The correct shift counts are derived from the 
 character count portion of the header accompanying the symbol tail. 
 
 Displacing the header three bits from the right of the S register 
 effectively multiplies the character count by eight to give the 
 necessary shift count increment. After the header in the S register 
 is OR-ed with the symbol tail, an eleven bit right rotation aligns 
 the completed section for storage. 
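
 The effect of the three-bit offset can be checked numerically. The sketch below models only the S register, not the symbol tail that is OR-ed into it before the final rotation. It is written in Python, the helper names are hypothetical, and bit 0 is taken to be the leftmost bit of the 64-bit register, so that bit positions 53 through 60 lie three bits from the right end.

```python
MASK64 = (1 << 64) - 1
OFFSET = 3  # header sits three bit positions from the right end of S


def rotr64(x: int, n: int) -> int:
    """End-around right shift of a 64-bit value."""
    n %= 64
    return ((x >> n) | (x << (64 - n))) & MASK64


def count_symbol(s: int, alphabetic: bool = False, string: bool = False) -> int:
    """Record one more symbol in the partial section header held in S."""
    s += 1 << OFFSET                 # adding eight raises the count by one
    if alphabetic:
        s |= 1 << (OFFSET + 7)       # first header bit
    if string:
        s |= 1 << (OFFSET + 6)       # second header bit
    return s


def shift_count(s: int) -> int:
    """Low-order field of S read as a number: eight times the count."""
    return s & ((1 << (OFFSET + 6)) - 1)


def header_byte(s: int) -> int:
    """Leftmost byte after the eleven-bit right rotation."""
    return rotr64(s, 11) >> 56
```

 After the three letters of ANS have been counted, the low-order field reads 24, which is the number of bits occupied by three characters, and the eleven-bit rotation brings the header byte to the left end of the word.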
 
 Given the example input statement, the section builder would 
 produce the following result: 
 
                                   P.E. 0          P.E. 1            P.E. 2          P.E. 3

 R Register (Input string)         A N S  ←  I     D E N T I F I E   R + 2 4  T I M  E S  S U M ;

 S Register (Partial section
 header, character count)          1               8                 3               0

 A Register (Partial section
 symbol tails)                     I               E N T I F I E D   T I M           (empty)

 P.E. Memory                       [3] A N S       (none)            [1] R           [2] E S
                                   ←                                 +               [3] S U M
                                                                     [2] 2 4         ;

 The bracketed digits represent header bytes, and give the value of
 the character count field in the header.
 
 
 6. THE SECTION JOINER 
 
 6.1 The Section Joiner Process 
 
 Many of the sections constructed by section builder are
 complete tokens. The amalgamation of the remaining sections into
 tokens is the function of the section joiner. A P.E. that contains
 the beginning section of a multi-section token is designated as a
 RECEIVER P.E. since this P.E. receives the remaining sections of the
 token from adjacent P.E.s and adds them, one at a time, to the growing
 partial token until a completed token has been assembled. The ending
 section of a multi-section token will always be the first section in
 its P.E., which is called a GENERATOR P.E. Finally, LINK P.E.s hold
 the intermediate or LINK sections of tokens which include three or more
 sections. A P.E. which is a RECEIVER for one token may be a GENERATOR
 for another token, but a LINK P.E. must contain exactly one section
 and cannot also be either a RECEIVER or a GENERATOR P.E. The section
 joiner maintains three 64 bit patterns in separate ACAR registers to
 indicate which P.E.s are RECEIVERS, GENERATORS, or LINKS. The registers
 in each P.E. are tested to determine if any complete sections were
 stored by the P.E., if any partial sections remain in the registers
 of the P.E., or if the section store indicator of the first control
 byte in the P.E. was zero. The patterns resulting from these tests are
 used to produce the RECEIVER, GENERATOR, and LINK patterns. In the
 case of the example input statement, P.E. 0 and P.E. 2 would be
 considered as RECEIVER P.E.s, P.E. 2 and P.E. 3 would be GENERATOR
 P.E.s, and P.E. 1 would be a LINK P.E.
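
 The three roles can be reproduced for the example with a character-level sketch. It does not model the register tests described above, whose exact combination is not given; instead it looks directly at the characters on either side of each P.E. boundary. The Python function pe_roles is hypothetical, string tokens are ignored, and the blank placement in the example is an assumption.

```python
def pe_roles(pes: list[str]) -> dict[str, list[bool]]:
    """Classify each P.E. as RECEIVER, GENERATOR, and/or LINK."""
    n = len(pes)
    # A token continues out of P.E. i when token symbols meet at the boundary.
    cont_out = [
        i + 1 < n and pes[i][-1].isalnum() and pes[i + 1][0].isalnum()
        for i in range(n)
    ]
    cont_in = [False] + cont_out[:-1]
    # A LINK holds exactly one section that neither starts nor ends a token.
    link = [
        cont_in[i] and cont_out[i] and all(ch.isalnum() for ch in pes[i])
        for i in range(n)
    ]
    receiver = [cont_out[i] and not link[i] for i in range(n)]
    generator = [cont_in[i] and not link[i] for i in range(n)]
    return {"RECEIVER": receiver, "GENERATOR": generator, "LINK": link}


EXAMPLE_PES = ["ANS  \u2190 I", "DENTIFIE", "R+24 TIM", "ES SUM; "]
```

 Applied to EXAMPLE_PES, the sketch marks P.E. 0 and P.E. 2 as RECEIVERS, P.E. 2 and P.E. 3 as GENERATORS, and P.E. 1 as a LINK, in agreement with the text.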
 
 6.2 Character Count Determination 
 
 The sections are combined by merging all of the section 
 headers into a single token header and by concatenating the section 
 symbol tails. As in the assembly of single-section tokens, the symbol 
 tail of the first partial section must be aligned so that it will be 
 adjacent to the token header. Subsequent partial section symbol tails 
 are received from neighboring P.E.s and must be shifted to align them 
 with the previously assembled symbols. If the "three-quote" convention
 for inserting quote symbols into strings did not exist, all inter- 
 mediate or LINK partial sections would contain exactly eight symbols and 
 all sections except the beginning section would be shifted identically 
 during the symbol concatenation process. Unfortunately, however, the 
 "three-quote" convention can shorten the length of LINK sections, so 
 a separate shift count must be used whenever a new partial section 
 symbol tail is joined to the token. As before, the character count 
 fields of the section headers provide the necessary shift counts. The 
 R registers are used to send the character count of the first section 
 in each P.E. to lower numbered P.E.s in shift register fashion. The 
 P.E.s add the shift count of their last section to the incoming shift 
 counts to determine the number of symbols that would be obtained by 
 joining two sections. The R register contents are again routed to 
 the next lower numbered P.E.s and the addition process is repeated. 
 The resulting sums now give the number of symbols included in three 
 sections. Repeating the process and saving the sums obtained at each 
 step finally produces a counter string of eight sums. Each sum 
 represents the number of symbols accumulated in a successive stage
 of the section joining process. The same steps are applied to the
 classification bits from the headers, only the bits are "OR"-ed 
 together instead of added. These are combined with the character 
 counts to form a word containing eight potential headers in each P.E. 
 A RECEIVER P.E. forming a token from only two sections uses its right- 
 most header byte as the header for the assembled token. A P.E. that 
 forms a token using the maximum of nine sections uses its leftmost 
 header byte in that token. 
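
 The counter string can be reproduced for the four-P.E. example. In the Python sketch below, the function name counter_strings is hypothetical and the routing is end-around, as in the report's four-P.E. illustration. The first-section counts are 3, 8, 1, and 2 (ANS, DENTIFIE, R, ES), and the last-section counts left in the registers are 1, 8, 3, and 0 (I, DENTIFIE, TIM, nothing).

```python
def counter_strings(first_counts: list[int], last_counts: list[int],
                    stages: int = 8) -> list[list[int]]:
    """Accumulate, per P.E., the symbol count after each joining stage."""
    n = len(first_counts)
    r = list(first_counts)      # R registers: counts being routed
    total = list(last_counts)   # running sum in each P.E.
    sums = [[] for _ in range(n)]
    for _ in range(stages):
        r = r[1:] + r[:1]       # route to the next lower numbered P.E.
        total = [t + x for t, x in zip(total, r)]
        for pe in range(n):
            sums[pe].append(total[pe])
    return sums
```

 For P.E. 0 this gives the sums 9, 10, 12, 15, 23, 24, 26, and 29. The first sum in P.E. 2 is 5, the length of TIMES, and the second sum in P.E. 0 is 10, the length of IDENTIFIER.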
 
 6.3 Symbol Tail Concatenation 
 
 The process of joining the symbol tails is best illustrated
 by examining the steps used in assembling the multi-section tokens
 IDENTIFIER and TIMES in the example input statement. First, the
 chain of header bytes is temporarily stored in P.E. memory so that
 the information remaining from the section builder process may be
 returned to the S and A registers, with the symbols in the A
 register left justified in the register. The symbol tail of the
 first section in each P.E. is also left justified and loaded into
 the R register for routing to lower numbered P.E.s. P.E.s processing
 the example input would contain the following information at this
 point:
 
                                   P.E. 0          P.E. 1            P.E. 2          P.E. 3

 R Register (First section
 symbol tails)                     A N S           D E N T I F I E   R               E S

 S Register (Partial section
 headers, character count)         1               8                 3               0

 A Register (Partial section
 symbol tails)                     I               D E N T I F I E   T I M           (empty)

 P.E. Memory (Stored sections)     [3] A N S       (none)            [1] R           [2] E S
                                   ←                                 +               [3] S U M
                                                                     [2] 2 4         ;

 (Bracketed digits are header bytes giving the character count.)

 
 All P.E.s that are not RECEIVER P.E.s are now disabled. This does not
 affect the routing action of the R registers, but it allows only
 RECEIVER P.E.s to perform the other symbol joining operations. In the
 example, only P.E. 0 and P.E. 2 will be active initially.

 The R register contents are routed to the left to begin the
 first cycle. (The example is constructed as if there were only four
 P.E.s in the quadrant, so P.E. 0 routes to P.E. 3.) Each RECEIVER P.E.
 shifts the symbols in the A register eight bits to the right to make
 room for a header and then moves them to the B register. The incoming
 symbol tail is brought into the A register and is shifted right by
 one character position more than the number of characters given by the
 character count in the S register. The symbols in the A and B registers
 are now properly aligned. After shifting the symbols of the example
 statement, the registers of the two RECEIVER P.E.s appear as follows:
 
                   P.E. 0                 P.E. 2

 A Register        - - D E N T I F        - - - - E S - -
 B Register        - I - - - - - -        - T I M - - - -
 R Register        D E N T I F I E        E S - - - - - -
 S Register        character count 1      character count 3

 (A dash marks an empty character position.)
 
 The symbols in the A and B registers are then "OR"-ed together and
 the eight header bytes are loaded into the S register. The bit in
 the "eights" position in the character count is compared with the
 corresponding bit in the previous character count. If the bit has
 changed, register A is filled with symbols and must be stored. Bit 60
 of the S register is monitored for this purpose in the first cycle.
 In the example, P.E. 0 would be the only P.E. to store its A register
 contents. If the A register contents are stored, the symbols in the
 R register are again transferred to the A register and the symbols
 that did not fit into the A register before are passed into the B
 register using a double length shift. The pattern of GENERATOR bits
 in the C.U. is now shifted left and compared with the RECEIVER pattern.
 A GENERATOR bit that is shifted into coincidence with a RECEIVER bit
 cancels that bit and causes the corresponding P.E. to be disabled for
 the rest of the concatenation process. In the example, P.E. 2 is
 disabled in this way after the first cycle.
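
 The "eights" bit test reduces to a single exclusive-or. The Python sketch below is a minimal illustration; the function name a_register_full is hypothetical.

```python
def a_register_full(previous_count: int, new_count: int) -> bool:
    """True when the eights bit of the character count has changed."""
    return bool((previous_count ^ new_count) & 0b1000)
```

 In the first cycle of the example, the count in P.E. 0 goes from 1 to 9, so the bit changes and the A register is stored. The count in P.E. 2 goes from 3 to 5, so the bit is unchanged and nothing is stored.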
 
 The second cycle begins by routing the R register contents to
 the left again. P.E. 0 would be the only P.E. active at this point
 in the example, and the registers of this P.E. would have the following
 contents:
 
 P.E. 0

 A Register        - - D E N T I F
 B Register        I E - - - - - -
 R Register        R - - - - - - -
 S Register        29  26  24  23  15  12  10  9
                   (character counts of the eight header bytes)
 
 As before, the incoming symbols are transferred from the R register
 to the A register and shifted into alignment with the symbols in the
 B register before the symbol groups are "OR"-ed together. The cycles
 continue until the absence of RECEIVER P.E.s signals the end of the
 concatenation process. In the example, this occurs at the end of
 the second cycle. In any case, the process is stopped after eight
 cycles. All RECEIVER P.E.s are then reactivated to store any symbols
 remaining in the A registers and to insert the appropriate header
 byte in the first word of the new multi-section tokens. The tokens are
 now completely formed and are ready for storage in the token table or
 addition to the output string. At this point, the example input
 statement symbols are stored in P.E. memory as follows:
 
 
 
     P.E. 0                  P.E. 1       P.E. 2           P.E. 3

     [3] A N S               (none)
     ←                                    +                [3] S U M
     [10] I D E N T I F                   [2] 2 4          ;
     I E R                                [5] T I M E S

 (Bracketed digits are header bytes giving the character count.)

 
 7. TABLE MAINTENANCE 
 
 7.1 Table Use 
 
 After the tokens have been created, the recognizer retrieves 
 them one at a time and creates an internal identifier for each one. 
 Delimiter tokens are transferred to the output string without change. 
 Identifier tokens are compared with a reserved word table so that 
 reserved words masquerading as identifiers can be detected and re- 
 placed with an appropriate delimiter type internal identifier. Other 
 identifier tokens as well as number and string tokens are located in 
 the main token table or are entered in this table if no matching entry 
 is found. The address of the token in the main table is then combined 
 with the header of the token to create the internal identifier that 
 represents the token in the output string. 
 
 7.2 The Reserved Word Table 
 
 The recognizer uses two tables, the reserved word table and 
 the main token table. The two tables could be combined, but the 
 entries in the reserved word table are predetermined single word 
 entries that are never changed by the recognizer, so a simpler arrange- 
 ment may be used for this table. The Burroughs Extended ALGOL manual 
 lists 111 reserved words, but 58 of these are reserved words only 
 when used in certain contexts. For simplicity, the parallel recognizer 
 arbitrarily treats the entire set as reserved words in all situations. 
 (If desired, this restriction could be relaxed by treating the
 semi-reserved words as identifiers and allowing the analyzer to
 detect them.) The reserved word table entries are stored in
 consecutive quadrant memory locations and fit easily into two slices
 of 64 words each. The query word is broadcast to all P.E. accumulators
 where a P.E. test instruction is used to compare the query with the 64
 words in the first slice of the reserved word table. The test
 instructions report a successful match by setting a specified mode
 register bit in each P.E. The test bit pattern is read into an ACAR for
 examination. If no ones appear in the pattern, none of the 64 table
 entries accessed match the query. Otherwise, the leading one detector
 of the C.U. is used to identify the P.E. containing the matching
 entry. Essentially, the quadrant is being used as a 64 word associative
 memory here. Only two P.E. memory cycles are required to
 exhaustively search the reserved word table. Since the entries in the
 reserved word table are predetermined and do not change, the fastest
 conventional search technique orders these entries so that the set of
 possible matching entries is divided in half each time a comparison 
 is made. On a serial machine, such a binary search procedure would 
 require at least six memory cycles to determine that a given query 
 word is not contained in this reserved word table. The only reserved 
 word that contains more than seven characters is PROCEDURE. A test 
 for this word is made separately from the search for shorter reserved 
 words to simplify the table. 
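
 The reserved word search can be pictured as follows. The Python sketch is only a model: the inner loop stands in for the 64 simultaneous P.E. comparisons, the name associative_search is hypothetical, and the table contents in any usage are illustrative rather than the actual Burroughs list.

```python
from typing import Optional

PES = 64  # P.E.s in a quadrant, one table word per P.E. per slice


def associative_search(table: list[bytes], query: bytes) -> Optional[int]:
    """Return the table index of the query word, or None if absent."""
    for base in range(0, len(table), PES):      # one memory cycle per slice
        pattern = 0
        for pe, entry in enumerate(table[base:base + PES]):
            if entry == query:                  # done in all P.E.s at once
                pattern |= 1 << (PES - 1 - pe)  # P.E. 0 is the leftmost bit
        if pattern:
            pe = PES - pattern.bit_length()     # leading one detector
            return base + pe
    return None
```

 A table of 111 entries occupies two slices, so at most two passes through the outer loop are needed, against six or seven probes for a binary search of the same table.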
 
 7.3 The Main Token Table 
 
 The token table is much more complicated to search and maintain 
 than the reserved word table. Since new entries are constantly 
 being added to this table, it is not feasible to order the contents. 
 
 However, since the table may grow to include several hundred entries, 
 it may not be feasible to examine every table entry for each new 
 query. Conventional recognizers attack this problem by dividing the 
 table into blocks (sometimes called "buckets"). Some property of all 
 or part of the query word is used to associate this word with exactly 
 one block. The recognizer can then limit the search to the table 
 entries stored in the selected block. Obviously, this scheme is an 
 advantage only if the entries are fairly evenly distributed among the 
 blocks. Often, many of the identifiers are similar to one another. 
 For example, all of the variables associated with a certain process 
 may begin with the same letter. To keep this clustering effect 
 from adversely affecting the distribution of entries among table 
 blocks, parts of the query word are usually transformed in some way 
 to further "randomize" the keys obtained. Since Illiac IV has high 
 speed multiplication hardware, the query word is multiplied by a 
 "hash constant" to use as many bits of the query as possible in the 
 randomizing process. A four bit hash key is derived from this process. 
 The four bits designate one of the 16 hash blocks used in the table. 
 The table is designed to hold a minimum of 1024 entries with room 
 for 2048 entries under optimum conditions. The organization of the 
 table is evident from the following diagram of the table portion of 
 a typical P.E. memory:

     MEMORY ADDRESS          CONTENTS OF MEMORY CELL
     (B = Base Address)

     B - 1                   End of table pointer
     B + 0                   First word of main block #0
     B + 1                   First word of main block #1
       ...
     B + 15                  First word of main block #15
     B + 16                  Second word of main block #0
     B + 17                  Second word of main block #1
       ...
     B + 31                  Second word of main block #15
     B + 32                  First word of extension block #0
     B + 33                  First word of extension block #1
       ...
     B + 47                  First word of extension block #15
     B + 48                  Second word of extension block #0
     B + 49                  Second word of extension block #1
       ...
     B + 63                  Second word of extension block #15
     B + 64                  Continuation word #0
     B + 65                  Continuation word #1
       ...
 
 Each P.E. has space for one entry for each of the 16 main blocks. 
 Two locations are provided for each entry, the second location 16 
 words after the first one. Zeros are stored in the first word 
 locations when the table is set up so that empty spaces in a block 
 may be found by matching with a query word of all zeros. A second 
 set of 16 blocks is provided to extend main blocks with more than 64 
 entries. The extension blocks are linked to the main blocks or to 
 other extension blocks through a set of 32 link words kept in the 
 ADVAST Data Buffer of the C.U. 
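
 The address arithmetic implied by this layout is simple enough to state as code. The Python helpers below are hypothetical names for the offsets read off the memory diagram; base is the base address B within one P.E. memory.

```python
BLOCKS = 16  # hash blocks
PES = 64     # one entry per block in each P.E.


def main_entry_addresses(base: int, block: int) -> tuple[int, int]:
    """Addresses of the first and second words of a main block entry."""
    return base + block, base + BLOCKS + block


def extension_entry_addresses(base: int, block: int) -> tuple[int, int]:
    """Addresses of the first and second words of an extension block entry."""
    return base + 2 * BLOCKS + block, base + 3 * BLOCKS + block


def first_continuation_address(base: int) -> int:
    """Continuation words start after the four groups of sixteen words."""
    return base + 4 * BLOCKS


MAIN_CAPACITY = BLOCKS * PES        # 1024 entries in the main blocks
TOTAL_CAPACITY = 2 * BLOCKS * PES   # 2048 entries with every extension block used
```

 The two capacities correspond to the minimum of 1024 entries and the optimum of 2048 entries mentioned earlier.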
 
 When an entry is stored in the table, the first location in 
 the table receives the first word of the token; a header followed by 
 up to seven characters. The information stored in the second word 
 of the table depends upon the length of the token being stored. If 
 the token contains more than seven but less than sixteen characters, 
 the eighth character is stored in the leftmost byte of the second 
 word, followed by the rest of the characters of the token and enough 
 zeros to fill out the memory word. If the token contains more than 
 fifteen characters, the second word of the table entry is loaded with 
 the address of a continuation word that holds the second word of 
 the token. The remainder of the token is stored in subsequent 
 continuation words. An end of table pointer, stored in the memory 
 location preceding the first main block word, holds the address of 
 the last continuation word used. 
 
 The search for a match to a query token is begun by multiplying 
 the proper section of the first word of the query by the hash constant 
 and then extracting the four bit hash key. The hash key is added
 to a base address to obtain the slice address of the first word of
 the selected main block. The query word is compared with the set of 
 64 first word entries for the selected main block. If no matches 
 occur, the link corresponding to the main block is checked to see if 
 the block has been extended. If so, the first word entries of the 
 associated extension blocks are also compared with the first query 
 word. When a match is found, the next word of the query token is 
 broadcast to the P.E.s for matching. An examination of the token 
 character count indicates whether the second word of the entry should 
 be matched against the second word of the query token or used to 
 access a continuation word. If several long tokens with identical 
 beginning portions are stored in the table, the first comparisons 
 may produce several bits indicating matches. These bits are "ANO"-ed 
 with the bits produced by subsequent comparisons until at most one 
 match bit remains when all of the words in the query token have been 
 used. Note that this process examines several of the candidates for 
 a match in parallel, where a serial table routine might try several 
 entries that do not quite match the query before locating the correct 
 matching entry. 
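
The narrowing of match bits can be modelled serially in Python. The sketch below is an illustration under stated assumptions, not the ILLIAC IV routine: a block is a list of 64 entries (one per P.E.), each entry a tuple of words; P.E. 0 is taken to be the leftmost bit of the match word; and hash_key is a hypothetical multiplicative hash, since the text does not state which section of the word or which product bits are used.

```python
PES = 64


def pe_bit(pe):
    """Match-word mask for P.E. `pe` (assumption: P.E. 0 is the leftmost bit)."""
    return 1 << (PES - 1 - pe)


def hash_key(first_word, hash_constant, shift=32):
    """Hypothetical 4-bit hash: multiply, then extract four product bits."""
    return ((first_word * hash_constant) >> shift) & 0xF


def compare(block, i, word):
    """One broadcast comparison of `word` with word i of every entry."""
    bits = 0
    for pe, entry in enumerate(block):
        if entry is not None and i < len(entry) and entry[i] == word:
            bits |= pe_bit(pe)
    return bits


def search(block, query):
    """AND the match bits of successive comparisons; 0 means no match."""
    bits = (1 << PES) - 1
    for i, word in enumerate(query):
        bits &= compare(block, i, word)
        if bits == 0:
            break
    return bits
```

Two entries that share a first word both survive the first comparison; the second comparison leaves a single match bit. On the parallel machine each call to compare corresponds to one broadcast, so the cost grows with the number of words in the query rather than with the number of candidates.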
 
 Once an entry is located in the table, the leading one detector 
 is used to determine which P.E. holds the entry. The P.E. number is 
 combined with the slice address of the entry to produce a quadrant 
 address. It is this address that appears in the output string. 
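
A short Python sketch of this last step follows. It rests on two assumptions that the text does not spell out: that bit 0 of the match word is the leftmost bit and corresponds to P.E. 0, and that a quadrant address is 64 times the slice address plus the P.E. number, as the 64*n equates in the appendix listing suggest. Both function names are hypothetical.

```python
PES = 64


def leading_one(bits, width=PES):
    """Position of the leftmost 1 bit, counting the leftmost bit as 0."""
    if bits == 0:
        return None          # no match bit set
    return width - bits.bit_length()


def quadrant_address(slice_address, pe):
    """Combine a slice address and a P.E. number (assumed layout)."""
    return slice_address * PES + pe
```

Under these assumptions a match in P.E. 5 at slice address 284 would yield quadrant address 64 * 284 + 5 = 18181.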
 
 8. RESULTS 
 
 8.1 Execution Time Data
 
 Execution times were estimated for the parallel recognizer 
 for a variety of conditions. As noted before, the total execution 
 time is not the sum of the individual instruction execution times 
 due to the overlap between P.E. and C.U. instructions created by the 
 final queue (FINQ) and by the action of the C.U.'s instruction look- 
 ahead unit. 
 
 All intermediate calculations are expressed in Illiac IV 
 clock times. (An Illiac IV clock period is about 40 nanoseconds.) 
 However, since it is desirable to have some measure that is not 
 dependent on circuit speed, the final results are expressed in 
 equivalent memory cycles. An Illiac IV memory cycle requires six
 clocks.
 
 The routines that assemble the 512 input characters into
 tokens are evaluated separately from the table maintenance and
 output string generation procedures. The former are evaluated in
 memory cycles per input character, while the latter are judged in
 memory cycles per output token.
 
 Table 1 gives the execution times obtained for the token 
 building routines. Four possible input character sets are considered: 
 
 CASE I: Best possible case. No quote symbols appear in the 
 input and the longest token is two words in length. 
 
 CASE II: Longest possible tokens (eight words) appear but no 
 quote symbols are present in the input. 
 

                                     CASE I   CASE II   CASE III   CASE IV

 Set-up routines                      1560     1560      1560       1560
 Quote string builder and first
   cycle of quote-pair routine         --       --       1282       1282
 Quote-pair routine
   - additional cycles                 --       --        --        2340
 Marker and control
   string generator                    195      195       195        195
 Section builder                       728      728       728        728
 Section joiner
   - minimum portion                   662      662       662        662
 Section joiner
   - additional cycles                 --       595       595        595

 Total clocks                         3145     3740      5022       7362
 Equivalent memory cycles              525      624       837       1227
 Memory cycles/input character        1.03     1.22      1.63       2.40

 Table 1 - Token Building Routines - Execution Times
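
The bottom three rows of Table 1 follow from the routine times by simple arithmetic, which the Python sketch below reproduces. The clock counts are those of Table 1, with "--" entries taken as zero; rounding partial memory cycles upward is an inference from the tabulated values (3145 / 6 = 524.2 appears as 525), not something stated in the text.

```python
CLOCKS_PER_MEMORY_CYCLE = 6
INPUT_CHARACTERS = 512

# Clock counts per routine, in the row order of Table 1.
TABLE1 = {
    "CASE I":   [1560, 0, 0, 195, 728, 662, 0],
    "CASE II":  [1560, 0, 0, 195, 728, 662, 595],
    "CASE III": [1560, 1282, 0, 195, 728, 662, 595],
    "CASE IV":  [1560, 1282, 2340, 195, 728, 662, 595],
}


def summarize(clocks):
    """Return (total clocks, memory cycles, memory cycles per character)."""
    total = sum(clocks)
    cycles = -(-total // CLOCKS_PER_MEMORY_CYCLE)   # ceiling division
    return total, cycles, round(cycles / INPUT_CHARACTERS, 2)


for case, clocks in TABLE1.items():
    print(case, summarize(clocks))
```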
 
 CASE III: Longest possible tokens as well as quote symbols
 appear in the input. No more than three quote
 symbols appear contiguously.

 CASE IV: Worst possible case. This very rare input string
 includes 63 consecutive quote symbols which form a
 maximum length token.

 The execution time stays near one memory cycle per input character
 unless quote symbols are present. Even then the increase is slight
 unless an absurd number of contiguous quote symbols appear. Obviously,
 the three-quote convention would be one of the first things omitted in
 a language designed for a parallel recognizer. A conventional
 recognizer would probably operate fastest on a set of all-blank input
 symbols, but even then at least two memory cycles would be required
 for each symbol scanned. As more and more non-blank symbols,
 especially delimiter symbols, appear in the input string, a
 conventional recognizer slows noticeably. Thus the parallel token
 building routines provide a definite speed advantage when processing
 non-trivial input strings.

 The table maintenance and output string routines do not show
 a similar advantage, however. The execution time required for these
 routines depends upon a multitude of factors including the length of
 the token, the number of tokens already in the table, the distribution
 of the table entries among the table blocks, and the existence of a
 table entry that matches the query token. The execution times for
 four widely differing cases are presented in Table 2. The cases
 selected are as follows:
 

                                     CASE I   CASE II   CASE III   CASE IV

 Preparation of token for
   table search                        110      139       164        298
 Table search                           44      104        77         77
 Entry of token into
   table                                --       --        77        283
 Creation of output
   internal identifier for
   token                                24       42        42         42

 Total clocks                          178      285       360        700
 Equivalent memory cycles/token         30       48        60        117

 Table 2 - Table Maintenance Routines - Execution Times
 
 CASE I: A single-word reserved word token. (Naturally
 there is a matching entry for this token in the
 reserved word table.)

 CASE II: A single-word string token that has a matching entry
 already in the table.

 CASE III: A single-word identifier token that does not match
 any entry in the table.

 CASE IV: A maximum length token (eight words) that does not
 match any entry in the table.

 The parallel table maintenance routines show little, if any, gain over
 conventional schemes. This occurs because each token receives a
 considerable amount of individual attention from the C.U., mostly in
 the preparation of the token for the table search. An algorithm that
 performed some of the more common pre-search steps in the P.E.s would
 help to reduce the dominance of the preparation step in the total
 time. Possibly a linear table search could be used for the main token
 table, especially if fewer than 500 tokens are expected. But no matter
 what modifications are adopted, a well-designed conventional table
 routine is not easily eclipsed.
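
The weight of the preparation step can be checked directly from the Table 2 figures. The Python sketch below uses the clock counts as read from that table, with steps that do not occur in a case entered as zero; the dictionary keys and the function name are illustrative only.

```python
# Clock counts per step for the four cases of Table 2.
TABLE2 = {
    "CASE I":   {"preparation": 110, "search": 44,  "entry": 0,   "output": 24},
    "CASE II":  {"preparation": 139, "search": 104, "entry": 0,   "output": 42},
    "CASE III": {"preparation": 164, "search": 77,  "entry": 77,  "output": 42},
    "CASE IV":  {"preparation": 298, "search": 77,  "entry": 283, "output": 42},
}


def preparation_share(steps):
    """Return (total clocks, preparation time as a whole percentage)."""
    total = sum(steps.values())
    return total, round(100 * steps["preparation"] / total)


for case, steps in TABLE2.items():
    print(case, preparation_share(steps))
```

Preparation is the largest single step in every case and accounts for roughly 43 to 62 percent of the total, which is why moving pre-search work into the P.E.s is the suggested improvement.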
 
 Since conventional serial recognizers are usually embedded in
 a compiler or similar program, an exact comparison cannot be made, but
 the Illiac IV parallel computer should provide an increase in string
 processing speed of two to fifteen times. Thus the parallel computer
 should be a useful tool for string processing.
 
 REFERENCES 
 
 Burroughs Corporation. "B5500 Extended ALGOL Language Manual," 1966.

 Burroughs Corporation. "ILLIAC IV Systems Characteristics and
 Programming Manual," with change 1 of June 12, 1968.

 Graham, Robert M. "Bounded Context Translation," in Rosen, Saul,
 "Programming Systems and Languages," McGraw-Hill Book Company,
 New York, 1967.

 Naur, P., et al. "Revised Report on the Algorithmic Language ALGOL
 60," Communications of the Association for Computing Machinery,
 Volume 6, Number 1, pp. 1-17, January, 1963.

 Peterson, W. W. "Addressing for Random Access Storage," IBM Journal
 of Research and Development, Volume 1, Number 2, pp. 130-146,
 April, 1957.
 
 
 APPENDIX 
 
 ILLIAC IV ASSEMBLY LANGUAGE 
 LISTING OF PARALLEL RECOGNIZER 
 
 .QUOTSiEQU 
 .BTMSKlEGU 
 ,LIM3« EOu 
 .LIM7I EQU 
 . L I M<*>3 I EOU 
 .ENDSTIFOu 
 t 
 .HlNBRlEOU 
 
 .BLANKIEOU 
 .HAMsKjfou 
 
 ,CYC^6lEQU 
 .IOENT«FOU 
 .STRNGlEQU 
 .GENRl EQU 
 .NLlMK»EOu 
 .RCVRl EQU 
 .ACTIVlEQU 
 ,HASM« EQu 
 .HMSKI Equ 
 .WRDCTiEOU 
 .CLIMTiEou 
 .LASTXlEOU 
 .XLIMTIEQU 
 .ALIMt FQU 
 
 .PROCDlEQU 
 
 .REl EQu 
 
 .POYNTiFqu 
 
 .HOMSKIEQU 
 
 .OUTPTlFQU 
 
 .LOALFlEQU 
 * 
 
 .HIALF'EOU 
 % 
 
 .QUERYtEOU 
 .QUERPlFOU 
 
 •XBASElEOU 
 % 
 
 .CNTRi 
 
 FRSTRI 
 
 «BASF« 
 TSI Z F, 
 
 nxsrt i 
 
 OLDMKI 
 
 MARKS I 
 KTENni 
 
 EQu 
 EQU 
 F<5U 
 E(5U 
 EQU 
 EQu 
 
 Fqu 
 
 EQU 
 
 **** 
 SOlj 
 S02J 
 SD3) 
 SD4; 
 SD5j 
 $06) 
 
 SO?) 
 SD8J 
 SD9j 
 
 SDlOJ 
 
 soin 
 
 SD12) 
 SD13| 
 
 SOU' 
 SD15) 
 
 $016; 
 
 SOU) 
 SDU| 
 
 sou; 
 so2o; 
 
 S021I 
 SD22J 
 SD23) 
 SD24I 
 SD25I 
 SD26I 
 *D27) 
 SD28) 
 SD29) 
 
 S032| 
 
 SD36) 
 $037) 
 »D«6j 
 
 S063| 
 
 «8192; 
 
 .72192) 
 
 .510, 
 
 64*256) 
 64x257| 
 
 64x258; 
 
 64*259) 
 
 LABEL DEC 
 PATTE 
 MASK 
 STE 
 STE 
 
 STE 
 TELLS 
 
 ENOE 
 PATTE 
 PATTE 
 MASK 
 A STE 
 MASK 
 MASK 
 
 FLAG 
 
 TNDIC 
 
 FLAG 
 
 FLAG 
 
 MASH 
 
 MASKS 
 
 WORD 
 
 LIMIT 
 
 NUMBE 
 
 LIMIT 
 
 1 STE 
 "9PR0 
 "REOO 
 TEMPO 
 MASKS 
 OUTPU 
 USES 
 
 FOR 
 USES 
 
 FOR 
 USES 
 QUERY 
 USES 
 
 EXTE 
 COUNT 
 FIRST 
 256*T 
 SET T 
 
 LARATIONS 
 RN OF EIG 
 
 ALL BUT E 
 P 1 UNTIL 
 P 1 UNTIL 
 P 1 UNTIL 
 
 IF PREVI 
 WITH AN 
 RN OF BYT 
 RN OF S B 
 OUT LEFT 
 P 8 UNTIL 
 ALL BUT I 
 ALL BuT S 
 BITS FOR 
 ATES NON- 
 BITS FOR 
 BITS FOR 
 COOING CO 
 
 ALL BUT 
 COUNT OF 
 
 OF CONTI 
 R OF LAST 
 
 OF EXTEN 
 P 1 INDEX 
 CEDU" PAT 
 0000" PAT 
 RARY sTor 
 
 OUT ALL 
 T STRING 
 S029-SD31 
 
 alphabeti 
 
 $D32-$034 
 
 alphabeti 
 
 SD36-S044 
 
 ♦ 1 
 $D46-$D6l 
 NSION BLO 
 ER FOR SE 
 STORE FLA 
 BASE 
 
 able size 
 
 **** 
 
 HT QUOTE C 
 ACH BYTES 
 
 3 INDEX P 
 
 7 INDFX P 
 
 63 INDEX 
 OUS INPUT 
 
 UNTERMINA 
 ES ■ ULAR 
 LANK CHARA 
 4 BITS OF 
 
 56 INDEX 
 DENTIFTER 
 TRING MARK 
 GENERATOR 
 LINK P.E.S 
 RECEIVER P 
 P.E.S STIL 
 NSTANT 
 4-BlT HASH 
 QUERY 
 NUATION WO 
 
 EXTENSION 
 SION BLOCK 
 
 PATTERN ( 
 TERN 
 TERN 
 
 AGE FOR PO 
 BUT LEFTMO 
 INDEX REGI 
 
 FOR TEST 
 C CHARACTE 
 
 FOR TFST 
 C CHARACTE 
 
 FOR QUERY 
 
 HARACTERS 
 
 LAST BIT 
 
 ATTERN 
 
 ATTERN 
 
 PATTERN 
 
 BLOCK 
 
 TED STRING 
 
 GEST NUMBR 
 
 CTERS 
 
 EACH BYTE 
 
 PATTERN 
 
 MARKS 
 
 S 
 
 P.E.S 
 
 ,E.S 
 
 L ACTIVE 
 
 KEY 
 
 RD AREA 
 BLK USED 
 AREA 
 
 LIM.O) 
 
 Inter 
 
 ST BYTE 
 STER 
 BYTES 
 RS 
 
 BYTES 
 RS 
 STRING 
 
 FOR LINKS TO 
 CKS 
 
 CTION JOINER CYCLES 
 G BIT 
 
 TO 228 wORqS/P.E, 
 
 MARKER STRING FROM PREVIOUS CYCLE 
 MARKER STRING FOR CURRENT CYCLE 
 FND MARKER FROM PREVIOUS CYCLE 
 
 
 
 
 
 
 
 
 1 
 1 
 
 2 
 2 
 3 
 3 
 
 
 
 
 1 
 1 
 
 2 
 
 2 
 
 
 1 
 1 
 2 
 2 
 3 
 3 
 
 
 1 
 1 
 2 
 2 
 3 
 3 
 E 
 
 Tl 
 |Aj 
 
 (0) 
 
 64x260; * TEMP STORAGE FOR PARTIAL SECTION 
 64x261; X TEMPORARY STORAGE FOR X REGISTER 
 64x262) I TEMPORARY STORAGE FOR S REGISTER 
 64x263; * HEADER OF TOKEN BEING FORMED 
 64x264; % ADDRESS OF LAST CONTIN. WORD USED 
 64X265) * FIRST SLICE OF RESERVED WORD TABLE 
 64x266) I SECOND RESERVED WORD TABLE SLICE 
 64x267; I I USES LOCATIONS 267-282 FOR 
 
 COMPLETED SECTIONS 
 64x266) X SECTION ♦ 1 
 64x283; % TBASE " 1 
 
 64X284) X USES LOCS 284-511 FOR MAIN TABLE 
 64x2B5| X TBASE ♦ 1 
 64x512; % SOURCE STRING 
 
 **#* SET-UP PROCESS **** 
 ■000 01 00000007000000010 | 8) 
 
 .CYC56; x 8 STEP * UNTIL 5* INDEX PATTERN 
 ■000401 00200401 0020040M 8) 
 
 .bTmSk) x mask all but each bytes last bit 
 
 ■0074 170360741 70360741 7 |8l 
 
 »HAMSK; I MASK OUT LEFT 4 BITS OF EACH BYTE 
 
 ■ 037477 176374771 7637477 1 8; 
 
 .OUOTS) I PATTERN OF EIGHT QUOTE CHARACTERS 
 
 ■000001 0000000300000000 i 6; 
 
 •LlM3j X STEP 1 UNTIL 3 INDFX PATTERN 
 
 •000 001 0000000700000000 I 8) 
 
 ,LIM7J X STEP 1 UNTIL 7 INDEX PATTERN 
 
 ■ 0000010000007700000000 I 6| 
 
 .LIM63) X STEP 1 UNTIL 63 INDEX PATTERN 
 
 .ENDST) 
 
 ■030060140300601403006018) 
 
 .BLANK; x PATTERN OE 8 BLANK CHARACTERS 
 
 «005ol2o24o5ol2o24o5ol2|8; 
 
 ,HlNBR) * PATTERN OF BYTES ■ ULARGEST NUMBR 
 
 ■01 00200401 00200401 0020 | 8) 
 
 .LOALF) X FIRST ALPHA GROUP TEST BYTES 
 
 ■ 0150320641503206415032 1 6; 
 •HIALE) 
 
 ■020040100200401002004016) 
 ,L0AIE*1)X SECOND ALPHA GROUP TEST BYTES 
 ■025o52l2425o52l2425o52|8; 
 
 .HIALF41) 
 
 ■030461 142304611 42 30461 t 8) 
 
 .L0ALF*2)X THIRD ALPHA GROUP TEST BYTES 
 
 ■035072164 35072164 3507218) 
 
 ,HlA|.F*2) 
 
 -E.OR.E) X ENSURE ALL P.E.S ARE ENABLED 
 
 •E.ORtE) 
 
 NXSRC) 
 OLDMK) 
 MARKS) 
 XTEND' 
 ■0000006777777777777777181 
 
 
 AGAIN t 
 
 STL(O) 
 LTTd) 
 STLU) 
 L T T ( 2 ) 
 
 5 T L ( 2 ) 
 
 L T T ( 3 ) 
 
 <> t lc3i 
 
 L T T(0) 
 
 LTTCl) 
 C| 0(2) J 
 S T l ( 2 3 
 T X i. T H ( 1 ) 
 
 LTTC3J 
 
 l_r>4 
 
 STfl 
 
 l*;.(0) 
 
 CI RAJ 
 
 S 'A 
 
 ST A 
 
 f Vi TM(O) 
 
 ST Lf O, 
 I »TfO) 
 - •l.Cl) 
 w r T C 2 ) 
 L'2) 
 L » T ( 'J ) 
 S T L f ) 
 l »T(1) 
 ST L (1) 
 
 L ^A 
 
 L^L(O) 
 
 NFB 
 
 L r 'i. ( 3) 
 
 NFB 
 
 SMAL 
 
 l"- 
 
 Lncf 0) 
 
 ir\p 
 RU 
 
 C I . C ( 1 ) I 
 
 C«.B 
 
 I. HA 
 SFTE 
 
 sftf! 
 SmAr 
 
 L.OX 
 
 .HASHJ « SET UP HASH CONSTANT 
 
 «36|8| 
 
 ,HMS<J * 4-BIT HASH KEY MASK 
 
 »32j 
 
 .l a stxj x riRST Extension block begins 
 
 IN THE 33RD MAIN TABLE WORD 
 - 6 3 » 
 .XLlMT; % I AsT EXTENSION BLOCK ENDS 
 
 IN 64TH WORD OF THE MAIN TABLE 
 *TSlZE*64J 
 
 .CLIMTJ I SET LlMlT OF TABLE 
 ■OOOD010000001700000000I8I * STEP 1 TO 15 
 
 .XBAsE(l)) * InItIA l IzE 16 ExTEnsION 
 
 »~2\ * BLOCK LINKS TO ZERO 
 
 *64j 
 
 *c3j % riRST CONTINUATION WORD Is 
 
 LASTCJ * LOCATED IN 65TH MAIN TABLE WORD 
 
 .LIM63) X SET ACAROINCR * ♦!# ACAROLIM ■ 63 
 
 TBASMI 
 TPASEC 
 
 »-2; 
 
 aOOOOO 
 .ALIMI 
 "17760 
 .HDMS* 
 aOOOOO 
 .OUTPT 
 ■10444 
 .PROCD 
 ■0?442 
 .REj 
 
 **** 
 SORCEj 
 .OUOTS 
 
 tcoi 
 „btmsk 
 
 SC3; 
 31 
 SAJ 
 IAJ 
 
 »N00J0 
 
 ss; 
 
 63J 
 
 *R> 
 
 63| 
 
 sci ; 
 
 NXSRC) 
 
 -E.OR.E* 
 -E.OR.E) 
 48J 
 
 0)JI INITIALIZE TABLE ENTRIES TO ZERO 
 
 10000000000000001 I 8 I 
 
 I SET ACAROINCR ■ *1» ACAROINDX ■ I 
 00000000000000000181 
 
 J I MASKS OUT ALL BUT LEFTMOST BYTE 
 lOOOlOOOOOOOOOOOOlS) % SET UP INDEX 
 i X FOR 8192 WORD OUTPUT STRING 
 7l222302305212064|8| 
 t * SET UP • , 9PR0CE0U , » PATTERN 
 5 00000 000000000018 J 
 
 I SET UP "REOOOOOO* PATTERN 
 OUOTF-PAIR ROUTINE **** 
 
 * LOAD SOURCE STRING 
 i X LOAD PATTERN OF 8 QUOTE SYMBOLS 
 
 X CHECK FOR QUOTE CHARACTERS 
 i 
 
 X OBTAIN MARK FOR EACH QUOTE SYMBOL 
 
 X MOVE MARK TO PROPER POSITION 
 
 X SAVE MARKS 
 
 x "0R W marks Into acaro 
 
 i I SKIP IF NO QUOTES FOUND 
 
 X 10AD QUOTE MARKERS FOR ROUTING 
 X ROUTE MARKERS LEFT ONE P.E. 
 
 SAJ 
 
 X SET RIGHTMOST BlT OF ACARl 
 
 X FNABLE ONLY RIGHTMOST PtE, 
 
 % BRING NExT SOURCE STRING WORD IN 
 
 AT RIGHT END OF STRING 
 
 X RE-ENABLE ALL P.E.S 
 
 X FXTRACT LEFTMOST TWO BYTEs 
 
 X SAVE RYTES IN REGlSTFR X 
 
 
 SMAR 
 
 81 
 
 LOB 
 
 SAJ 
 
 LnA 
 
 is; 
 
 SMAL 
 
 8J 
 
 OR 
 
 »BJ 
 
 ANON 
 
 SSI 
 
 LOR 
 
 SAJ 
 
 LnA 
 
 SSI 
 
 SHAL 
 
 161 
 
 flR 
 
 sx; 
 
 And 
 
 SRJ 
 
 Lnc(O) 
 
 SAJ 
 
 ZFRTCO) 
 
 #SBitDj 
 
 LOR 
 
 SAJ 
 
 RTL 
 
 U 
 
 LOB 
 
 SAJ 
 
 LnA 
 
 SRJ 
 
 CROTR(l) 
 
 U 
 
 LnEFl 
 
 sci; 
 
 LnA 
 
 OLDMKJ 
 
 STR 
 
 oldmki 
 
 SFTE 
 
 "EtOR.EJ 
 
 SFTEl 
 
 -E.OR.El 
 
 ?TAR 
 
 17) 
 
 .nx 
 
 SAJ 
 
 >HAR 
 
 56J 
 
 »MAL 
 
 56| 
 
 swap; 
 
 
 >HAR 
 
 9J 
 
 ]R 
 
 SB! 
 
 .nR 
 
 SAJ 
 
 )p 
 
 SSJ 
 
 .ns 
 
 SAJ 
 
 nA 
 
 SXJ 
 
 HAL 
 
 56J 
 
 ns 
 
 SAJ 
 
 nA 
 
 SRJ 
 
 HAR 
 
 a; 
 
 9 
 
 SBJ 
 
 nR 
 
 SAJ 
 
 R 
 
 SSJ 
 
 ns 
 
 SAJ 
 
 OA 
 
 SRJ 
 
 HAL 
 
 8J 
 
 9 
 
 sx; 
 
 TAR 
 
 15j 
 
 AND 
 
 ss; 
 
 ^S 
 
 SAJ 
 
 <IP 
 
 #AGAINJ 
 
 ">A 
 
 SSJ 
 
 * NOW QET LEFTMOST BYTE ALONE 
 
 I SAVE LEFTMOST BYTE In REGISTER B 
 
 S RELOAO ORIGINAL QUOTF MARKERS 
 
 I MOVE MARKERS LEFT ONF BYTE POS, 
 
 * FORM NOOUOTE-QUOTE MARKERS 
 
 I MOVE MARKERS LEFT TWO BYTE POS. 
 
 I OBTAIN NOOUOTE-quOTE-quOTE mArkErc 
 
 I -OR" MARKERS INTO ACARO 
 
 I SKIP IF NO MORE MARKpRS REMAIN 
 
 « MOVE NEW MARKERS RIGHT ONE P.E, 
 
 % fnable only leftmost P.E. 
 
 I OBTAIN MARKS FROM PRFVIOUS CYCLE 
 % SAVE MARKER STRING WORD SHIFTED 
 
 OUT RIGHT END FOR NFXT CYCLE 
 I RE-ENABLE ALL P.E.S 
 
 % RE-ASSEMBLE MARKERS SHIFTED 
 RIGHT 9 BITS 
 
 I RE-AssEMBLE MARKERS SHIFTED 
 RIGHT 17 BITS 
 
 * SET NULL MARKERS FOR SECOND SYMBOL 
 IN CHAINS OF THREE QUOTES 
 
 I RESET QUOTE MARKER FOR THIRD QUOTE 
 SYMBOL IN CHAINS OF THREE QUOTES 
 
 t BEGIN "MODULO TWO" PROCESS 
 
 
 CSHL(3) 
 AMD 
 LOB 
 LHL(O) 
 
 SMAR 
 
 TyLTM(O) 
 in 
 
 >FTC( 1 ) 
 L<U(2) 
 ZTRF(O) 
 CnMPC(l )J 
 
 I ni_( 2 ) 
 
 ' -^MR(2) 
 C^XORC?) 
 TXLTM(0> 
 Ct.C(l)) 
 CSBt 1 ) 
 CANDCl ) 
 STL 
 0SHR(2) 
 
 rri 
 
 .: n m p a i 
 
 LR A 
 A« 3 
 SMAR 
 
 OB 
 
 L^S 
 
 
 ALOOP» 
 
 LHA 
 
 LHR 
 
 LHL(O) 
 
 ; ■"-■ 
 SmAL 
 
 op 
 
 5TA 
 LOL(O) 
 
 LHA 
 
 Lnul) 
 
 G* 
 
 LOS 
 
 L'^A 
 
 LOLC 1 ) 
 
 LR 
 
 AND 
 
 Oft 
 
 STA 
 
 * ,BTMSK IS ALREADY IN ACAR3 
 
 X extract only quote markers 
 
 X SET ACAROINCR - *1# ACAROLIM m 7 
 X REPEAT THE PROCESS EIGHT TIMES 
 
 3* 
 
 *C3j 
 
 SAJ 
 
 .LIM7J 
 8j 
 SBI 
 *-3j 
 
 60J 
 I» 
 
 .ENosTj 
 
 »U 
 
 sell 
 
 .LIM63J X SET ACAROINCR ■ *1» ACAROLlM ■ 63 
 U 
 
 scij 
 ,-3; 
 
 63; 
 SC2J 
 
 .ENDSTJ 
 
 U 
 
 SC2J 
 
 -E.OR.E-I 
 -E.OR.EJ 
 SAJ 
 
 x re-enable all p.e.s 
 
 ssj 
 
 SC3| 
 
 X GET ORIGINAL QUOTE MARKERS AGAIN 
 X ALIGN QUOTE MARKERS WITH NULL 
 
 MARKER POSITION IN MARKER STRING 
 X SET REMAINING QUOTE MARKS TO NULLS 
 X SAVE MARKER STRING 
 **** MARKER STRING GENERATOR **** 
 SORCEJ X BEGIN NUMBER MARKER 
 
 LOAD SOURCE STRING 
 X SAVE A COPY OF THE SOURCE STRING 
 X I OAO NUMBER TEST BYTES 
 X MARK ALL DIGIT CHARACTERS 
 X SHIFT MARKS TO NUMBER MARKER POSN 
 X ADD MARKERS TO MARKER STRING 
 % S AvE MARKER STRING 
 
 END NUMBER MARKING 
 X BEGIN MARKING ALPHABETIC SYMBOLS 
 
 SET AcArOIncR ■ ♦*' AcArOlIm ■ 3 
 SRI 
 
 ic°* 1 C0> x test for alphabetic characters 
 
 $AJ X SAVE TEST RESULTS IN REGISTER S 
 SR| X RELOAD SOURCE STRING IN REGISTER A 
 
 scil^^^ COMPLETE TEST FOR ALPHA SYMBOLS 
 SSI X COMBINE ALPHA TEST RESULTS 
 MARKS! X ADD ALPHA MARKERS TO MARKER STRING 
 MARKS' * SAVE MARKER STRING 
 
 SR> 
 SAJ 
 
 SAJ 
 
 .HINBRJ 
 
 SCOJ 
 
 6J 
 
 SSJ 
 
 MARKSJ 
 
 .LIM3J 
 
* ALOOPJ X END ALPHA MARKING AFTER TESTING 
 
 FOR THREE ALPHA GROUPS 
 SA' 
 SRI 
 
 .BLANK) 
 
 SCI; 
 
 .BTMSKJ 
 
 sen 
 
 5) 
 •SI 
 
 SA) 
 
 S BEGIN BLANK SYMBOL H A RKING 
 
 RELOAD SOURCE STRING 
 X LOAD BLANK CHARACTER TEST BYTES 
 
 x test for blank characters 
 s obtain marker for each blank 
 
 S MOVE MARKS TO BLANK MARKER POSTN 
 X ADD NEW MARKERS TO MARKER STRING 
 X BEGIN DELIMITER MARKING 
 
 SKJ 
 
 SCO) 
 SCI) 
 
 1) 
 SS) 
 
 ** 
 
 SA) 
 
 ,BTM 
 
 5) 
 
 SC3) 
 
 SA) 
 
 SS) 
 
 2) 
 
 SC3j 
 
 2) 
 
 SR) 
 SA) 
 SS) 
 2) 
 
 SC3| 
 
 4) 
 
 SR) 
 
 SA) 
 56) 
 SA) 
 1) 
 
 SR) 
 
 1) 
 
 SCO) 
 XTEND) 
 
 XTEND) 
 •E.OR»E| 
 
 •E.OR.E) 
 
 8) 
 SB) 
 
 X OBTAIN MARK FOR ALL CHARACTERS NOT 
 
 PREVIOUSLY MARKED 
 X SHIFT DELIMITER MARKER INTO PLACE 
 X ADD DELIMITER MARKER TO MARKER STRING 
 ** CONTROL STRING GENERATOR **** 
 
 X FXTRACT BLANK CHARACTER MARKERS 
 
 X MOVE MASK TO STRING MARKER 
 
 POSITION IN BYTE 
 X EXTRACT STRING MARKERS 
 
 X ROTATE TO BIT POSITION $2 OF BYTE 
 S MARK NON-STRING BLANk CHARACTERS 
 
 S MOVE MASK TO DELIMITER MARKER POSTN 
 X EXTRACT DELIMITER MARKERS 
 X SHIFT TO BIT POSITION #2 OF BYTE 
 X MARKERS NOw INDICATE EITHER 
 
 DELIMITERS OR NON'STRING BLANKS 
 
 x set bit #0 of acaro 
 
 s enable only leftmost p,e, 
 
 x add marker from previous pass to 
 
 left of string 
 x save end marker for next pass 
 % re-enable all p.e.s 
 
 X RE-ASSEMBLE STRING SHlFTEO RIGHT 
 EIGHT BITS 
 
 
         NAND      $R;           % CREATE PARTIAL SECTION STORAGE INC.
         SHAL      1;            % MOVE INDICATOR TO BIT POSITION #1
         LDB       $A;
         SHAL      1;
         OR        $B;           % DUPLICATE INDICATOR BIT IN BIT
                                 % POSITION #0 OF BYTE
         LDR       $A;
         LDA       $S;
         LDL(0)    .HAMSK;
         AND       $C0;          % MASK OUT LEFT 4 BITS OF EACH BYTE
         OR        $R;           % ADD PARTIAL SECTION STORAGE
                                 % INDICATORS TO CREATE CONTROL STRNG
         STA       MARKS;        % CONTROL STRING IS NOW READY FOR USE
 % **** SECTION BUILDER ****
         CLC(1);                 % BEGIN SECTION BUILDER SET-UP
         COMPC(1);               % ACAR1 PATTERN TO RAPIDLY ENABLE
                                 % ALL P.E.S
         LDL(2)    .IDENT;       % SET UP FLAG BIT FOR IDENTIFIERS
         LDL(3)    .STRNG;       % SET UP FLAG BIT FOR STRING SECTIONS
         LDR       SORCE;        % LOAD SOURCE STRING IN REGISTER R
         CLRA;
         LDS       $A;
         LDX       $A;
         STA       SECTN;
         STA       SECTP;        % END SECTION BUILDER SET-UP
         LDB       MARKS;        % LOAD FIRST CONTROL BYTE
         LDD       $B;           % SET MODE REGISTERS
         SETE      -E.OR.-E;     % COMPLEMENT SECTION STORAGE
         SETE1     -E1.OR.-E1;   % INDICATORS
         XI        .FRSTR;       % SET FIRSTSTORE FLAG
         LDEE1     $C1;          % RE-ENABLE ALL P.E.S
         SETE      -H.AND.E;     % CONSIDER ONLY NON-BLANK, NON-NULL
         SETE1     -H.AND.E1;    % CHARACTERS
         LDR       $A;
         SHAR      56;           % EXTRACT FIRST CHARACTER
         SETE      -I.AND.E;     % CONSIDER NON-BLANKS, NON-NULLS, AND
         SETE1     -I.AND.E1;    % NON-DELIMITERS ONLY
         RTAL      8;
         LDS       $A;
         CLRA;
         ADMA      =8;           % SET PARTIAL SECTION CHARACTER
                                 % COUNT TO 1
         SETE      J.AND.E;      % CONSIDER ONLY ALPHABETIC CHARACTERS
         SETE1     J.AND.E1;
         OR        $C2;          % SET IDENTIFIER MARKER
         LDEE1     $C1;          % RE-ENABLE ALL P.E.S
         SETE      G.AND.E;      % CONSIDER ONLY STRING CHARACTERS
         SETE1     G.AND.E1;
         OR        $C3;          % SET STRING MARKER
         LDEE1     $C1;          % RE-ENABLE ALL P.E.S
         SETE      -H.AND.E;     % CONSIDER ONLY NON-BLANKS AND
         SETE1     -H.AND.E1;    % NON-NULL CHARACTERS
         LDB       $A;
         LDA       $S;           % RETURN PARTIAL SECTION HEADER TO
         LDS       $B;           % REGISTER S
 
 
 SETE 
 
 SFTEl 
 
 STA 
 
 CLRAl 
 
 *T 
 
 ldeei 
 
 mL(O) 
 
 I LOB 
 LDA 
 SMAL 
 SwAp| 
 
 LOO 
 
 RTAR 
 
 OR 
 
 RTAR 
 
 STA 
 
 CLRAI 
 
 LOS 
 
 XT 
 
 LDEEI 
 
 SETE 
 
 STTEl 
 
 LOB 
 
 IDA 
 
 SHAL 
 
 SHAR 
 
 SETE 
 
 SETE1 
 
 OR 
 
 RTAL 
 
 LOB 
 
 LnA 
 
 LOS 
 
 ADMA 
 
 SETE 
 
 SETE1 
 
 OP 
 
 LOEEl 
 
 SETE 
 
 SFTEl 
 
 OR 
 
 LOEEl 
 
 SETE 
 
 SETE* 
 
 SFTE 
 
 LOB 
 
 LnA 
 
 LOB 
 
 SETE 
 
 SETEl 
 
 STA 
 
 CLRAI 
 
 LOS 
 
 XT 
 
 I, AND. El s CONSIDER ONLY DELIMITER SYMBOLS 
 
 I. AND. Ell 
 
 ♦S^CTNJ I STORE DELIMITER In OUTPUT STRING 
 
 • 1J 
 
 • CM 
 
 .CYC56; 
 SAJ 
 
 MARKSl 
 ■0(0)1 
 
 SB' 
 #51 
 SSJ 
 
 111 
 
 *SECTNJ 
 
 SAl 
 ■ 1) 
 
 *CU 
 
 •H.AND, 
 •H.AND, 
 
 sa; 
 sri 
 
 ■0(0)1 
 56; 
 
 -I, AND, 
 -I .AND. 
 
 sb; 
 
 81 
 
 SAJ 
 
 SSJ 
 
 SBI 
 SJ 
 
 J.ANO.E 
 J.AND.E 
 
 SC2; 
 
 SCl| 
 
 G.AND.E 
 G.ANDtE 
 
 sc3; 
 
 ten 
 
 •H.AND. 
 
 -I tAND. 
 
 -LAND, 
 
 SAl 
 
 SSI 
 
 SSJ 
 
 I. AND.- 
 
 LAND*" 
 
 ♦SEcTNl 
 
 sa; 
 ■ H 
 
 s re-Enable all p.e.s 
 
 COMPLETE FIRST CHARACTER 
 S LOAD 8 STER 8 UNTIL 56 INDEX 
 I MOVE TO N"TH CHARACTER 
 
 * LOAD CONTROL BYTES 
 
 * SELECT BYTE #N 
 
 I MOVE CONTROL BYTE TO REqIsTEr b 
 % LOAD CONTROL BYTE IN MOOE REGISTER 
 
 I ADD HEADER TO P*RTl*L SECTION 
 
 I ALIGN PARTIAL SECTION FOR STORAGE 
 
 I STORE PARTIAL SECTION 
 
 I CREATE NEW EMPTY PARTIAL SECTION 
 
 » CLEAR CHARACTER COUNT 
 
 I ENABLE ALL P.E.S 
 EM CONSIDER ONLY NON-BLANK AND 
 FlJS NON-NULL CHARACTERS 
 
 I EXTRACT NEXT CHARACTER 
 EJS CONSIDER NON-BLANK#NON-NULL AND 
 EUS NON-OELIMITER CHARACTERS ONLY 
 
 S ADD NEW CHARACTER TO PARTIAL SECTN. 
 
 S 
 
 I I 
 
 H 
 I 
 I 
 
 I I 
 
 u 
 % 
 I 
 
 EJS 
 EJ* 
 El 
 
 INCREMENT CHARACTER COUNT 
 CONSIDER ONLY ALPHABETIC SYMBOLS 
 
 ADD MARKER FOR IDENTIFIER 
 
 RE-ENABLE ALL P«E*S 
 
 CONSIDER ONLY STRING CHARACTERS 
 
 ADD STRING MARKER 
 RE-ENABLE ALL P.E.S 
 
 consioer non-blank, non-null» and 
 
 NON-DELIMITER CHARACTERS ONLY 
 
 I RETURN PARTIAL SECTION HEADER 
 « TO REGISTER S 
 
 e;s consioer only delimiter CHARACTERS 
 
 Ell 
 
 * STORE DELIMITER IN OUTPUT STRING 
 S CREATE NEW EMPTY PARTIAL SECTION 
 I CLEAR CHARACTER COUNT 
 
 
         LDEE1     $C1;          % RE-ENABLE ALL P.E.S
         TXLTM(0)  #REPET;       % CONTINUE UNTIL 8 CHARACTERS HAVE
                                 % BEEN PROCESSED
 % **** SECTION JOINER - CHARACTER COUNT ROUTINE ****
         IXE       =8192;        % SET I BIT IF X = 2**13 (FIRSTSTORE
         SETC(0)   I;            % MARKER = 1, X = 0 OTHERWISE)
         IXE       =0;
         SETC(1)   I;
         COR(0)    $C1;          % ACAR0 = NOBREAK
         JXG       =4096;        % SET J BIT IF X > 2**12 (FIRSTSTORE
                                 % MARKER = 1)
         SETC(1)   J;            % ACAR1 = FIRSTSTORE
         ISE       =0;           % SET I BIT IF THE P.E. CONTAINS
         SETC(2)   I;            % NO UNSTORED PARTIAL SECTIONS
         COMPC(2);               % ACAR2 = REMAIN
         LDL(3)    $C2;
         CSHR(3)   1;
         CAND(0)   $C3;
         CAND(3)   $C1;
         CSHL(1)   1;
         CAND(2)   $C1;
         CAND(1)   $C0;          % ACAR1 = LINK
         COMPC(0);
         CAND(0)   $C2;          % ACAR0 = RECEIVER
         LDL(2)    $C1;
         COMPC(2);
         CAND(2)   $C3;          % ACAR2 = GENERATOR
         STL(2)    .GENR;        % SAVE GENERATOR PATTERN
         LDL(3)    .LIM7;        % SET ACAR3INCR = +1, ACAR3LIM = 7
         RTAR      #8;           % LEFT JUSTIFY PARTIAL SECTION TAIL
         STA       SAVE;         % SAVE "ENDING" PARTIAL SECTION
         STS       SSAVE;        % SAVE "ENDING" HEADER
         LDA       SECTN;        % LOAD FIRST PARTIAL SECTION STORED
         LDEE1     $C1;          % ACTIVATE ONLY LINK P.E.S
         LDA       $S;
         SHAL      53;           % SHIFTED COUNT IS DIVIDED BY 8
         SETE      E.OR.-E;      % CORRECT STARTING SECTION HEADER IS
         SETE1     E.OR.-E;      % NOW IN LEFTMOST BYTE OF REGISTER A
         LDB       $A;
         SHAL      2;
         SHAR      58;
         LDR       $A;           % STARTING SECTION CHARACTER COUNT IS
                                 % NOW IN REGISTER R
         LDA       $B;
         SHAR      62;
         SHAL      6;
         LDB       $S;           % ENDING SECTION CHARACTER COUNT AND
                                 % MARKERS ARE NOW IN REGISTER B
         LDS       $A;           % STARTING SECTION MARKERS NOW IN RGS
         LDA       $B;
         SHAR      3;
         RTL       63;           % ROUTE COUNT BYTES LEFT ONE P.E.
         ADMA      $R;           % COMBINE COUNTS
 MORE:   RTAR      8;
         LDB       $A;
 
 
 SMAR 
 
 OR 
 
 RTL 
 
 AHMA 
 
 TXITM(3) 
 
 RTAR 
 LOLO) 
 
 LnR 
 Lns 
 
 SMAR 
 
 smal 
 
 RTAR 
 
 Lns 
 
 ShAr 
 
 OR 
 
 RTL 
 
 OP 
 
 TYLTM(3> 
 
 RTAR 
 
 OR 
 
 STA 
 LHS 
 * SECTION 
 LOL(3) 
 STLC3) 
 STX 
 CLRA) 
 LDR 
 LOB 
 
 LDA 
 SHAL 
 
 561 
 SB) 
 631 
 SRI 
 
 'MORE) 
 
 6) 
 
 .LIM7J 
 
 SS) 
 
 SA) 
 
 62) 
 
 6) 
 
 a) 
 
 SA) 
 56) 
 
 SB) 
 
 63) 
 
 SR) 
 
 ,MMORE) 
 
 6) 
 
 SS) 
 
 HEAOR) 
 SA| 
 
 JOINER - 
 ,LlM7) 
 .CNTR) 
 SAV E Xj 
 
 SA) 
 
 SAVE) 
 
 SECTN) 
 
 6) 
 
 LOEE1 
 
 LOA 
 
 LHEE1 
 
 STR 
 
 SETE 
 SFTE1 
 LOR 
 LHA 
 
 SHAL 
 
 Lns 
 
 J« 
 
 SFTCC3) 
 
 CnMPC(l)) 
 
 STL(l) 
 
 CANOCl) 
 
 LnA 
 
 LOEE1 
 
 SHAR 
 
 STA 
 
 SCI) 
 
 SB) 
 
 SC2) 
 SECTN) 
 
 E.OR.-E) 
 E.OR.-E) 
 SA) 
 SSAVE) 
 
 3) 
 
 SA) 
 
 57) 
 
 J) 
 
 •NLINK) 
 
 SC3) 
 
 SB| 
 
 SCl) 
 
 6) 
 
 *SECTN) 
 
 I COMBINE MARKERS 
 
 I ROUTE COUNT BYTES LEFT ONE P.E, 
 
 f ADD IN NE X T COUNT BYTE 
 
 X REPEAT SEVEN TIMES 
 
 I 8 ELEMENT COUNTER STRING COMPLETE 
 
 X RESET ACAR3INCR • ♦ !• ACAR3LIM • 7 
 
 X SAVE COUNTER STRING IN REGISTER S 
 
 x Ending section marks In register a 
 
 X ROUTE MARKERS LETT ONE P.E. 
 
 S COMBINE MARKERS 
 
 X REPEAT SEVEN TIMES 
 
 S 8 ELEMENT MARKER STRING COMPLETE 
 
 S FORM COMPOSITE MARKER / COUNTER 
 
 STRING 
 X SAVE COMPOSITE STRING 
 X CHARACTER COUNT ROUTINE COMPLETE 
 RECEIVER/LINK/GENERATOR ROUTINE **** 
 X SET ACAR3INCR ■ *\, ACAR3LIM • 7 
 X SET UP SECTION JOINER CYCLE COUNTER 
 X SAVE LOCATION OF FIRST NEW ENTRY 
 
 X RELOAD ENDING PARTIAL SECTION 
 X RELOAD STARTING SECTION (IF ANY) 
 X REMOVE OLD HEADER FROM STARTING 
 
 SECTION 
 X ACAR1>LINK» THE ONLY CASE WHERE AN 
 
 ENDING PART. SECTN, IS ROUTED LEFT 
 
 X ENABLE ONLY GENERATOR P.E.S 
 X CLEAR STARTING SECTIONS IN 
 
 GENERATOR P.E.S 
 X RE-ENABLE ALL P.E.S 
 
 X SAVE SECTIONS TO BE ROUTED LEFT 
 X RETRIEVE HEADER OF ENDING PARTIAL 
 
 SECTION 
 X MULTIPLY CHARACTER COUNTS BY EIGHT 
 
 X LOCATE 8-CHARACTER PARTIAL SECTIONS 
 
 X FORM -LINK 
 
 X STORE ONLY NON-LlNK# 8-CHARACTER 
 X SECTIONS 
 
 
         LDA       $B;
         XI        =1;
         SHAL      56;
         CRB(0)    63;           % PREVENT END-AROUND ROUTING
         STL(0)    .RCVR;        % SAVE PATTERN OF RECEIVER P.E.S
         COMPC(2);               % FORM -GENERATOR
 % **** SECTION JOINER - FIRST CONCATENATION CYCLE ****
         LDEE1     $C0;          % CONSIDER ONLY ACTIVE RECEIVERS
         LDB       $A;
         RTL       63;
         LDA       $R;           % LOAD NEXT ELEMENT FROM THE RIGHT
         SHAR      #0;
         OR        $B;           % ADD NEXT ELEMENT TO PARTIAL TOKEN
         SHAR      8;            % MAKE SPACE FOR THE HEADER
         LDB       $A;
         LDA       HEADR;
         JB        60;
         LDA       $B;
         SETC(1)   J;            % CHECK TO SEE IF CURRENT PARTIAL
         CEXOR(3)  $C1;          % TOKEN IS LARGE ENOUGH TO STORE
         CAND(3)   $C0;
         LDEE1     $C3;
         LDL(3)    $C1;
         STA       *SECTN;       % ADD PARTIAL TOKEN TO OUTPUT
         XI        =1;
         CLRA;
         LDB       $A;
         LDA       $R;
         SHABR     #8;           % PREPARE LEFT PART OF NEXT PARTIAL
                                 % TOKEN IN REGISTER B
         CSHL(2)   1;
         CAND(0)   $C2;          % UPDATE ACTIVE RECEIVER INDICATORS
         LDA       $B;
         LDEE1     $C0;
         LDB       $A;
         LDA       HEADR;        % RELOAD CHAR. COUNT / MARKER STRING
         SHAL      3;            % MULTIPLY COUNT BY EIGHT
         LDS       $A;
         ZERT(0)   #FINAL;       % END FIRST CONCATENATION CYCLE
 % **** SECTION JOINER - EXTRA CONCATENATION CYCLES ****
 CYCLE:  LDEE1     $C0;          % CONSIDER ONLY ACTIVE RECEIVER P.E.S
         RTL       63;
         LDA       $R;           % LOAD NEXT ELEMENT FROM THE RIGHT
         SHAR      #8;
         OR        $B;           % ADD NEW ELEMENT TO PARTIAL TOKEN
         LDB       $A;
         LDA       $S;
         JB        49;           % SEE IF NEXT CHARACTER COUNT BYTE
                                 % ITS "8"-BIT = 1
         LDA       $B;
         SETC(1)   J;            % CHECK TO SEE IF CURRENT PARTIAL
         CEXOR(3)  $C1;          % TOKEN IS LARGE ENOUGH TO STORE
         CAND(3)   $C0;
         LDEE1     $C3;
         LDL(3)    $C1;
 
          STA      *SECTN;      % ADD PARTIAL TOKEN TO OUTPUT
          XT       =1;
          CLRA;
          LDB      $A;
          LDA      $R;
          SHABR    #8;          % PREPARE LEFT PART OF NEXT PARTIAL
                                %   TOKEN IN REGISTER B
          LDEE1    $C0;
          CSHL(2)  1;
          CAND(0)  $C2;         % UPDATE ACTIVE RECEIVER INDICATORS
          LDA      $S;
          SHAR     8;           % SHIFT TO NEXT CHARACTER COUNT
          SHAL     3;           % MULTIPLY COUNT BY EIGHT
          LDS      $A;
          LDA      $B;
          EXCHL(0) .CNTR;       % LOAD CYCLE COUNTER
          TXGFM(0) 1;           % SKIP IF MORE CYCLES ARE POSSIBLE
          HALT;                 % ERROR HALT - TOO MANY CYCLES
          EXCHL(0) .CNTR;       % SAVE CYCLE COUNTER
          ZERF(0)  #CYCLE;      % CONTINUE ONLY IF ACTIVE RECEIVERS
                                %   STILL REMAIN
 % **** SECTION JOINER - FINAL PORTION ****
          LDL(0)   .RCVR;       % RELOAD ORIGINAL RECEIVER PATTERN
          LDEE1    $C0;         % USE PATTERN TO ENABLE P.E.S
          SHAR     8;           % MAKE SPACE FOR HEADER
          LDL(1)   .NLINK;      % RELOAD -LINK
          LDE      $C1;
          SETE1    -I.AND.E;    % CONSIDER ONLY (REMAIN), (-LINK)
          SETE     -I.AND.E;
          STA      *SECTN;      % ADD LAST ELEMENTS TO OUTPUT
          XT       =1;
          SETE     E.OR.-E;     % RE-ENABLE ALL P.E.S
          SETE1    E.OR.-E;
          CLRA;
          STA      *SECTN;      % STORE ZEROS TO MARK END OF OUTPUT
          JXE      =0;
          SETC(0)  J;           % SAVE BITS THAT INDICATE P.E.S WITH
          COMPC(0);             %   AT LEAST ONE OUTPUT SECTION
          STL(3)   .ACTIV;
          LDX      SAVEX;       % RELOAD POINTER TO FIRST LOCATION
                                %   USED BY SECTION JOINER FOR OUTPUT
          LDB      *SECTN;      % RELOAD FIRST SECTION JOINER OUTPUT
          LDA      $S;          % OBTAIN HEADER
          SHAL     53;
          OR       $B;          % ATTACH HEADER TO FINAL TOKENS
          LDE      $C1;
          SETE1    -I.AND.E;
          SETE     -I.AND.E;
          STA      *SECTN;      % RETURN FIRST ENTRY WORD TO OUTPUT
          SETE     E.OR.-E;     % RE-ENABLE ALL P.E.S
          SETE1    E.OR.-E;     % END OF SECTION JOINER PROCESS
 % **** RESULT STRING GENERATOR ****
          LDL(1)   .ACTIV;      % FIND P.E.S WITH TOKENS STORED
          LDL(0)   .GENR;
          LDL(2)   $C1;
 
 
          ZERF(2)  1;
          JUMP     CEASE;       % JUMP IF ALL TOKENS PROCESSED
          LEADO(2);             % FIND NEXT P.E. WITH A STORED TOKEN
          CRB(1)   $C2;         % CLEAR MARKER FOR P.E. FOUND
          CTSBF(0) $C2,1;       % SKIP IF P.E. HAD NO GENERATOR SECT.
          SLIT(2)  =256;        % INDEX PAST FIRST ENTRY OF GEN. P.E.
          ALIT(2)  =QBASE;      % ADD 256×QBASE TO INDEX
 NEXWD:   LOAD(2)  .QUERY;      % LOAD FIRST WORD OF TOKEN
          LDL(3)   .QUERY;
          ZERT(3)  #NEXPE;      % SKIP IF NO TOKENS LEFT IN THIS P.E.
          ALIT(2)  =256;
          LDL(0)   $C3;
          CSHLO)   21
          CSHR(3)  58;          % EXTRACT CHARACTER COUNT OF TOKEN
          ZERT(3)  #EMIT;       % SKIP IF TOKEN IS A DELIMITER
          ALIT(3)  Ml
          CSHLO)   21J          % FORM WORD COUNT FOR TOKEN
          SLIT(3)  mOt
          STL(3)   .WRDCT;      % SAVE WORD COUNT
          ZERT(3)  #ONEWD;      % SKIP IF TOKEN IS A SINGLE WORD
          COR(3)   .ALIM;       % SET ACAR3INCR = +1, ACAR3INDX = 1
 ANTHR:   LOAD(2)  .QUERY(3);   % GET ADDITIONAL TOKEN WORDS
          ALIT(2)  =256;
          TXLTM(3) #ANTHR;
 TABLE:   LDL(0)   .QUERY;      % CHECK FOR "PROCEDURE" RESERVED WORD
          CEXOR(0) .PROCD;      % SKIP IF TOKEN IS NOT A 9-CHARACTER
          ZERF(0)  #ENTER;      %   IDENT. STARTING WITH "PROCEDU"
          LDL(0)   .QUERP;      % LOAD SECOND WORD OF QUERY TOKEN
          CEXOR(0) .RE;         % CHECK SECOND WORD OF QUERY TOKEN
          ZERF(0)  #ENTER;      % SKIP IF TOKEN IS NOT "9PROCEDURE..."
          LIT(0)   =128;        % FORM DUMMY DELIMITER FOR PROCEDURE
          SKIP     #EMIT;       % OUTPUT DUMMY DELIMITER
 ONEWD:   CTSBT(0) 1,ENTER;     % SKIP IF TOKEN IS A STRING TOKEN
          CTSBF(0) CENTERJ      % SKIP IF TOKEN IS A NUMBER
          LDA      $C0;
          JLE      RESRV;       % COMPARE WITH RESERVED WORD TABLE
          ILE      RESRP;       % CHECK SECOND RESERVED WORD SLICE
          SETC(1)  J;
          ZERT(1)  #SECND;      % SKIP IF NO MATCH IN FIRST SLICE
          LEADO(1);             % FIND NUMBER OF MATCHING RES. WORD
          ALIT(1)  =64;         % FORM DUMMY DELIMITER TO
          SKIP     #EMIT;       %   REPLACE RESERVED WORD
 SECND:   SETC(1)  I;
          ZERT(1)  #ENTER;      % SKIP IF TOKEN NOT A RESERVED WORD
          LEADO(1);             % FIND NUMBER OF MATCHING RES. WORD
          ALIT(1)  =128;        % FORM DUMMY DELIMITER TO
          SKIP     #EMIT;       %   REPLACE RESERVED WORD
 ENTER:   STL(1)   .ACTIV;      % SAVE ACARS
          STL(2)   .POYNT;
          ZERT(1)  1;           % SKIP IF THIS LAST P.E. WITH A TOKEN
          JUMP     SERCH;       % JUMP TO TABLE MAINTENANCE ROUTINE
          LOAD(2)  $C3;
          ZERT(3)  1;           % SKIP IF NO MORE TOKENS IN THIS P.E.
          JUMP     SERCH;       % JUMP TO TABLE MAINTENANCE ROUTINE
          JUMP     CEASE;       % JUMP IF ALL TOKENS STORED BUT ONE
 
 
 LOUD 
 CANDO ) 
 
 cnR(O) 
 
 LHL(l) 
 LDL(2) 
 L0L(3) 
 |ST0RE(3) 
 | Al IT<3) 
 iSTLO) 
 SKIP 
 **** T 
 
 ani(O) 
 
 !Lf>L(i) 
 line 
 
 Ilha 
 
 SMAR 
 
 MLMA 
 
 ILOC(I) 
 
 i 
 
 ICAND(I) 
 
 Ma 
 
 |L0L<2) 
 
 lj|,E 
 
 ISETCCO) 
 
 zfrf(o) 
 
 liLnL(O) 
 SLIT(2) 
 ZPRT(O) 
 LOLC 1 ) 
 'LOL(O) 
 
 Ma 
 
 ISKIP 
 
 'JTVLFMC2 > 
 TXLrM(2) 
 
 Lnx 
 
 »!LnL(3) 
 
 in A 
 
 JLE 
 
 SPTCO) 
 CAND(O) 
 ZFRT(O) 
 TXLFM(2) 
 |5KIP 
 iLnLO) 
 
 lDA 
 JLE 
 
 5FTC<3) 
 ZFRTCO) 
 
 •Lfadoco) 
 
 :SHL(1) 
 
 :add(0) 
 
 |5KIP 
 
 MALTI 
 **** T 
 
 .QUERY! 
 .HDMSKI 
 
 sen 
 
 .ACTIVJ 
 
 •POYNTI 
 
 .OUTPT! 
 
 tcO| 
 
 »il 
 
 .outpti 
 
 #nExwdi 
 
 able maint 
 
 .QUERY) 
 •HASH! 
 SCI! 
 SCO) 
 
 16; 
 
 SB! 
 SA; 
 
 .HHSK! 
 SCO! 
 .WRDCTI 
 TBASE(I) 
 
 J! 
 
 #HTCHll 
 
 .XBASE(1 
 
 ■ 01 
 
 #A0ST6l 
 
 SCO) 
 
 .QUERY! 
 
 SCO; 
 
 #TST! 
 
 #FOWNOl 
 
 #SHORT) 
 
 TBASP(l) 
 ,QUERY(2 
 SC3) 
 *TBASH(2 
 
 SC3! 
 
 #N0YET! 
 
 #F0WND! 
 
 #LONGRI 
 
 .QUERPl 
 
 SC3) 
 
 TBASPU) 
 
 Jl 
 
 #N0YETI 
 
 ; 
 
 % RELOAD FIRST WORD OF TOKEN
 % EXTRACT TOKEN HEADER
 % ATTACH HEADER TO RESULT
 % RELOAD ACARS

 % GET RESULT STRING OUTPUT POINTER
 % PUT RESULT IN RESULT STRING
 % INCREMENT OUTPUT POINTER
 % SAVE OUTPUT POINTER
 % CONTINUE TO NEXT TOKEN
 ENANCE - SEARCH PROCEDURE ****
 % LOAD FIRST WORD OF QUERY TOKEN

 % LOAD HASH CODING CONSTANT IN RGB
 % LOAD FIRST WORD OF QUERY IN RGA
 % SHIFT QUERY WORD TO THE RIGHT OUT
 %   OF THE EXPONENT FIELD
 % PERFORM HASH CODING MULTIPLICATION
 % RETURN MOST SIGNIFICANT HALF OF
 %   HASH CODING RESULT TO ACAR1
 % CLEAR ALL BITS NOT IN 4-BIT KEY

 % ACAR2LIM = (NUMBER OF QUERY WORDS-1)
 % TEST BLOCK FOR MATCHING FIRST WORDS

 % SKIP IF SOME FIRST WORD MATCHES
 ); % NO MATCH - LOOK FOR EXTENSION BLKS
 % SET WORD COUNTER TO FIRST WORD

 % SKIP IF NO MORE EXTENSION BLOCKS
 % USE EXT. BLK. POINTER VICE HASH KEY
 % RELOAD FIRST WORD OF QUERY TOKEN
 %   INTO REGISTER A
 % TEST FIRST WORDS IN NEW BLOCK
 % SKIP IF QUERY IS A SINGLE WORD
 % SKIP IF QUERY IS TWO WORDS LONG
 % LOAD POINTER TO CONTINUATION WORDS
 ); % GET NEXT QUERY WORD

 ); % TEST FOR MATCH
 
 6! 
 
 SCl! 
 
 fCOMPLl 
 
 % SEE IF ALL WORDS MATCH SO FAR
 % SKIP IF NO ENTRIES STILL MATCH
 % SKIP IF ALL WORDS OF QUERY USED
 % TEST NEXT WORD
 % GET SECOND QUERY WORD
 % TEST FOR MATCH
 % SEE IF BOTH QUERY WORDS ARE MATCHED
 % SKIP IF BOTH WORDS ARE NOT MATCHED
 % FIND NUMBER OF P.E. WITH MATCH
 % CONVERT P.E. ADDRESS TO QUAD ADDR.
 % FORM COMPLETE RELATIVE QUAD ADDRESS
 % ADDRESS OF MATCHING ENTRY IN ACAR0
 % ERROR HALT FOR TABLE OVERFLOW

 ABLE MAINTENANCE - NEW ENTRY PROCEDURE ****
 
 
 ADSTG:

 SPACE:

 STOW:

 MNYST:

 MORST:

 QUIT:

 CEASE:
 
 LOL(3) 
 CSHR(3) 
 LDA 
 J! Z; 
 
 sftc(O) 
 
 GRTRFC3) 
 
 LnB 
 
 LOX 
 
 XT 
 
 LHL(3) 
 
 J*L 
 
 SETC<3> 
 
 ZTRTC3) 
 
 CAND(O) 
 
 7FRF<0> 
 
 L*L(1) 
 
 AI.ITC1) 
 
 GRTRT(1 ) 
 
 LFADOCO)* 
 C I. C ( 3 ) ; 
 C<B(3) 
 
 LHEEl 
 LnL(O) 
 
 LHA 
 
 STA 
 
 SI IT(2) 
 
 TYLTM(2) 
 
 TXLFM(2) 
 
 LOL(O) 
 
 LHA 
 
 STA 
 
 SKIP 
 
 SLIT(2) 
 
 mx 
 
 XT 
 STX 
 
 LHA 
 STA 
 
 TXLTM(2) 
 
 Lnx 
 
 XT 
 
 STX 
 
 SFTE 
 
 SFTEl 
 
 JUMP 
 
 HALTJ 
 
 FND. 
 
 $C2;

 24; % TRANSFER ACAR3LIM TO ACAR3INDX
 TBASE(1);

 % FIND ALL ZERO WORDS (EMPTY ENTRIES)
 J;

 .ALIM,#SPACE; % SKIP IF ENTRY 1 OR 2 WORDS LONG
 LASTC; % GET ADDR. OF LAST CONTIN. WORD USED
 $S;

 $C3; % CALCULATE NEW ADDRESS OF LAST WORD
 .CLIMT; % LOAD UPPER LIMIT FOR CONTIN. WORDS
 •1(3); % INSURE UPPER LIMIT NOT EXCEEDED
 J;

 #NOSPC; % SKIP IF TABLE OVERFLOWS
 $C3;
 #STOW; % SKIP IF SPACE IS ALREADY AVAILABLE

 .LASTX;

 =2;

 .XLIMT,#NOSPC; % SKIP IF NO SPACE CAN BE FOUND

 .LASTX; % UPDATE LAST EXTENSION BLOCK ADDRESS

 % FIND NUMBER OF P.E. WITH MATCH
 
 $COj 
 SC3J 
 
 .QUERY; 
 
 SCO; 
 TBASC(1 
 
 ■ Oj 
 
 #MNYSTJ 
 #QUIT) 
 .QUERPJ 
 SCOI 
 TBASP(1 
 'QUIT* 
 
 ■ U 
 
 $s; 
 
 ■ U 
 
 TBASP(1 
 .OUERYC 
 
 SCO; 
 
 #TBAsE( 
 
 *morst; 
 
 SSJ 
 
 *C2; 
 
 LASTC* 
 E.OR.-E 
 E.ORt-E 
 COMPL' 
 
 % ENABLE SELECTED P.E. TO STORE ENTRY
 % LOAD FIRST WORD OF QUERY

 ); % STORE FIRST WORD
 % SET ACAR2INDX =

 % SKIP IF ENTRY HAS MORE THAN 2 WORDS
 % SKIP IF ENTRY IS A SINGLE WORD
 % LOAD SECOND WORD OF QUERY

 ); % STORE SECOND WORD OF ENTRY

 % BOTH QUERY WORDS ARE NOW STORED

 % RESET ACAR2INDX = 1

 % RGS CONTAINS LASTC, THE ADDRESS OF
 %   THE LAST CONTINUATION WORD USED

 % INCREMENT CONTINUATION WORD ADDRESS
 ); % STORE POINTER TO LATER ENTRY WORDS
 2); % GET NEXT QUERY WORD TO BE STORED

 2); % STORE QUERY IN CONTINUATION WORD

 % SKIP IF MORE WORDS TO STORE

 % GET OLD VALUE OF LASTC

 % UPDATE LASTC

 % SAVE LAST CONTINUATION WORD ADDRESS
 ; % RE-ENABLE ALL P.E.S

 % TABLE MAINTENANCE COMPLETE
 % END OF RECOGNIZER ****
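
The comments of the table maintenance search procedure describe how the 4-bit table key is formed: the first query word is shifted right out of the exponent field, multiplied by a hash coding constant, and the most significant half of the product is masked down to four bits. The Python fragment below is only a rough sketch of that sequence, not a transcription of the listing. The function name `hash_key`, the constant value, the 48-bit operand width, and the choice of the low four bits of the upper half as the key are all assumptions made for illustration; the actual constant and mask (`.HASH` and the key mask in the listing) are not legible in the source.

```python
MASK64 = (1 << 64) - 1   # one 64-bit machine word
MASK48 = (1 << 48) - 1   # assumed operand width once the 16-bit exponent field is excluded


def hash_key(first_word, constant=0x9E3779B97F4B, key_bits=4):
    """Sketch of multiplicative hash coding on the first word of a token.

    `constant` is a hypothetical stand-in for the hash coding constant;
    `key_bits` follows the "4-BIT KEY" comment in the listing.
    """
    # Shift the query word right, out of the (assumed 16-bit) exponent field.
    shifted = (first_word & MASK64) >> 16
    # Hash coding multiplication: 48-bit by 48-bit gives up to 96 bits.
    product = shifted * (constant & MASK48)
    # Keep the most significant half of the double-length product.
    high_half = product >> 48
    # Clear all bits that are not part of the key (mask position assumed).
    return high_half & ((1 << key_bits) - 1)


if __name__ == "__main__":
    # A hypothetical first token word: a one-byte header followed by 7 characters.
    word = int.from_bytes(b"\x09PROCEDU", "big")
    print(hash_key(word))
```

With a 4-bit key the table has sixteen primary blocks, which is consistent with the listing falling back to extension blocks when no first word in the selected block matches.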
 