... common tools for microprocessors, mainly because the microcomputer is such a hostile environment for a resident assembler. These machines are typically configured with small memories, have limited instruction sets, and use very slow memory or I/O devices. Hence, it is usually more convenient to program on a powerful host system with time-sharing facilities, especially when a large number of programmers are involved.

For the same reasons, cross-compilers have been gaining in popularity. However, it has been observed that most compilers seldom generate code that is within ten percent in length of the same program written in AL [2]. In fact, most compilers usually do much worse. For a given microcomputer system, a programmer is often forced to squeeze his program into some specified amount of ROM. Hence, the inefficient code generated by a compiler may not be tolerable. If the compiler has the option of translating the source into AL mnemonics, the programmer may be able to recode at this lower level to increase efficiency. Otherwise, he is probably better off coding the entire program in AL. Other situations in which AL programming is preferable are discussed in [3]. Thus, it appears that cross-assemblers are going to be useful for much of the foreseeable future.

Initially, the MUMS assembler was conceived as a sort of "instant assembler." Upon execution, the program would initialize internal tables and switches according to the contents of an external file. This file would consist of all machine dependent information, e.g., mnemonics and corresponding object codes, default values for certain assembly functions, and a set of "templates." As used here, the word template may be interpreted as a mask used for building the machine code equivalent of a mnemonic statement.
Thus, any programmer with an understanding of basic assembly techniques could bring up a cross-assembler very quickly by simply constructing the proper external file for the target machine.

The assembler itself would contain all of the functions thought central to the assembly process, e.g., line scanning, table look-up, etc. A search of the opcode table would return a basic opcode bit pattern and a template number. A template look-up would then supply the details for processing the remaining fields of the instruction. The templates, as specified in the external file, could be visualized as a set of keywords with optional arguments. During execution, the assembler would translate these keywords into procedure calls. Hence, the operand processor for the assembler would be implemented as a collection of independent procedures, with the number of operands to be processed and with the particular procedures to be employed determined by the sequence of keywords in the template.

These concepts characterize an ideal universal assembler. To handle new assembly languages, the user should be concerned only with the specifics of the machine file, and not with the implementation details of the assembler itself. In practice, however, the templates will require a defining language which is almost as complex as some high level languages. Such extreme generality is necessary in order to handle the wide range of special cases normally encountered in the implementation of assemblers. If one simply chooses some fixed set of source operations, then he is faced with the difficult task of constructing a logical proof that the set is sufficient to handle all special cases, or he must be prepared to add fixes whenever these special cases appear. Thus, increased generality increases the complexity of the assembler. In effect, the assembler becomes a template compiler and pays a severe time penalty during every execution.
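
As a rough illustration of the external-template idea described above, the following Python sketch treats a template as a sequence of keywords, each of which is translated into a call on an operand procedure. The keyword names (REG, BYTE), the template numbers, and the operand field positions are hypothetical and chosen only for illustration; the LI bit pattern (046000 octal) is the one listed for the IMP-16. This is a sketch of the concept, not the MUMS implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword procedures: each one processes a single operand
# and returns the bits to be added to the basic opcode pattern.
def reg_field(operand: str) -> int:
    """Place a register number (0-3) in bits 8-9 of the word."""
    reg = int(operand)
    if not 0 <= reg <= 3:
        raise ValueError(f"illegal register: {operand}")
    return reg << 8

def byte_field(operand: str) -> int:
    """Place an 8-bit value in bits 0-7 of the word."""
    return int(operand, 0) & 0xFF

KEYWORDS: Dict[str, Callable[[str], int]] = {"REG": reg_field, "BYTE": byte_field}

# Template number -> keyword sequence (would be read from the machine file).
TEMPLATES: Dict[int, List[str]] = {1: ["REG", "BYTE"], 2: []}

# Mnemonic -> (basic opcode bit pattern, template number).
OPCODES: Dict[str, Tuple[int, int]] = {"LI": (0o046000, 1), "HALT": (0o000000, 2)}

def assemble(mnemonic: str, operands: List[str]) -> int:
    """Look up the opcode, then let the template's keywords process the operands."""
    code, template_no = OPCODES[mnemonic]
    keywords = TEMPLATES[template_no]
    if len(operands) != len(keywords):
        raise ValueError(f"{mnemonic} expects {len(keywords)} operand(s)")
    for keyword, operand in zip(keywords, operands):
        code += KEYWORDS[keyword](operand)
    return code
```

Even in this toy form, the keyword table and the template table together amount to a small interpreted language, which is the source of the run-time penalty noted above.
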
From the viewpoint of software maintenance, it is desirable for the assembler to remain as simple as possible. If the assembler is to be used often, a premium must also be placed on efficiency. The compromise to be proposed here is basically an assembler skeleton. This skeleton performs all of the bookkeeping chores, e.g., table management, line scanning and listing, and paper tape generation. Thus, a custom cross-assembler may be implemented by simply inserting the template definitions into the skeleton and by creating a new external opcode table. For maximum execution efficiency, the skeleton should obviously be written in AL. However, ease of maintenance and overall flexibility were deemed more important, so a HLL seemed more appropriate. This scheme offers all of the advantages of HLL programming and is more efficient than storing the templates externally. In addition, the technique is so flexible that it will usually be possible to mimic the manufacturer's AL specifications. Existing software can then be modified very quickly to pass through the new assembler. Using this approach, a programmer should be able to bring up a new cross-assembler in less than one week. These ideas have been tested by implementing a single assembler which generates code for the National IMP-16, Intel 8080, RCA COSMAC, Fairchild F8, and the MOS Technology 650X.

The technique described above is similar to that discussed by Conley [4]. Conley suggests the use of BASIC as the implementation language because of its wide availability, and presents a fair critique of its use in this application. The MUMS assembler was written in PASCAL, and is an extension of a program written by Jones and Lathrop [5]. This early work provided a quick start for the project, and so the decision was made to continue with PASCAL. This decision is difficult to justify in terms of portability, especially since DEC-10 PASCAL imposes several non-standard conventions.
However, the use of PASCAL is usually encouraged on the grounds that, since it is procedure-oriented, the resulting code is self-documenting and highly modular.

Earlier work in this area was reported in the late sixties ([6] and [7]). Georgiou's "Generalized Assembler" is coded in IBM 360 AL, and therefore conflicts with the philosophies detailed above. It also lacks the important element of convenience, as the local 360 does not directly support time-sharing. In addition, both programs ([6] and [7]) were designed specifically for a certain class of machines, namely the PDP-7, -8, and -9 computers. Although reference [7] was not available at the time of this writing, from available evidence it seems that both programs are closely tailored to match their target machines, and therefore lack the flexibility to adapt easily to current microprocessors.

II. FUNCTIONAL SPECIFICATION OF THE ASSEMBLER

An underlying assumption of the MUMS assembler is that the significant differences between assembly languages occur only in the operand fields of each instruction. Hence, the syntactic and semantic processing of each operand for a particular opcode may be accomplished by proper coding of the template. Conceptually, the assembler can be partitioned into two parts. One part handles most of the usual functions of an assembler. The second part is the set of templates, which must be customized for a given machine. The remainder of this section will attempt to define the responsibilities of each part and to show that this organization is sufficient to handle most machines.

Source Languages

The assembler processes mnemonic source statements in a free field format. Tables 1 and 2 present an EBF description of legal source statements. In order to keep the notation simple, the lexical description is somewhat informal. Hence, the following facts are included to supplement the lexical details of Table 2.
An explanation of the EBF notation appears in the appendix.

TABLE 1
EBF Description of Source File

    <...> = + EOF
    (ID COLON)* ( ((OPCODE | PSEUDO-OP) (++)?) | ((ID | DOT) EQ-OP) )? (SEMI COMMENT)?
    COMMA | SEMI | LP | RP | CR-LF
    <...> = (DOT | ID | NUM) ()?
    <...> = LAB RAB
    <...> =
    <...> = PLUS-OP | MINUS-OP | AND-OP | OR-OP | MUL-OP | DIV-OP
    <...> = PLUS-OP | MINUS-OP | COMP-OP
    <...> = END-PSEUDO ()?

TABLE 2
Textual Representation of Terminal Symbols

    LP          (
    COLON       :
    SEMI        ;
    COMMA       ,
    DOT         .
    SQUOTE      '
    DQUOTE      "
    LAB         <
    RAB         >
    RP          )
    PLUS-OP     +
    MINUS-OP    -
    AND-OP      &
    OR-OP       !
    DIV-OP      /
    MUL-OP      *
    COMP-OP     ¬
    EQ-OP       =
    PSEUDO-OP   .ASCII | .BYTE | .FORMAT | .LIST | .NLIST | .NPUNCH | .PAGE | .RADIX | .TABLE | .WORD
    OPCODE      ID specified by external opcode table

    'A' | ... | 'Z'
    '0' | ... | '9'
    'A' | ... | 'F'
    (DOT)? (|| DOT)* --> ID
    ( )? (|) --> NUM
    SQUOTE -->
    DQUOTE -->
    'B' | 'O' | 'D' | 'H'
    (any printing ASCII character, except underscore)* --> COMMENT

1. Source statements are contained in a single line with a maximum length of eighty columns. A line is terminated by a carriage-return/line-feed (CR-LF).

2. A terminal symbol can be bounded by an arbitrary number of spaces.

3. The expression non-terminal actually specifies the class of expressions which may be evaluated by the expression analyzer. Precedence is enforced by the use of angled brackets (less-than and greater-than), thereby freeing parentheses for special applications. The particular delimiter used to separate two expressions is a semantic issue, and must therefore be resolved in the template. An occurrence of DOT in an expression is a reference to the current value of the location counter.

Obviously, certain constructs generated by these rules are impractical. For example, long labels waste table space, while large numbers may cause a truncation error. Also, numbers are evaluated in the context of the current radix. Hence, for radix eight, the digit nine would cause an error message to be printed.
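
The radix rule just described can be sketched in Python as follows. The function name and the use of an exception in place of a printed error message are assumptions made for illustration; the sketch only shows how each digit is checked against the current radix while the value is accumulated.

```python
DIGITS = "0123456789ABCDEF"

def parse_number(text: str, radix: int) -> int:
    """Evaluate a digit string in the given radix (2, 8, 10 or 16).

    A digit that is not legal in the radix raises ValueError, standing in
    for the error message the assembler would print.
    """
    value = 0
    for ch in text.upper():
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= radix:
            raise ValueError(f"digit {ch!r} is not legal in radix {radix}")
        value = value * radix + digit
    return value
```

For instance, `parse_number("17", 8)` evaluates to fifteen, while `parse_number("19", 8)` is rejected because of the digit nine.
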
The current radix may be temporarily overridden by use of up-arrow (radix), where the temporary radix is B for binary, O for octal, D for decimal, and H for hex.

The phrase structure grammar of Table 1 concisely describes the syntactic aspects of programs acceptable to the assembler. A program is simply a collection of one or more statements terminated by the .END pseudo-op. Statements may consist of three parts: any number (zero or more) of labels, followed by an optional opcode, pseudo-op, or symbol definition part, followed by an optional comment part. Note that blank lines are legal. Note also that the only time a user defined template is executed is when an opcode appears in the statement. The template specifies how the operands, if present, should be processed. In most cases, the template will simply call the expression analyzer once for each operand and use the results to construct the object code. However, it is possible to write template programs that search for arbitrary tokens in the operand field. In effect, then, the user has the ability to alter the grammar and assign appropriate semantics to the result. For example, this facility may be used to distinguish between the addressing modes of an instruction, or to scan for predefined register mnemonics. It is this great flexibility in template design that allows the assembler to be easily adapted to a wide variety of assembly languages.

Pseudo-ops

The pseudo-ops listed in Table 2 are typical of those found in most assemblers. They are evaluated by a fixed set of templates, and any subset may be implemented for a given machine. No claim is made as to their universality. Should this set prove inconvenient, the user may easily install new pseudo-op templates to suit his needs. A brief description of each of the supplied pseudo-ops appears in the appendix.

Correspondence Between Source Language and Hardware

The manner in which certain instructions execute on a given machine often places subtle requirements on the construction of the object code. Basically, most of these problems are concerned with proper incrementation of the location counter. Some of the problems place stringent requirements on the design of the assembler skeleton, but many can be handled at the template level. The following special cases are not exhaustive, since an exhaustive study would necessarily have to include all machines. Rather, these cases are meant to be indicative of the types of problems that can occur during the design of an assembler. Any problems overlooked here will hopefully prove easy to solve, given the flexibility of the program and the power of PASCAL.

One of the more inconvenient problems encountered in designing a general purpose assembler is the fact that many machines have instructions which are variable in length. In these cases, the length of the instruction must be determined by an examination of the operands. For example, the MCS650X has a "zero page" addressing mode utilizing a one byte address, in addition to the more general form having a two byte address. Thus, an instruction such as ORA X may assemble into two or three bytes depending on whether X is assigned to page zero or not. Since programmers generally prefer to order their code in the same order as it will appear in memory, it seems reasonable to assume that page zero labels and source code will appear near the beginning of the program. With a zero page addressing mode, it is usually advantageous to keep frequently referenced data on page zero. Thus, most memory references appearing later in the program will generally be to page zero or to nearby locations (principle of locality of reference). The assembler should therefore default to absolute mode addressing when the operand is indeterminate and the opcode permits zero page reference.
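
The default rule stated above can be sketched as a small Python function. The function name and the convention of passing `None` for an indeterminate operand (for example, a forward reference during pass one) are assumptions made for illustration and are not taken from the MUMS source.

```python
from typing import Optional

def instruction_length(operand: Optional[int], zero_page_allowed: bool) -> int:
    """Length in bytes of a one-operand memory reference instruction.

    operand is None when its value is indeterminate, e.g. a forward
    reference that has not yet been entered in the symbol table.
    """
    if operand is not None and zero_page_allowed and 0 <= operand <= 0xFF:
        return 2  # opcode byte + one-byte zero page address
    return 3      # opcode byte + two-byte absolute address
```

An indeterminate operand therefore always reserves three bytes, which keeps the location counter consistent as long as the label does not later turn out to lie on page zero.
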
Defaulting to absolute addressing in this way requires the user to be careful that forward references to page zero labels do not occur.

Traditionally, the first pass of an assembler has been considered responsible only for construction of the symbol table. Hence, the first pass was usually designed to run as quickly as possible. This is practical only if the length of each instruction is known in advance. To determine this length, however, it is clear that the assembler must attempt to evaluate the source statement completely; otherwise the location counter will not be properly maintained, and stored labels may be assigned incorrect values. For simplicity, the assembler therefore assembles the source file in two identical passes. An internal switch prevents any output during pass one, thereby suppressing spurious error messages. Unfortunately, this simple approach involves a duplication of effort, since the table searching for labels and opcodes is repeated in the second pass. Many assemblers get around this problem by producing a temporary file during pass one. Pass two then needs only to scan the intermediate code and reprocess the operand field in order to construct the object code. Besides simplicity, however, the simpler approach offers an interesting bonus. If the location counter becomes incorrect because of an indeterminate operand during pass one, succeeding labels will have incorrect symbol table entries. During pass two, these labels will be flagged as multiply defined. This type of error would otherwise go undetected.

One should also be aware of the differences involved in computation of branch addresses. During execution, many machines increment their program counter before adding branch offsets. The assembler should then compute the offset from the address following the instruction. However, for the F8, the offset is from the first byte of the branch instruction. This detail is easily resolved in the template.
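
The two branch conventions described above differ only in the base address from which the offset is measured. A minimal Python sketch follows; the function name, the flag, and the eight-bit signed offset field are assumptions made for illustration.

```python
def branch_offset(target: int, instr_addr: int, instr_len: int,
                  relative_to_next: bool = True) -> int:
    """Return an 8-bit two's complement branch offset.

    relative_to_next=True  : offset measured from the address following
                             the instruction (program counter already
                             incremented).
    relative_to_next=False : offset measured from the first byte of the
                             branch instruction.
    """
    base = instr_addr + instr_len if relative_to_next else instr_addr
    offset = target - base
    if not -128 <= offset <= 127:
        raise ValueError("branch target out of range")
    return offset & 0xFF
```

In a template, the choice of base is a single expression, which is why the difference between machines is easy to absorb at that level.
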
The final problem involving location counter maintenance requires a distinction to be made between machines that are byte-addressable and machines that are word-addressable. For the word-addressable machine, the location counter is simply incremented once for each word of object code produced, whereas the byte-addressable machine must add N (bytes per word). In some cases, it may also be appropriate to check for word boundary errors. For example, in the PDP-11, a 16-bit word may not start at an odd address. As before, putting the location counter under template control permits an easy solution of these problems.

The templates are contained in a single PASCAL procedure. Besides contributing to software modularity, this structure provides a simple solution to the assembly of certain awkward opcodes. As an example, consider the various mnemonic operand combinations for the LR instruction of the Fairchild F8. This instruction assembles into the seemingly random bit patterns listed in Table 3. The number of cases involved suggests that a rather large and messy template would be required. However, the problem has a simple recursive solution. First, concatenate the opcode and operand fields to form a new "opcode," and then execute the template associated with this new "opcode." This second template, which is invoked by a recursive call of the template procedure, will trivially push the stored "opcode" bit pattern into the object buffer. This technique permits a direct means of translating mnemonic phrases into arbitrary bit patterns and results in very simple templates. On the negative side, however, this technique results in a much larger opcode table for this example. Searching this larger table increases the assembly time for all instructions.

TABLE 3
LR Instruction of F8

    Mnemonic     Machine Code
    LR DC,Q      0F
    LR DC,H      10
    LR P0,Q      0D
    LR P,K       09
    LR Q,DC      0E
    LR H,DC      11
    LR K,P       08
    LR A,r       4r
    LR A,KU      00
    LR A,KL      01
    LR A,QU      02
    LR A,QL      03
    LR r,A       5r
    LR KU,A      04
    LR KL,A      05
    LR QU,A      06
    LR QL,A      07
    LR IS,A      0B
    LR W,J       1D
    LR A,IS      0A
    LR J,W       1E

The preceding discussion was given to help justify the final form of the assembler. Georgiou's "generalized assembler" is, in comparison, a very rigidly specified software package. The additions to his program which specify a new machine really only change insignificant formatting details (his program reads fixed field input). The program therefore translates a fixed grammar according to a fixed set of semantics. Thus, the program is able to assemble code only for the class of machines for which it is designed. Furthermore, there is nothing in Georgiou's thesis to suggest that it would be simple to modify his work to handle current microprocessors. In contrast, the MUMS assembler surmounts these problems through use of a flexible assembler skeleton and user defined templates. The use of HLL allows rapid template design and debugging, but also results in a relatively less efficient program. In terms of convenience and maintainability, however, the program is felt to be an excellent compromise.

III. USE OF THE ASSEMBLER

Now that some justification has been given for the structural design of the assembler, it seems appropriate to relate some details concerning its use. The current version of the assembler contains extensions which facilitate its use in a time-shared environment. The host computer is a DEC PDP-10, and the assembler therefore shares the flavor of other DEC software. The virtues of time-sharing are well-known, and hence need not be elaborated here. The scope of the following material will be limited to the procedural aspects of the program. Further information on the implementation details is available in two forms - a user's manual and a maintenance manual.

Running Programs Through the Assembler

After creating or editing a source file with one of the system text editors, this file may be assembled by issuing the following monitor command.

    RUN ASSEM [4577,5240] 14

The number in brackets is a project-programmer number, which specifies the disk area where the assembler currently resides. The remaining number indicates the amount of core needed to run the program. At the start of execution, ASSEM will prompt with an asterisk. The user is then expected to enter a filename with the following form.

    ( ((|)+)? DOT EXTENSION )

Currently valid extensions are listed in Table 4. If, for example, the source file was named PROG.INT, the assembler would 1) load the internal opcode table from file OP.INT, 2) open file PROG.LST for an INTEL 8080 assembly listing, and 3) if the .NPUNCH directive is not detected during pass one, open file PROG.TAP for the absolute object code generated by pass two. Source lines in error and error messages are then printed at the terminal. The program will return control to the monitor after printing the number of errors. The user should then correct the source and iterate as necessary.

TABLE 4
Valid Source Extensions

    EXTENSION    MACHINE
    INT          INTEL 8080
    MOS          MOS TECHNOLOGY 650X
    FFR          FAIRCHILD F8
    COS          RCA COSMAC
    IMP          NATIONAL IMP-16

Implementation of New Source Languages

In the previous section, it was explained that the assembler selects the opcode file at run-time according to the type (extension) of source file entered. Hence, one may implement a new language simply by creating a new opcode file OP.(unique extension) and by installing new templates in the skeleton. As an example, these additions will be considered for the IMP-16. This machine was chosen because its source code is reasonably complex yet straightforward to assemble.
In particular, this example should demonstrate the assertion made in the introduction - that storing templates externally for interpretive execution adds needless complexity to the assembler.

The IMP-16 assembler is implemented by the additions of Figures 1 and 2. The first line of the opcode file is a title which is printed on each page of the assembly listing. The second line sets internal switches in the assembler. 'H' sets a default hex listing mode for addresses and object code. The next digit specifies the number of hex digits needed to encode the address space. An optional character not appearing in this case is 'B', which indicates a byte-addressable machine. The default is word-addressable. This switch is used internally to select between byte and word options. For example, a full word checksum will be generated for an IMP-16 paper tape load block.

    NATIONAL IMP-16 ASSE
    H4
    .ASCII   6
    .END     01
    .FORMAT  12
    .LIST    07
    .NLIST   10
    .NPUNCH  13
    .PAGE    11
    .RADIX   05
    .TABLE   14
    .WORD    3
    ADD      140000  23
    AISZ     044000  29
    AND      060000  24
    BOC      010000  27
    CAI      050000  29
    DSZ      076000  25
    HALT     000000  33
    ISZ      074000  25
    JMP      020000  26
    JSR      024000  26
    JSRI     001600  28
    LD       100000  22
    LI       046000  29
    OR       064000  24
    PFLG     004200  34
    PULL     042000  31
    PULLF    001200  33
    PUSH     040000  31
    PUSHF    000200  33
    RADD     030000  32
    RAND     030203  32
    RCPY     030201  32
    RIN      002000  28
    ROL      054000  29
    ROR      054000  30
    ROUT     003000  28
    RTI      000400  28
    RTS      001000  28
    RXCH     030200  32
    RXOR     030202  32
    SFLG     004000  34
    SHL      056000  29
    SHR      056000  30
    SKAZ     070000  24
    SKG      160000  23
    SKNE     170000  23
    ST       120000  22
    SUB      150000  23
    XCHRS    052000  31

    Figure 1. IMP-16 opcode file

The remaining lines of the opcode file are stored in the opcode table. Each record in this table is composed of three fields: a basic opcode bit pattern (second column of Figure 1, in octal), a template number (third column, in decimal), and a pointer into a character array where the textual representation of the opcode is stored.
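
A loader for a file laid out like Figure 1 might look roughly like the following Python sketch. It is not the MUMS code: a dictionary keyed by mnemonic stands in for the character array and pointer described above, the position of the optional 'B' after the digit is an assumption, and all names are hypothetical. The sample lines follow the three-column layout of the figure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class OpcodeRecord:
    mnemonic: str   # textual representation of the opcode
    code: int       # basic opcode bit pattern (octal in the file)
    template: int   # template number (decimal in the file)

def parse_switches(line: str) -> Tuple[str, int, bool]:
    """Split a switch line such as 'H4' into
    (listing mode, hex digits in an address, byte-addressable flag)."""
    text = line.strip().upper()
    mode = text[0]
    digits = int(text[1])
    byte_addressable = "B" in text[2:]   # assumed to follow the digit
    return mode, digits, byte_addressable

def load_opcode_file(lines: List[str]):
    """Return (title, switches, opcode table) from the lines of an opcode file."""
    title = lines[0].strip()
    switches = parse_switches(lines[1])
    table: Dict[str, OpcodeRecord] = {}
    for line in lines[2:]:
        fields = line.split()
        if len(fields) != 3:
            continue  # this sketch skips blank or malformed lines
        mnemonic, code, template = fields
        table[mnemonic] = OpcodeRecord(mnemonic, int(code, 8), int(template))
    return title, switches, table

SAMPLE = [
    "NATIONAL IMP-16",
    "H4",
    "ADD 140000 23",
    "LD 100000 22",
    "JMP 020000 26",
]
```

Calling `load_opcode_file(SAMPLE)` gives a table in which `table["ADD"]` carries the bit pattern 140000 (octal) and template number 23.
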
Note that all of the pseudo-ops share the same template number. In Figure 2, template zero calls the pseudo-op evaluator, which in turn executes the pseudo-op template selected by a case on the stored "opcode" bit pattern. Similarly, normal opcodes case directly to the appropriate template. If an opcode is not found in the table, the OPSRCH function returns a pointer to an error record. This results in the selection of reserved pseudo-op template number zero, thereby printing an error message.

The assembler skeleton provides several procedures and functions which simplify the task of template coding. EXPRESSION evaluates the expression non-terminal according to the productions of Table 1. ERROR globalizes error handling, while OUTWORD automates insertions into the object buffer. In addition, two user defined functions provide further template simplifications. With minor variations, a large group of IMP instructions has the following form [9].

    OPCODE r,@disp(xr)

Function EA evaluates the effective address given by the second operand field. All three parts of this field are treated as optional. If indirect addressing is legal for the given instruction, then IND (not equal to zero) will specify the location of the indirect bit. This bit is then set if '@' is present in the source. Note that the default addressing mode is relative to the program counter (XR=1). The second function, DISP, truncates the displacement to eight bits and prints one of two error messages if the truncated bits are not all zeroes or all ones.

The preceding discussion indicates the relative ease with which new syntax rules and semantics can be coded into templates. Error checking can be as elaborate as desired. For example, the registers accessible to the programmer are AC0, AC1, AC2, and AC3. In template twenty-four, however, only AC0 and AC1 are legal. Of course, if a programmer wishes to use these register mnemonics, he must declare them in symbol definitions.
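
The bit packing performed by EA and DISP, as described above, can be sketched in Python. Operand scanning is not modeled: the register, index register, displacement, and indirect bit are passed in directly, and the parameter names are hypothetical. The word layout follows the expression (4*TAND1+XR)*256+DISP that appears in Figure 2, and values are assumed to lie within a 16-bit range.

```python
def disp(value: int) -> int:
    """Truncate a displacement to eight bits.

    The truncated bits (8-15) must be all zeroes or all ones; otherwise a
    ValueError stands in for the assembler's error message.
    """
    high = value & 0xFF00
    if high not in (0x0000, 0xFF00):
        raise ValueError(f"displacement {value} does not fit in eight bits")
    return value & 0x00FF

def memory_reference(opcode: int, r: int, displacement: int,
                     xr: int = 1, indirect_bit: int = 0) -> int:
    """Build a memory reference word of the form OPCODE r,@disp(xr).

    indirect_bit is the IND value when '@' is present, else 0; it is added
    to the register field before scaling, as in Figure 2.
    """
    return opcode + (4 * (r + indirect_bit) + xr) * 256 + disp(displacement)
```

With the LD bit pattern 100000 (octal), register 1, the default XR of 1, and a displacement of -3, `memory_reference(0o100000, 1, -3)` produces 85FD (hex); passing `indirect_bit=4` additionally sets bit 12.
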
As an alternative to symbol definitions, a procedure could have been written to recognize these register mnemonics. Hence, this implementation is by no means a unique solution. The implementor is free to include as many features as suit him.

    PROCEDURE PROCESSOPCODE;
    VAR TAND1,TAND2: INTEGER;

    FUNCTION DISP(ERR: ERRORTYPE): INTEGER;
    BEGIN
      DISP:=0;
      IF (ANDF(TAND2,HFF00)<>0) AND (ANDF(TAND2,HFF00)<>HFF00)
        THEN ERROR(ERR)
        ELSE DISP:=ANDF(TAND2,H00FF)
    END;

    FUNCTION EA(IND: INTEGER): INTEGER;
    /* DECODES ADDRESSING MODES FOR THE IMP-16 */
    VAR XR: INTEGER;
    BEGIN
      XR:=1; TAND2:=0;
      IF ¬EOLF THEN
        BEGIN
          FIRSTCH(CH);
          IF (CH='@')
            THEN IF (IND<>0)
                   THEN TAND1:=TAND1+IND
                   ELSE ERROR(BADEX)
            ELSE LINBUFP:=PRED(LINBUFP);
          IF LINBUF[SUCC(LINBUFP)]='('
            THEN LINBUFP:=SUCC(LINBUFP)
            ELSE TAND2:=EXPRESSION;
          IF ¬EOLF THEN XR:=EXPRESSION;
          IF (XR=1)
            THEN TAND2:=TAND2-SUCC(LLOCC)
        END;
      EA:=(4*TAND1+XR)*256 + DISP(BRRAN)
    END;

    BEGIN
      OPID:=OPSRCH(OPLEN);
      OPCODE:=OP[OPID].CODE;
      /* DECODE AND PROCESS OPCODE */
      CASE OP[OPID].MODE OF
        0: PSEUDOOP;

        /* TEMPLATES 22:34 PROCESS IMP-16 OPCODES */
        22: BEGIN /* LD,ST */
              TAND1:=EXPRESSION;
              IF TAND1>3 THEN ERROR(NOREG);
              OUTWORD(EA(4)+OPCODE)
            END;
        23: BEGIN /* ADD,SUB,SKNE,SKG */
              TAND1:=EXPRESSION;
              IF TAND1>3 THEN ERROR(NOREG);
              OUTWORD(EA(0)+OPCODE)
            END;
        24: BEGIN /* AND,OR,SKAZ */
              TAND1:=EXPRESSION;
              IF TAND1>1 THEN ERROR(NOREG);
              OUTWORD(EA(0)+OPCODE)
            END;
        25: BEGIN /* DSZ,ISZ */
              TAND1:=0;
              OUTWORD(EA(0)+OPCODE);
            END;
        26: BEGIN /* JMP,JSR */
              TAND1:=0;
              OUTWORD(EA(1)+OPCODE)
            END;
        27: BEGIN /* BOC */
              TAND1:=EXPRESSION;
              TAND2:=EXPRESSION-SUCC(LLOCC);
              IF (TAND1>15) THEN ERROR(BADEX);
              OUTWORD(OPCODE+256*TAND1+DISP(BRRAN));
            END;
        28: BEGIN /* JSRI,RIN,ROUT,RTI,RTS */
              IF ¬EOLF THEN TAND1:=ANDF(EXPRESSION,H007F)
                       ELSE TAND1:=0;
              OUTWORD(OPCODE+TAND1)
            END;
        29: BEGIN /* AISZ,CAI,LI,ROL,SHL */
              TAND1:=EXPRESSION;
              IF (TAND1>3) THEN ERROR(NOREG);
              IF ¬EOLF THEN TAND2:=EXPRESSION
                       ELSE TAND2:=0;
              OUTWORD(OPCODE+256*TAND1+DISP(BTRNC))
            END;
        30: BEGIN /* ROR,SHR */
              TAND1:=EXPRESSION;
              IF (TAND1>3) THEN ERROR(NOREG);
              TAND2:=-EXPRESSION;
              OUTWORD(OPCODE+256*TAND1+DISP(BTRNC));
            END;
        /* PULL,PUSH,XCHRS */
        31: OUTWORD(EXPRESSION*256+OPCODE);
        32: BEGIN /* RADD,RAND,RCPY,RXCH,RXOR */
              TAND1:=EXPRESSION;
              TAND2:=EXPRESSION;
              OUTWORD(OPCODE+256*(4*TAND1+TAND2));
            END;
        33: OUTWORD(OPCODE); /* HALT,PUSHF,PULLF */
        34: BEGIN /* PFLG,SFLG */
              TAND1:=EXPRESSION;
              IF (TAND1>7) THEN ERROR(BADEX);
              IF ¬EOLF THEN TAND2:=ANDF(EXPRESSION,H007F)
                       ELSE TAND2:=0;
              OUTWORD(OPCODE+256*TAND1+TAND2)
            END
      END;
    END;

    Figure 2. IMP-16 opcode templates

Criticism of the Program

In its present form, the MUMS assembler offers several interesting advantages in addition to those discussed previously. Conceptually, the technique results in a collection of sibling assemblers which differ only in opcode mnemonics and in interpretation of operands.
This uniformity between siblings guarantees that a user who is expert in the use of one sibling can learn to use another sibling by studying only the instruction set of the machine in question. The cost of training a programmer in the intricacies of a new AL is therefore minimized.

The current program also effectively packs five assemblers into the disk space required for one. However, the five languages implemented thus far require thirty-five templates. This makes for a rather large and unwieldy template procedure, and therefore suggests a practical limit on the number of languages that should be implemented by a single copy of the program. This proposed limitation is not for cosmetic purposes, since the appearance of the source listing has little relation to efficiency. Neither is core a problem, since the template procedure presently accounts for less than ten per cent of the total core requirements. The real objection is that an already poor execution time deteriorates further because of the search time required to select templates near the end of the case list. An elegant solution to this problem might entail dynamic selection and loading of a template procedure at run time in the same manner as opcode table initialization. This feature would further reduce development costs, since only a relatively short template procedure would have to be compiled and debugged. There is a remote possibility that this feature could be implemented with a recent (undocumented) version of the PASCAL compiler.

While PASCAL is a beautiful language from a maintenance standpoint, PASCAL programs usually do not execute very efficiently. For one thing, since PASCAL is procedure oriented, one faces the overwhelming temptation to make every subtask a procedure, even if only a single call may be necessary. This produces a very readable source listing, but, unfortunately, it also creates unnecessary stack overhead at run time. In addition, the use of operators like AND, OR, and SHIFT seems natural for the task of object code construction. However, since DEC-10 PASCAL variables are strongly typed, "boolean" operators may not be applied to integer data types. This problem was rather inefficiently solved by writing functions ANDF and ORF, which perform the required boolean operations through use of the MACRO construct. This feature allows DEC-10 assembly language statements to be present in the PASCAL source. SHIFT was implemented by integer multiplication and division. However, when an arithmetic right shift with sign bit extended is required, one must be careful that the sign bits do not disappear (i.e., the result of the division is zero). This is ensured by simply truncating unused high order bits prior to the shift.

Besides their gross inefficiency, the ANDF and ORF functions are clearly undesirable in terms of transportability. However, this is not viewed as a particularly serious defect, since DEC-10 programs are usually not transportable without a great deal of modification anyway. One major incompatibility stems from the lack of the standard PASCAL EOL character. Instead, DEC-10 PASCAL supplies two procedures, READLN, which sets EOLN to false and moves the "read head" past a CR-LF on the input medium, and WRITELN, which outputs a CR-LF. Many of the built-in functions are oriented towards on-line use and require a file name in their parameter lists. In addition, the quality of the PASCAL compiler leaves much to be desired, as many things are implemented either poorly or not at all. For example, to read integer data types with the READ procedure, the read head must first be manually positioned to the start of the ASCII digit string via the GET procedure. Similarly, READLN does nothing if a CR-LF is not under the read head. Hence, the I/O routines in ASSEM will probably require considerable revision before they will operate properly elsewhere.
In order to gain some insight into performance, the MUMS assembler (ASSEM) was benchmarked against a local assembler (ASM16) for the IMP-16. ASM16 is actually a modified version of the DEC-10 MACRO assembler, and requires a minimum of 21P of core, where a page (P) is 512 thirty-six bit words [10]. In contrast, ASSEM occupies a minimum of 26P. However, these figures are slightly misleading. After compilation, a typical DEC-10 program resides on disk in two separate files, which are usually referred to as the low and high segments. The low segment contains information about constants, variables, and data structures, while the high segment contains absolute object code. The symbol table in ASM16 is managed by dynamic techniques, and hence starts with five pages of core. ASSEM uses fixed size tables, and must therefore request enough space to satisfy the largest number of symbols expected. The current version allows 500 symbols, thereby resulting in a low segment of 13P. Because the symbol table is hash coded, a large table decreases table search time: it maximizes the probability that a symbol will quickly hash into a unique location, albeit at the cost of sparse tables. During execution, ASSEM will require a few additional pages for stack management. The difference of 3P in the high segments is apparently due to the MACRO facilities in ASM16, which are, of course, not present in ASSEM.

The relative execution efficiencies of the two assemblers were compared by assembling similar IMP-16 programs. The test routine implemented the CORDIC rotation algorithm for generation of SIN and COS. Each version of this routine contained about 225 source lines and differed only in the pseudo-ops present. ASM16 handled the test data in 1.85 seconds, while ASSEM required 5.88 seconds, a factor of about 3.2. This rather poor showing is due primarily to the enormous overhead of character manipulation in ASSEM.
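To suggest where this overhead comes from, the following Python sketch converts a digit string to an integer one character at a time, which is roughly the kind of work a program must do when its language offers no string functions. The names `digit_value` and `convert`, and the linear search through the digit set, are illustrative assumptions and are not taken from ASSEM.

```python
DIGITS = "0123456789ABCDEF"


def digit_value(ch):
    """Return the numeric value of one character by linear search, or -1."""
    for i in range(len(DIGITS)):
        if DIGITS[i] == ch:
            return i
    return -1


def convert(text, radix=10):
    """Convert a digit string to an integer, one character at a time."""
    total = 0
    for ch in text:
        d = digit_value(ch)
        if d < 0 or d >= radix:
            raise ValueError("bad digit for radix %d: %r" % (radix, ch))
        total = total * radix + d      # one multiply and one add per character
    return total
```

Every character costs a search, a range check, a multiplication, and an addition, and the same pattern repeats for scanning and symbol comparison.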
All operations involving line scanning, symbol matching, and conversions from character string to internal integer values are performed on a per character basis. The lack of efficient string processing facilities in PASCAL means that the program has to do the work itself. DEC-10 PASCAL does allow string variables, but there are no functions available that do anything useful with them. Standard PASCAL provides only PACK and UNPACK, which are not yet implemented in the DEC-10 version. What are really needed in this application are built-in functions similar to those provided by PL/C (e.g., the string concatenation operator, SUBSTR, INDEX, etc.). These functions could be simulated via the MACRO feature of DEC-10 PASCAL, but the resulting code would hardly lend itself to casual maintenance.

IV. APPLICATIONS

The current version of the MUMS assembler mimics five existing assembly languages. However, the same techniques can obviously be used to translate almost any AL-like source. This realization suggested the following applications.

Microprogramming

User microprogrammable minicomputers have been on the market for some time. The semiconductor houses also supply chip sets which allow the user to build his own microprogrammable hardware. Hence, there has been considerable interest in microlanguages. Vertical microinstructions perform single functions and typically have short word lengths. A corresponding microlanguage would then appear to be very similar to conventional assembly languages and could easily be implemented by the MUMS assembler. In contrast, horizontal microinstructions may simultaneously control several resources and therefore require much larger word lengths. Since horizontal microinstructions are very awkward to assemble by conventional AL techniques, the MUMS assembler is expected to have limited utility. An excellent introduction to these problems may be found in [11].

Universal Assembly Language

The
proliferation of inexpensive processors has greatly complicated the efforts of software designers. The question of "Which machine to use?" continues to consume many man-hours of investigation. However, even when the processor has been selected, many programmers complain about a problem analogous to re-inventing the wheel: before any serious software development can begin, the necessary development tools must be purchased or developed in-house. This development facility may reside solely on some host system, solely on the target machine itself, or on some combination of the two. The components of this development facility are usually a subset of the following: an assembler, compiler, interpreter, and some basic software package for the target machine (e.g., a linking (?) loader, monitor, and a debug program). Since these components are obviously machine dependent, it may be prohibitively expensive to maintain development stations for more than one machine.

One solution of current interest is the use of a general purpose assembly language (GPAL). Conceptually, GPAL represents the instruction set of a carefully defined virtual machine. GPAL instructions have the "nice" property that they map easily (not necessarily one-to-one) into instructions for any real machine. If all systems and applications programs were written in GPAL, this software could then be easily bootstrapped into a new machine by writing a simple translator. The design of GPAL is beyond the scope of this thesis. Hence, the following discussion will be limited to the use of the MUMS assembler with GPAL.

After completely specifying the GPAL machine, it is straightforward to implement an assembler for GPAL mnemonics. The GPAL object code may then be executed interpretively by any real machine. Address computation is performed by the interpreter, so that GPAL code is effectively position independent. Since only one assembler is involved, this method results in a clean interface with the host system.
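The interpretive scheme can be illustrated with a toy example in Python. Since the design of GPAL is outside the scope of this work, the opcodes below (LOADI, ADDI, JUMP, HALT) are purely hypothetical, as are the names `load` and `run`. The sketch only shows how an interpreter that adds the load address on every fetch makes the object code independent of where it is placed.

```python
# Toy virtual machine; the opcodes are hypothetical and are not GPAL.
PROGRAM = [
    ("LOADI", 5),    # acc := 5
    ("JUMP", 3),     # go to instruction 3, counted from the program start
    ("ADDI", 100),   # skipped by the jump
    ("ADDI", 1),     # acc := acc + 1
    ("HALT", 0),
]


def load(program, base):
    """Place the program in a memory image starting at address `base`."""
    return [None] * base + list(program)


def run(memory, base):
    """Interpret the program at `base`; jump targets are program-relative."""
    acc = 0
    pc = 0
    while True:
        op, arg = memory[base + pc]    # the interpreter adds the base here
        pc += 1
        if op == "LOADI":
            acc = arg
        elif op == "ADDI":
            acc += arg
        elif op == "JUMP":
            pc = arg
        elif op == "HALT":
            return acc
        else:
            raise ValueError("unknown opcode: " + op)
```

Running the same program at two different load addresses, for example `run(load(PROGRAM, 0), 0)` and `run(load(PROGRAM, 40), 40)`, gives the same result without any change to the object code.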
However, this technique is undesirable because a rather lengthy interpreter must be written in AL for each target machine. Since the interpreter must be resident, memory, still a rather valuable system resource, is wasted. Finally, the overhead of interpretation may degrade system performance intolerably.

Alternatively, each GPAL instruction could be assembled directly into the object code of the target machine. This technique corrects the above deficiencies, but it also lacks the advantages. However, any complexities in the host-target machine interface clearly belong on the host side. It is clearly preferable to write a set of templates in HLL, as opposed to writing a set of interpreters in AL. Unfortunately, a unique template will probably be required for each GPAL instruction. This means that the template procedure will be fairly large, which dictates the use of one translator program per machine.

It is also possible to use the MUMS assembler as a translator from GPAL mnemonics to machine-X mnemonics. This feature would make it possible for the user to do his own optimization on GPAL routines after translation. As before, this function requires a large template procedure, and hence a dedicated program for each machine. Since GPAL is a logical choice for compiler pseudo-code, it is hoped that these ideas aid its development.

V. SUGGESTIONS FOR FUTURE WORK

The ideas in the preceding section are straightforward applications for the MUMS assembler. The following discussion will be concerned with options that require modifications to the skeleton. These options have so far been omitted because of lack of time, and because of a desire to keep the program as simple as possible. When it became apparent that the assembler was not particularly efficient, it also seemed logical to suppress the overhead associated with any unnecessary functions. However, future improvements in the skeleton should eliminate this objection.
Finally, it should be remembered that the skeleton is really a collection of useful modules that can easily be incorporated into other PASCAL programs.

Conversational Assembler

Instead of using one of the system editors, some users may prefer to enter their programs via a conversational mode of the assembler. While the user enters his program, the assembler concurrently performs the first pass. If the current line has an obvious error, the assembler should complain and request that the line be retyped. Otherwise, the assembler will prompt for the next line. This feedback increases the probability of an error-free first run. However, the user sacrifices much of the flexibility offered by the system text editors, and will probably spend more time debugging programs on-line, thereby increasing development costs.

An effective compromise might involve taking the source program from disk as usual. During the first pass, the assembler could request that erroneous lines be re-entered. This is, of course, a selective error correction technique. For example, an unknown label in the operand field could be a valid forward reference, and should therefore be passed. Errors detected during pass two (e.g., undeclared label, or a branch address out of range) are much more serious, so that the user should be forced to make his repairs with the system editor. This built-in editing function offers some convenience, as it shortens the edit-assemble cycle.

Macro Assembler

Most advanced programmers agree that a macro facility should definitely be present in an assembler. Macros increase programmer effectiveness and enhance program readability. In addition, a macro facility might offer a simple means of implementing GPAL. GPAL macros could exist in a separate disk file for each machine. Assembly would then proceed in three passes, with the macro substitutions occurring during the first pass.
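A minimal sketch of such a macro substitution pass is given below in Python. The macro table, the GPAL-style mnemonics MOVE and INCR, and the target mnemonics LOAD, ADD, and STORE are all hypothetical; a dictionary stands in for the per-machine disk file, and labels and comments are ignored for brevity.

```python
# Hypothetical macro definitions for one target machine ("machine X").
# {0}, {1}, ... are replaced by the operands of the macro call.
MACROS_MACHINE_X = {
    "MOVE": ["LOAD {0}", "STORE {1}"],
    "INCR": ["LOAD {0}", "ADD ONE", "STORE {0}"],
}


def expand(lines, macros):
    """First pass: replace each macro call with its target-machine body."""
    out = []
    for line in lines:
        parts = line.split(None, 1)        # mnemonic, then the operand field
        if not parts:
            continue                       # skip blank lines
        mnemonic = parts[0]
        operands = []
        if len(parts) > 1:
            operands = [p.strip() for p in parts[1].split(",")]
        if mnemonic in macros:
            for body in macros[mnemonic]:
                out.append(body.format(*operands))
        else:
            out.append(line.strip())       # not a macro: pass through unchanged
    return out
```

With these definitions, `expand(["MOVE A, B", "HALT"], MACROS_MACHINE_X)` yields the three lines `LOAD A`, `STORE B`, and `HALT`, which the two ordinary assembly passes could then process.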
However, since the MUMS assembler is already a somewhat large and clumsy program, it might be worthwhile to implement the macroprocessor as a separate program in a language better suited to the purpose, such as SNOBOL. If desired, the two programs could be linked at run time to accomplish the desired results.

Relocatable Output

The MUMS assembler currently produces absolute object code. While this is adequate in many situations, the flexibility of the relocatable scheme may eventually be desired.

Conclusion

Computer programming is, and probably always will be, an art. Hence, most programmers find it difficult to determine when their products are ready for release, as it is possible only in exceptional instances to state conclusively that a program is optimal in some respect, or even that it is correct. However, after a reasonable amount of testing, it is usually possible to establish a qualitative level of confidence in the software. In the case of the MUMS assembler, programs for several different machines have been successfully assembled and executed. Furthermore, the programming technique appears to be easily extensible to other machines.

As discussed in previous sections, the implementation of the assembler skeleton is relatively inefficient. This is due partly to the goals of the project, and partly to the limitations imposed by the DEC-10 PASCAL compiler. Thus, it appears that skeletal improvements may be a continuing effort over the life of the program. However, as it stands, the MUMS assembler is considered to be a functionally complete and useful program.

REFERENCES

[1] V. S. Lamb, "All About Cross-Assemblers," Datamation, vol. 19, No. 7, pp. 77-80, July 1973.
[2] C. Popper, "SMAL - A Structured Macro-Assembly Language For A Microprocessor," COMPCON FALL 74, September 1974.
[3] A. Opler, "Is Assembly Language Programming Passe?" Data Processing Digest, vol. 14, pp. 1-15, October 1968.
[4] S. W.
Conley, "Portable Cross-Assemblers in BASIC," Computer, vol. 10, No. 10, pp. 32-42, October 1975.
[5] D. Jones and S. Lathrop, "An Assembler for the INTEL 8080 on the PDP-11," CS491 class project, University of Illinois, Urbana, Illinois, Spring 1975.
[6] C. Georgiou, "A Generalized Assembler," Master's thesis, University of Illinois, Urbana, Illinois, January 1969.
[7] "PDP-8 Assembler," U.S. Government Research and Development Reports, vol. 68, p. 96, November 10, 1968.
[8] T. Wilcox, CS326 class notes, University of Illinois, Urbana, Illinois, Spring 1976.
[9] IMP-16 APPLICATIONS MANUAL, National Semiconductor Corporation, Santa Clara, California, June 1973.
[10] DECSYSTEM-10 MONITOR CALLS MANUAL, Digital Equipment Corporation, Maynard, Massachusetts, May 1974.
[11] A. Agrawala and T. Rauscher, "Microprogramming: Perspective and Status," IEEE Trans. on Computers, vol. C-23, no. 8, pp. 817-837, August 1974.

APPENDIX

Supplied Pseudo-ops

.ASCII 'string'
    The ASCII characters in the string are packed (left-justified) into consecutive memory locations. If necessary, blanks are inserted to fill the rightmost bytes of the last word. The string may be delimited by any character not occurring in the string except for ';', '_', and '='.

.BYTE exp1, exp2, ...
    The values of the expressions are truncated to eight bits, and then loaded into consecutive locations of memory.

.END exp
    This directive specifies the end of the current pass. On pass two, the optional expression specifies the starting address for load-and-go execution.

.FORMAT (HEX or OCTAL)
    The argument of .FORMAT changes the default listing mode for addresses and object code to hex or octal.

.LIST
    Turns on the assembly listing.

.NLIST
    Turns off the assembly listing.

.NPUNCH
    Suppresses paper tape generation for the entire program.

.PAGE
    Forces a new page if .LIST is in effect.

.RADIX exp
    The default radix for evaluation of constants is set to the value of the expression.
.TABLE
    Outputs an alphabetically sorted symbol table.

.WORD exp1, exp2, ...
    Consecutive words of memory are set to exp1, exp2, etc.

Description of EBF Notation

EBF is an extended version of BNF (Backus-Naur Form). EBF and BNF are meta languages, i.e., languages used for describing other languages. In Tables 1 and 2, the rules are stated in the form

    'left hand side' ::= 'right hand side'

where '::=' means "may be composed of" [8]. Items appearing on the left hand side of '::=' are called non-terminal symbols, and are nonterminal in the sense that they never appear explicitly in the actual text. Rather, the nonterminal represents any phrase that can be constructed from the rules of its right part. The right part may consist of terminal symbols, non-terminals, and other metanotation. Terminal symbols may appear in the source, and are usually mnemonic for the tokens they denote. In Tables 1 and 2, the terminal ID is an example of a class symbol. Class symbols denote all of the possible tokens that may be constructed from the corresponding set of lexical productions. In Table 2, note that the class symbols serve as labels for the appropriate productions.

Although BNF and EBF are capable of describing the same phrases, the trees generated by each notation are different. For example, the BNF rules

    <...> ::= <...>
    <...> ::= <...> | <...> COMMA <...>

describe a list of numbers. However, the second rule is right recursive, and therefore has a right-branching subtree. A hierarchical phrase structure is clearly undesirable for a list. Hence, the following EBF construction, which describes a 'bushy' subtree, is generally preferred.

    <...> ::= <...> (-n-COMMA)

Other EBF notation is described in Table 5.

TABLE 5
Examples of EBF Notation

    Notation             Phrases produced by notation
    (<...> | <...>)      <...> or <...>
    <...> <...>          concatenation of phrases <...> and <...>
    <...>**n             <...> n times
    <...>*               zero or more occurrences of <...>
    <...>+               one or more occurrences of <...>
    (<...>)?             zero or one occurrence of <...>

BIBLIOGRAPHIC DATA SHEET

1. Report No.: UIUCDCS-R-76-803
4. Title and Subtitle: A UNIVERSAL CROSS-ASSEMBLER
5. Report Date: May 1976
7. Author(s): Charles Patrick Kominczak
8. Performing Organization Rept. No.: UIUCDCS-R-76-803
9. Performing Organization Name and Address: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
12. Sponsoring Organization Name and Address: Department of Electrical Engineering, University of Illinois, Urbana, Illinois 61801
13. Type of Report & Period Covered: Master's Thesis
16. Abstracts: This report is not a tutorial on the art of writing assemblers. Rather, it is a discussion of those properties that all assemblers have in common, and of a technique by which new cross-assemblers may be implemented very quickly. This research was motivated by the current proliferation of microprocessors and the associated software "vacuum." The result is a single cross-assembler capable of assembling absolute object code for five different microprocessors.
17. Key Words and Document Analysis. 17a. Descriptors: Assemblers, Microprocessors
18. Availability Statement: Release Unlimited
19. Security Class (This Report): UNCLASSIFIED
20. Security Class (This Page): UNCLASSIFIED