Report No. UIUCDCS-R-75-790

CAPS Compiler CPU Use Report

by

Lawrence A. White

December 1975

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported in part by the National Science Foundation under Grant No. NSF 21590 A02.


Table of Contents

Section 1: Description of Compiler Environment
    1.1: Tutor Lesson Organization
    1.2: Tutor Data Areas
Section 2: CAPS Compiler Organization
    2.1: Compiler Modules
    2.2: Compiler Data Structures
Section 3: Timing Measurements
    3.1: Command Statistics
    3.2: Module Statistics
    3.3: Removing Trace Collection
Section 4: CPU Time Improvements
    4.1: Possible Improvements
    4.2: Recommended Design Changes
Appendix A: Parser Internal Code Counts
Appendix B: Another Compiler System


Abstract

This report describes an investigation into the internal workings of the driver for the CAPS compilers, and discusses the results of this investigation. The primary questions were where and how CPU time was being used in the compilers, and how their CPU consumption could be reduced to speed them up. The results, as described in this report, show the possibility of a minimal improvement, say 10%, by recoding critical sections of the compiler driver. Further improvements appear feasible only by sacrificing some features of the compiler or by reducing the generality of the compiler driver.

This report is the result of an investigation done as a CS 425 project for Professor Thomas R. Wilcox at the University of Illinois. The investigator and author of the report, Lawrence A. White, is a graduate student at the University. Additionally, he is an experienced system programmer on the Plato computer system on which the CAPS compilers run, has his own compiler system on Plato, and is quite qualified to conduct the investigation described in the following report.


Section 1: Description of Compiler Environment

The CAPS compilers are implemented on the Plato Computer System, a computer-based educational system developed at the University of Illinois. This system is designed to run up to 500 students simultaneously in many instructional areas. Most students are in areas other than Computer Science, and it is still to be determined whether Plato can be used effectively to teach a subject that may require high CPU activity, such as programming. This section of the report describes the portions of the Plato system relevant to the CAPS compilers.

1.1: Tutor Lesson Organization

Programs on Plato are organized into "lessons", each consisting of up to eight or ten thousand words of pure procedure, shared by any number of students at once. These lessons are "condensed", rather than compiled, into portions of interpretable and directly executable code. Data areas referenced by these lessons are described in section 1.2.

The only language supported by Plato for its users is Tutor, which has been developed along with the rest of the Plato system. To allow students to program in another language, a compiler and interpreter must be written in Tutor. The CAPS compilers are written in Tutor, and suffer the advantages and disadvantages thereof.

All information needed for each active student is kept in ECS (Control Data Corporation's Extended Core Storage) during a session. This includes each student's current lesson and all data areas referenced by that lesson. When the CPU is actually working on an individual student, portions of his lesson and data areas are swapped into CM (Central Memory) for processing, and stored back in ECS between timeslices.

1.2: Tutor Data Areas

The CAPS compilers may access three data areas in ECS for running students. The first area, called "student variables", is individual to each student, is 150 words long, and is loaded and unloaded from ECS to *n* variables in CM every timeslice. The second data area for each user, "storage", may be up to 1500 words long; portions of it and of "common", the third area, are loaded and unloaded into 1500 words of *nc* variables in CM as directed by the lesson. Storage is individual to each user of the lesson, but unlike student variables, storage is not saved on disk between sessions. Common is shared by all users of a lesson, may be up to 8050 words long, and may reside on disk while no user is referencing it.
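For reference, the following Python sketch (Python is used only for illustration; the CAPS system itself is written in Tutor) models the three data areas and the size, sharing, and persistence rules just described. The class layout and field names are assumptions made for the sketch, not part of Plato.

    from dataclasses import dataclass

    @dataclass
    class DataArea:
        """Illustrative model of one Tutor data area as described in section 1.2."""
        name: str
        max_words: int              # maximum length in words
        shared_by_all_users: bool   # common is shared; the other two areas are per-user
        saved_on_disk: bool         # whether the area survives outside the session

    # Sizes and properties as stated in the text; everything else is assumed.
    STUDENT_VARIABLES = DataArea("student variables", 150, False, True)
    STORAGE = DataArea("storage", 1500, False, False)
    COMMON = DataArea("common", 8050, True, True)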
Section 2: CAPS Compiler Organization

Each compiler in the CAPS system consists of interpretive tables specific to the language being compiled; common driving routines to interpret these tables; and a few routines, specific to the language, that are called from the interpreted tables. These tables are built from assembler-like source code written by a compiler implementor. After generation, the tables are stored in common, from which they are loaded into *nc* variables as needed while compiling student programs.

2.1: Compiler Modules

Flow of control in the CAPS compilers is shown in Figure 1. The editor looks at each keypress the student enters from the terminal. If the key indicates a text editing function, it is performed by the editor. If the student is entering new text, each keypress is passed on to the lexical analyzer. When the lexical analyzer receives a complete token, that token is passed on to the syntax analyzer and parser for compilation. Since each keypress is processed as it is entered, the compiler can give immediate error messages when the student enters an invalid language construct.

While compiling new text, "Trace" information is stored, allowing the reverse editor to uncompile the student's program as the student backs up to make a change. Occasionally the storage area for Trace information gets full. When this happens, a compression unit is called which removes alternate entries from the Trace table. After compression is performed, the reverse editor can only back up to alternate tokens; if necessary, it will back up to the previous token and then forward compile to the current token. In practice, the compression routine may be called three or four times for a student program. After four calls, there is Trace information for one out of every 16 of the first tokens entered. Closer to the "cursor" where the student is working, Trace information is available for every token, or at least for alternate tokens.
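The effect of repeated compression can be illustrated with a short Python sketch. This is not the CAPS Tutor code; the table layout and names are assumptions, but the sketch shows why, after four compressions, only one in sixteen of the earliest tokens still has a Trace entry.

    def compress_trace(trace):
        # Remove alternate entries from the Trace table (illustrative only;
        # in CAPS the table lives in the "storage" area and is managed by Tutor code).
        return trace[::2]

    # Hypothetical Trace table with one entry per token, oldest token first.
    trace = [("token", i) for i in range(64)]
    for _ in range(4):                        # four compressions, as observed in practice
        trace = compress_trace(trace)

    print([entry[1] for entry in trace])      # [0, 16, 32, 48]: one of every 16 early tokens

Tokens entered after a compression pass through fewer halvings, which is why coverage remains dense near the cursor.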
The lexical analyzer and the parser are both table driven. The table for the lexical analyzer is a state transition diagram interpreted by Tutor code. For example, if the analyzer is currently in the '<' node of the diagram for PL/1, a following '=' causes a transition to another node, while a following '<' or '>' is an error noted by the lexical analyzer. In BASIC, by contrast, '<>' is the "not equal" token.
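A minimal Python sketch of this kind of table-driven scanning follows. The transition entries shown are a small, assumed fragment for the '<' state only; they are not taken from the actual 400-word CAPS lexical table.

    # Hypothetical fragment of a state-transition table for the state reached after '<'.
    # Each entry maps the next keypress to either a finished token or an error,
    # mirroring the per-keypress scanning described above.
    PL1_AFTER_LT = {
        "=": ("accept", "<="),    # '<' followed by '=' forms a relational token
        "<": ("error", "invalid character after '<'"),
        ">": ("error", "invalid character after '<'"),
    }
    BASIC_AFTER_LT = {
        "=": ("accept", "<="),
        ">": ("accept", "<>"),    # in BASIC, '<>' is the "not equal" token
    }

    def step(table, keypress):
        """Apply one keypress in the '<' state; any other key ends the '<' token."""
        return table.get(keypress, ("accept", "<"))

    print(step(PL1_AFTER_LT, ">"))      # ('error', "invalid character after '<'")
    print(step(BASIC_AFTER_LT, ">"))    # ('accept', '<>')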
The tables in the parser are not just a state transition array, as in the lexical analyzer, but consist of internal codes interpreted by a Tutor unit in each compiler. This internal code is complete with arithmetic operations, conditional jumps, calls to error routines, and calls to the lexical analyzer to receive the next token. Thus, the student writes in PL/1, for example, and his program is compiled by code being interpreted by Tutor code, which is in turn being interpreted by Plato run-time routines.
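To make this double level of interpretation concrete, here is a heavily simplified Python sketch of a dispatch loop over such internal codes. The operation names are borrowed from the counts in Appendix A (scan, call, return, branch, semantic op); the table format and the handlers are assumptions for the sketch, not the actual CAPS parse table.

    def run(code, next_token, semantic_op):
        """Interpret a list of (opcode, argument) internal-code entries."""
        pc, stack, token = 0, [], None
        while pc < len(code):
            op, arg = code[pc]
            if op == "scan":            # call the lexical analyzer for the next token
                token = next_token()
            elif op == "branch":        # jump to arg[1] unless the current token is arg[0]
                if token != arg[0]:
                    pc = arg[1]
                    continue
            elif op == "call":          # enter a sub-table, e.g. for an expression
                stack.append(pc + 1)
                pc = arg
                continue
            elif op == "return":        # leave the sub-table
                if not stack:
                    return
                pc = stack.pop()
                continue
            elif op == "semantic":      # symbol-table or other semantic action
                semantic_op(arg, token)
            pc += 1

In the real system a loop of this kind is itself Tutor code, which the Plato run-time routines in turn interpret; that is the double interpretation described above.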
The module labeled "syntax analyzer" in Figure 1 is just an interface between the lexical analyzer and the parser. Its function is to keep track of Trace information and to ensure that the correct tables are loaded for each routine.

[Figure 1: CAPS Compiler Modules. The diagram shows keypresses flowing from the student terminal to the editor; from the editor to the lexical analyzer and to the reverse editor; from the lexical analyzer (which maintains the name table) to the syntax analyzer (which records Trace information and calls the compress module); and from the syntax analyzer to the parser, which maintains the symbol table and Trace information.]

2.2: Compiler Data Structures

The CAPS compilers use "common" for pointers and tables shared by all users of one compiler, and "storage" for all pointers and tables needed by an individual user. Few, if any, of the student variables are used by the compilers. Portions of common and storage may be loaded into the 1500 *nc* variables in CM; however, at most three areas of each may be loaded at once. As shown in Figure 2, by arranging the data areas in ECS carefully, it was possible to meet this three-area restriction and still get the tables into the desired locations in CM. However, the lexical and parse tables are each 400 words long, and only one of them can be loaded at a time. (As shown later, this is significant, since the compilers spend 8% of their time changing the loading arrangement.) Figure 2 shows the layout of these areas.

Figure 2: CAPS Data Areas (lengths in words; v = variable portion, c = constant portion)

    CM *nc* variables (1500): Lexical or Parse Tables 400, Parse Storage 53, v Symbol Table 109, c Symbol Table 210, c Name Table 110, v Name Table 64, v Char Table 168, c Char Table 119, Hash Table 20, Text 60, Trace 98, Variables 88.

    ECS Storage (644): Parse Storage 53, v Symbol Table 109, v Name Table 64, v Char Table 168, Hash Table 20, Text 60, Trace 98, Variables 88.

    Common (1288): Parse Table 400, Lexical Table 400, Pointers 22, c Symbol Table 210, c Name Table 110, c Char Table 119.

Section 3: Timing Measurements

The object of these timing tests was to determine where CPU time was used while entering text. Additional tests may later be run to determine CPU use when backing over or reparsing already entered text. Each of these tests involved entering the following sample PL/1 program a number of times and taking the average or most frequent timing result. Most of the timing tests were done in the middle of the day during the week, the period of highest Plato use, and the results were fairly consistent.

    P: PROC;
       DCL A(3,3) FLOAT, (I,J) FIXED;
       DO I=1 TO 3;
          DO J=1 TO 3;
             A(I,J)=I+J;
          END;
       END;
    END;

In May 1975, Al Davis used this program in some timing tests. His results showed CPU times ranging from 1.7 to 1.9 CPU seconds. The author received similar results when he started his tests, an average of 1.75 CPU seconds per compilation.

3.1: Command Statistics

An option is available to Plato system programmers to take command execution statistics for individual lessons. When the author turned on this option for the compiler, he received the following averages for a single compilation of the above program. The commands shown took more than 1% of the CPU time used or of the number of commands executed, and are broken down into various types.

    Command            Count   % of commands   Time   % of time   Ave Time

    display - 20% of total time
      at                 433       6.26          46      8.48       .106
      mode               318       4.60          24      1.82       .075
      showa              182       2.63          44      3.33       .242
      showt               89       1.29          31      2.35       .348
      write              196       2.83          48      3.63       .244

    calculations - 51% of total time
      calc              1789      25.85         609     46.14       .340
      calcs              333       4.81          72      5.45       .216

    program control - 20% of total time
      arg                450       6.50          36      2.73       .080
      do                1004      14.51         111      8.41       .110
      entry              172       2.49          15      1.14       .087
      goto               150       2.17          17      1.29       .113
      gotoc              277       4.00          17      1.29       .061
      joinc               85       1.23          12       .91       .141
      unit               592       8.55          55      4.17       .093

    other - 10% of total time
      pause               90       1.30          35      2.65       .389
      comload            140       2.02          75      5.68       .536
      stoload             53        .77          26      1.97       .491

Count is the number of times the command was executed; times are shown in milliseconds. The "arg" command is generated by the Plato condensor for picking up parameters passed to units.

The reader may note that the sum of the times shown is less than the 1.75 CPU seconds previously mentioned. In fact, command statistics increased the CPU time by 20% to 2.10 seconds, but the statistics do not take into account the formatting time of output to the terminal or the overhead of timeslicing, nor do they show all commands.

3.2: Module Statistics

As well as knowing what commands were using CPU time, the author wanted to know how much time was being spent in each module of the compiler. To learn this, he made a copy of the PL/1 compiler and modified it to record timing statistics on entry to and exit from each module. This additional code increased compile time by 34% to 2.34 CPU seconds, and produced the following data. Times shown here are in seconds.

    Module    CPU Time    Entry Count    % Time
    Editor       .07            1           3
    Lexi         .59          172          25
    Syna         .33           51          14
    Comp         .23            2          10
    Parse       1.12           51          48

The data shows 172 characters (including spaces and tabs) entered and scanned by the lexical analyzer. The lexical analyzer recognized 51 tokens and passed them on to the syntax analyzer and parser, with the syntax analyzer having to call the compression routine twice. Combining these module statistics with the command statistics shown in section 3.1, we see that 14% of the total time goes into the syna module, with 8 of the 14% being used just to change the CM loading information for the parser and then back for the lexical analyzer. Parsing consumes the largest portion of CPU time, namely 48% of it, and involves interpreting the internal code in the parse table.
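The per-module figures above were gathered by recording the time on entry to and exit from each module. The actual instrumentation was added as Tutor code inside a copy of the PL/1 compiler; the Python sketch below shows the same idea in illustrative form only, with assumed names and accumulator layout.

    import time
    from collections import defaultdict

    cpu_time = defaultdict(float)   # accumulated CPU seconds per module
    entries = defaultdict(int)      # entry count per module

    def timed(module):
        """Wrap a module's entry point to accumulate entry counts and CPU time."""
        def wrap(fn):
            def inner(*args, **kwargs):
                entries[module] += 1
                start = time.process_time()          # CPU time, not wall-clock time
                try:
                    return fn(*args, **kwargs)
                finally:
                    cpu_time[module] += time.process_time() - start
            return inner
        return wrap

    @timed("lexi")
    def lexical_analyzer(keypress):
        ...                                          # placeholder for the real module

As the report notes, instrumentation of this kind is not free: in the CAPS measurements it added 34% to the compile time, and that overhead must be kept in mind when reading the percentages.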
3.3: Removing Trace Collection

Most of the Trace information generated is due to building the symbol table and the necessity of restoring it to previous states if the user backs over DECLARE statements. If semantic checks, including symbol table generation, were removed from the compiler, some improvement in speed should be expected. To test this hypothesis, the author removed all storing of Trace information and then ran the same data collection routines again. The amount of CPU time used was reduced by 20%, with command statistics and module timing shown below. (Times again are in milliseconds for command statistics and in seconds for module statistics.)

    Command            Count   % of commands   Time   % of time   Ave Time

    display - 22% of total time
      at                 433       6.49          44      8.64       .102
      mode               318       4.77          13      1.08       .041
      showa              182       2.73          50      4.14       .275
      showt               89       1.33          33      2.73       .371
      write              196       2.94          71      5.87       .380

    calculations - 47% of total time
      calc              1710      25.63         509     42.10       .298
      calcs              333       4.99          61      5.05       .183

    program control - 19% of total time
      arg                429       6.43          42      3.47       .098
      do                 931      13.97         115      9.51       .131
      entry              172       2.58          13      1.08       .076
      goto               150       2.17           8       .66       .053
      gotoc              226       3.39          18      1.49       .080
      joinc               85       1.27           4       .33       .047
      unit               569       8.53          36      2.98       .063

    other - 13% of total time
      pause               88       1.32          29      2.40       .330
      comload            140       2.10          88      7.28       .629
      stoload             53        .79          40      3.31       .755

    Module    CPU Time    Entry Count    % Time    % of Original Time
    Editor       .07            1          3.5             3
    Lexi         .56          172         28              24
    Syna         .30           51         15              13
    Comp          --           --          --             --
    Parse       1.08           51         53.5            46
    Total       2.01                                      86

The difference between the 20% improvement mentioned above and the 14% improvement shown by these statistics results from the statistics overhead not being reduced significantly by removing the Trace collection.

Section 4: CPU Time Improvements

One of the goals of this investigation was to determine what coding changes could be made in the compilers, and how much of a timing improvement could be expected. Currently a student may receive up to 10 milliseconds of CPU time per real-time second. To enter this sample program under such conditions takes almost three minutes. To anyone who has used the system, it is apparent that an improvement is desired.

4.1: Possible Improvements

Four suggestions for improving CPU time have been made. The first involves minor recoding of critical sections of the compiler and would give a minor improvement in speed. Two such changes apparent to the author are to 1) stop displaying the "space left" indicator on the student's screen, a 2.5% improvement, and 2) stop doing unnecessary -stoload- commands, a 2% speedup. (The compiler was re-executing the same -stoload- command once for every token, which is unnecessary, at least for the PL/1 compiler.)

The second improvement suggestion involves recoding the lexical analyzer or parser in Tutor, rather than having Tutor code interpret these tables. This would give an unknown amount of speedup, estimated by the author at about 20% for the lexical analyzer and more for the parser. However, this suggestion was not received too enthusiastically by the compiler designer, since it reduced the generality of the driver program.

The third method proposed for speeding up the compilers is to collect a whole line of source text at a time before processing it, rather than a character at a time. This moves the collection process from the Tutor lesson to the Plato system, and allows the user to correct errors in his current line before giving it to the lexical analyzer and parser. This method has been implemented and tested, and it does reduce the CPU time used. However, it contradicts one of the basic goals of the CAPS system, namely that of responding immediately to an invalid character or token.

The fourth method suggested for improving compile-time speed was to move semantic (symbol table) checking to a later pass. This method appeared to be promising enough that further tests were performed to determine how much speedup could be expected; the results of those tests are shown in section 3.3.

The data in section 3.3 does not completely describe the improvement due to removing semantic checks, since some trace information would still have to be collected. Even so, the improvement is probably still more than the 20% shown in 3.3, since the parser continued to perform semantic checks while the data was collected.

4.2: Recommended Design Changes

The previous section described several independent improvements that have been suggested. This section describes a coherent set of changes, some suggested by the author, others by the CAPS designer, T. R. Wilcox. Here these suggestions are combined in a manner that will yield a clean compiler design with improved speed. They are not just restricted to the entry of new text, but cover the editor, compiler, and executor of student programs. The design is based on the timing statistics shown in the previous sections and the author's experience with Plato compilers.

1) Move semantic checking to a new pass immediately preceding execution. This new pass should be coded in Tutor and be specific to the language being compiled.

2) In this new pass, compile the user's program into a more easily interpretable form than the tokenized input (a hedged sketch of one possible internal form follows this list). This suggestion is dependent on the results of some experiments currently being performed, and is based on the author's experience with another compiler system that executes roughly four times faster. However, the resultant speed cannot match that of the other system, since the CAPS run-time routines must store more error detection and correction information.

3) Once semantic checking has been removed from the editor, no reparsing of source text needs to be done as the user moves the cursor either forward or backward over previously entered text. Thus, once an expression has been parsed, it never needs to be parsed again unless the user changes something in it. This could be applied to subexpressions, but probably only "expressions" and "statements" need to be recognized as units to give the desired speedup. The editor function keys might then become "intelligent" enough to recognize such commands as "back up to start of expression" or "move forward one statement", rather than just recognizing characters, words, and lines, as the editor currently does.

4) After 1, 2, and 3 above, this author knows of no way to improve the speed of the compiler other than recoding part or all of it in Tutor. A reasonable speed will probably not be obtained with the present Plato CPU limitations until the lexical analyzer, at least, is recoded in Tutor. Recoding the parser in Tutor would be easier after semantic checks are moved to a separate pass, but would still be a large job.
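As an illustration of the idea in suggestion 2 only: the report does not specify what the "more easily interpretable form" would be, so the Python sketch below uses postfix code as one plausible example. It converts the expression tokens of the assignment A(I,J)=I+J from the sample program into postfix, which a small stack loop can execute without re-scanning tokens; the token layout and operator table are assumptions.

    # Precedence table for a tiny expression subset (assumed, illustrative only).
    PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

    def to_postfix(tokens):
        """Shunting-yard conversion of a flat, parenthesis-free token list to postfix."""
        out, ops = [], []
        for t in tokens:
            if t in PRECEDENCE:
                while ops and PRECEDENCE[ops[-1]] >= PRECEDENCE[t]:
                    out.append(ops.pop())
                ops.append(t)
            else:                    # operand: identifier or constant
                out.append(t)
        return out + ops[::-1]

    print(to_postfix(["I", "+", "J"]))      # ['I', 'J', '+']

Whatever internal form is finally chosen, the intent of suggestion 2 is the same: execution interprets this cheaper code rather than the raw tokenized input.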
Suggestion 1 gives a speedup that is not nearly proportional to the amount of recoding involved. However, it allows suggestion 3, which gives an improvement greater than the amount of recoding needed to implement it. Once we reach suggestion 4, the amount of speedup has become directly proportional to the amount of recoding done.

In summary, this report describes the Plato system organization and the compiler design features relevant to the speed of the CAPS compilers. It shows where, and how, CPU time is being used, and it presents some ideas for speeding up the compilers.

As the investigator in this report, I am impressed with the design and implementation of the compiler driver. It appears to be well structured and clearly thought out, and a diagnostic compiler that runs in this environment on the order of real time is a significant addition to the field of Computer Science. In particular, this compiler appears to meet all of its design goals except speed, and this report shows ways of speeding it up. It has been shown possible to run non-diagnostic student compilers on Plato (see Appendix B), and this author believes that the CAPS compilers can be improved to run in real time, without loss of diagnostics for the student. However, this might require recoding everything in Tutor, as in the other compiler system.

Appendix A: Parser Internal Code Counts

While investigating the CPU use in the CAPS compiler, the author counted the number of each type of parser operation executed during compilation. Though this information was not used in the basic report, it is included here in case someone desires the data. It might also be used to make minor code changes in the parser interpreter to speed up the most frequent operations. These counts are derived from one entry of the sample PL/1 program given above.

    Code    Count    Operation
     0        51     Scan
     1         6     Allocate
     2         5     Deallocate
     3        30     Call
     4        30     Return
     5        18     Semantic Op
     6        20     Branch

    Branch condition and parse-stack operation codes:

    Code    Count    Operation
     0         6     = branch
     1         6     # branch
     2               > branch
     3               >= branch
     4               < branch
     5               <= branch
     6               notall branch
     7         7     notany branch
     8               all branch
     9         1     any branch
    10        29     ps(X) = Y
    11        23     = loc Y
    12               = ps(X) $mask$ Y
    13               = ps(X) $union$ Y
    14               = ps(X) + Y
    15               = ps(X) - Y

Appendix B: Another Compiler System

Some readers might be interested in comparing CPU use for the CAPS compilers with another compiler system on Plato. Such comparisons will undoubtedly be unfair to the CAPS system, since the goals of the two systems are very different. This other system has been developed by Axel T. Schreiner and the author, and currently contains compilers for subsets of BASIC and Fortran. The information here is incomplete, covering only Fortran compilation while ignoring editing and execution. The sample program, shown below, is a simple numerical integration program of about 18 statements. Compilation of this program from source to an easily executable internal code took about .46 CPU seconds. The command statistics for one compilation appear following the program.

          FUNCTION F(X)
          F=X*X+4.
          RETURN
          END
          PRINT, 'INTEGRATE FROM'
          READ, A
          PRINT, 'INTEGRATE TO'
          READ, B
          PRINT, 'STEP SIZE'
          READ, STEP
          J=

    [The remainder of the sample program and its command statistics are not legible in the available copy.]