 Report No. UIUCDCS-R-76-798 
 
 THE PRECOMPILER COMPONENT OF A DATA BASE 
 DICTIONARY SYSTEM 
 
 by 
 
 MICHAEL JASON HUGGINS 
 
 May 1976 
 
 Department of Computer Science 
 University of Illinois at Urbana-Champaign 
 Urbana, Illinois 61801 
 
 Submitted in partial fulfillment of the requirements for the degree of
 Master of Science in Computer Science in the Graduate College of the 
 University of Illinois at Urbana-Champaign. 
 
 
 ACKNOWLEDGMENT 
 
 The author wishes to thank Professor H. G. Friedman and 
 R. L. Mann for their advice and help in the preparation of this thesis.
 
 
 TABLE OF CONTENTS

 INTRODUCTION

 PART I. PRECOMPILER FUNCTIONAL DESCRIPTION

 Chapter
 I. FUNCTIONAL OVERVIEW
 II. PRECOMPILER STATEMENTS
 III. PL/I SOURCE OUTPUT
 IV. THE COMMUNICATION MODULE
 V. SUMMARY

 PART II. PRECOMPILER INTERNALS AND THE OPERATING ENVIRONMENT

 VI. LEXICAL ANALYSIS
 VII. SYNTACTIC AND SEMANTIC ANALYSIS
 VIII. INTERNAL STRUCTURES
 IX. PRECOMPILER OPTIONS
 X. PRECOMPILER OPERATING ENVIRONMENT
 XI. TESTING AND VERIFICATION

 APPENDIX

 REFERENCES
 
LIST OF TABLES

 Table

 1. FIXED LENGTH AREA IN COMMUNICATION MODULE
 2. DATA BASE ENTRY IN COMMUNICATION MODULE
 3. SEGMENT ENTRY IN COMMUNICATION MODULE
 4. FIELD ENTRY IN COMMUNICATION MODULE
 5. LEXICAL CLASS ASSIGNMENTS
 6. INTERNAL HEADER RECORD
 7. INTERNAL DATA BASE RECORD
 8. INTERNAL SEGMENT RECORD
 9. INTERNAL FIELD RECORD
 10. PRECOMPILER OPTIONS
 11. INPUT AND OUTPUT FILES
 12. SYSTEM NODE
 13. PROGRAM NODE
 14. DATA BASE NODE
 15. SEGMENT NODE
 16. FIELD NODE
 17. SYSTEM/PROGRAM EDGE
 18. PROGRAM/DATA BASE EDGE-FIRST PCB ENTRY
 19. ADDITIONAL PCB ENTRIES
 20. SENSEG EDGE ENTRY
 21. DATA BASE/SEGMENT EDGE
 22. SEGMENT/FIELD EDGE
 23. SAMPLE HOJ RECORD
 24. RANDOMIZING MODULE HOJ DATA
 25. EDIT/VERIFICATION HOJ DATA
 26. XDFIELD HOJ DATA
 27. DATA SET GROUP HOJ DATA
 28. LOGICAL CHILD HOJ DATA
 29. LAT RECORD
 
 LIST OF FIGURES

 Figure

 A. PCB declarative
 B. Data base declarative
 C. Segment declarative
 D. Field declarative
 E. Segment manipulation
 F. Nonsegment manipulation
 G. PCB, data base, and segment sample output
 H. Data manipulation sample output
 I. Lexical analysis
 J. Model syntax table
 K. Data manipulation syntax automaton
 L. Logical view of dictionary system
 M. Example of segment-field edge
 N. Logical view of HOJ table
 O. Logical view of LAT table
 
INTRODUCTION 
 
 With the advent of large, general purpose data base systems 
 [1], several desirable information processing theories have now been 
 implemented. These include advances in the areas of data independence, 
 data sharing, data security, and control. While facilities to take 
 advantage of these concepts have been implemented to varying degrees, 
 much of the control needed to administer their use is not inherent in 
 the data base software itself. To meet this need, the role of data 
 base administration has emerged [2]. While data base administration
 is finding its place in data processing structures, much work is being 
 done to provide it with the tools needed to manage and control the 
 data. The greatest need is in the area of data dictionaries. 
 
 A data dictionary is a collection of data about data [3]. 
 A complete description of a particular installation's data base
 environment would be contained within the data dictionary. The use of a
 dictionary provides a large measure of control and documentation which
 allows data sharing and security to be used and monitored. The real
 drawback is that the dictionary, while an excellent source of
 information and an aid in communicating with the data base software, does not
 actually control the access to the data. If the dictionary were the
 source of information controlling the actual interface between a program 
 and the data it wished to process, then a real level of data independence 
 and security could be provided, and many additional services could be 
 made available. To this end the Data Base Dictionary System described 
 in Appendix A was designed. 
 
This project is a sample implementation of one subsystem of
 the Data Base Dictionary System, the precompiler. IBM's Information
 Management System [4] (IMS) is used here as the target Data Base
 Management System (DBMS) because it is general purpose and is currently
 in use by many installations. The precompiler is an extension to PL/I
 and is used to generate IMS application programs. In addition to
 simplified programming, the goals of the precompiler are to implement
 the various features offered by the Data Base Dictionary System
 described in Appendix A. Until such time as data dictionaries, data
 base software and compilers are closely associated and more fully
 integrated, a precompiler can be most useful in bridging the gap
 between these separate systems and providing the needed support to the
 application programmer.
 
PART I 
 PRECOMPILER FUNCTIONAL DESCRIPTION 
 
CHAPTER I 
 FUNCTIONAL OVERVIEW 
 
 The function of this precompiler is to take as input an
 application program written in PL/I with the addition of certain
 precompiler statements and generate a complete PL/I source program along with
 an interface module for use by the execution monitor. In this role,
 the precompiler serves not only as a programming aid but also as the 
 first level of security and control in the Data Base Dictionary System 
 environment. 
 
 The precompiler statements fall into two categories,
 declaratives and data manipulation statements. The declaratives allow for
 the declaration of Program Control Blocks [5] (PCB), data bases, and
 logical segments for which the appropriate PL/I DECLARE statements are
 generated. In addition to these declaratives there is a set of
 precompiler statements used to communicate requests to the execution monitor.
 These data manipulation statements generate PL/I source code which
 includes a CALL to RNPTDLI, the execution monitor.
 
 The concept of the logical segment is essential to many of 
 the features that the dictionary system offers. In effect, it is the 
 logical segment approach that allows for field level independence not 
 inherent in IMS. The major concepts surrounding the logical segment 
 are as follows: 
 
 1. Any field may be included in the logical segment as
 long as it is contained in the real or source segment.
 That is, a logical segment is a subset of its source
 segment.

 2. A field may be requested in any scale, base or precision.
 Conversion will be accomplished by the execution monitor
 based on information from the dictionary and the
 communication module.
 
 3. Field position within the logical segment is totally 
 independent of its real location within the source segment. 
 Again the execution monitor performs the necessary mapping 
 at run-time. 
 
 4. A program may update a subset of a real segment without 
 affecting fields in the source segment that it is not 
 sensitive to. 
 
 The logical segment approach allows for run-time binding. The 
 execution monitor establishes the mappings and conversions necessary to 
 give the program the data it requests. With the precompiler and execution 
 monitor functioning together in this manner, true data independence is 
 achieved. As long as the data fields requested remain in the source 
 segment, all other rearrangements and format changes are transparent 
 to the program and do not require a recompile or relink edit. 
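
 The mapping idea can be illustrated with a minimal sketch. It is written
 in Python purely for illustration and is not the execution monitor's
 code; the field names, offsets, and lengths are hypothetical, and format
 conversion is omitted. It shows a logical segment whose field order
 differs from the source segment, and an update that leaves fields the
 program is not sensitive to unchanged.

```python
# Hypothetical layouts: field name -> (byte offset, byte length).
SOURCE_LAYOUT = {"NAME": (0, 10), "DEPT": (10, 4), "SALARY": (14, 6)}
LOGICAL_LAYOUT = {"SALARY": (0, 6), "NAME": (6, 10)}


def check_subset() -> None:
    """A logical segment may only name fields of its source segment."""
    missing = set(LOGICAL_LAYOUT) - set(SOURCE_LAYOUT)
    if missing:
        raise ValueError(f"fields not in source segment: {sorted(missing)}")


def to_logical(source: bytes) -> bytes:
    """Build the logical segment handed to the program on a get call."""
    check_subset()
    out = bytearray(sum(length for _, length in LOGICAL_LAYOUT.values()))
    for name, (dst, length) in LOGICAL_LAYOUT.items():
        src, _ = SOURCE_LAYOUT[name]
        out[dst:dst + length] = source[src:src + length]
    return bytes(out)


def apply_update(source: bytes, logical: bytes) -> bytes:
    """Write logical fields back, leaving all other source bytes untouched."""
    check_subset()
    out = bytearray(source)
    for name, (src, length) in LOGICAL_LAYOUT.items():
        dst, _ = SOURCE_LAYOUT[name]
        out[dst:dst + length] = logical[src:src + length]
    return bytes(out)
```

 Because both layouts are consulted at run time, moving DEPT or SALARY
 within the source segment changes only SOURCE_LAYOUT in this sketch, which
 mirrors the claim that such rearrangements need no recompile.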
 
 The precompiler checks security at several levels. Because
 this precompilation is the first security check and therefore a
 requirement, a method has been devised to ensure that a program executing
 in the Data Base Dictionary System environment has been processed by the 
 precompiler. As the program is processed, the following is ensured: 
 
1. The program is described to the dictionary system and 
 is written in PL/I.
 
 2. The data bases requested are in the system indicated. 
 If the system is password protected, then the password 
 is given by the program. 
 
 3. The program is allowed to access each data base it 
 requests. 
 
 4. The program is sensitive to the segments it requests and 
 update access is allowed if attempted. 
 
 5. The program is allowed the type of access requested to 
 each field within the logical segments defined. 
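
 The first three checks above might be sketched as follows. This is an
 illustration in Python under stated assumptions, not the precompiler's
 actual logic: the DICTIONARY structure, its keys, and the function name
 check_data_base are all hypothetical stand-ins for the dictionary records
 described in the appendix.

```python
# Hypothetical dictionary content; the real node and edge records are
# described in the appendix and are not reproduced here.
DICTIONARY = {
    "programs": {"SAMPDATA"},
    "systems": {
        "SAMPDATA": {"password": "TSTPSWD", "data_bases": {"SAMPDATA"}},
    },
    # (program, data base) pairs the program is allowed to access.
    "program_data_bases": {("SAMPDATA", "SAMPDATA")},
}


def check_data_base(program, data_base, system, password=None):
    """Return a list of error messages; an empty list means all checks pass."""
    errors = []
    if program not in DICTIONARY["programs"]:
        errors.append("program not known to the dictionary")
    entry = DICTIONARY["systems"].get(system)
    if entry is None:
        errors.append("system not known to the dictionary")
    else:
        if entry["password"] is not None and entry["password"] != password:
            errors.append("password does not match")
        if data_base not in entry["data_bases"]:
            errors.append("data base not in the system indicated")
    if (program, data_base) not in DICTIONARY["program_data_bases"]:
        errors.append("program not allowed to access this data base")
    return errors
```

 Collecting every failure rather than stopping at the first is a design
 choice of the sketch only; the source does not say how many errors the
 precompiler reports per statement.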
 
 In the area of simplified programming, several precompiler 
 features make the task of creating a complete application program easier. 
 The PCB mask, a moderately large structure, is generated for each PCB 
 declared in the program. On the precompiler statements themselves, 
 several options are inferred if not explicitly stated. If the program 
 wishes to process a segment exactly as it is in the data base, then 
 the precompiler will generate the appropriate structure to map the 
 requested segment without program concern for the declaration of all 
 the associated fields. The data manipulation specifications expand into 
 the necessary source statements including the CALL to interface with 
 the execution monitor and IMS. 
 
 In addition to these "shorthand" techniques, a programmer
 using the dictionary system need not worry as much about data editing,
 segment characteristics, and data conversion. This means that he can
 concentrate on the function to be performed. While allowing the
 programmer to accomplish his task more efficiently, the precompiler as a
 part of the dictionary system adds a real measure of data independence
 and control to a processing environment.
 
 
 CHAPTER II 
 PRECOMPILER STATEMENTS 
 
 Precompiler statement syntax is similar to PL/I in that it
 is keyword oriented. Data associated with a keyword is enclosed in
 parentheses following that word. A set of related keywords is
 expressed as a precompiler statement. The semicolon is used as the
 statement terminator. The period immediately followed by either
 "DECLARE" or one of the IMS function abbreviations [5] signals the
 beginning of a precompiler statement. Within the context of a statement,
 the keywords are treated as reserved words and therefore cannot be used 
 as user symbols. These reserved words are: ASIS, BASED, BIN, CHAR, 
 DATABASE, DEC, FIELDS, FIXED, FLOAT, KEYFDBKLEN, NAME, PCB, PROCOPT, 
 SEGMENT, SOURCE, SSA, SYSTEM, and WITH. 
 
 Syntax conventions are again much like those of PL/I. The
 precompiler is blank transparent; that is, any number of consecutive
 blanks is treated only as a token separator. Except as a token
 separator, card boundaries and comments are also transparent. Quoted
 strings are treated as one token regardless of their content.
 
 The precompiler scans the input program looking for one of
 its statements. When one is found, it is processed token by token
 until the semicolon is found. If an error is detected, the remainder
 of the statement in error is skipped. When the precompiler has
 finished parsing one of its statements, scanning continues until another
 is found or end of file is reached. Each precompiler statement must
 begin a PL/I statement, or in the case of data manipulation requests,
 be the only entry in a THEN or ELSE clause of an IF statement.
 
 There are three types of declarative statements and twelve 
 data manipulation statements. Each declarative must begin with the 
 token ".DECLARE". Data manipulation statements also begin with a period 
 immediately preceding one of the following IMS function abbreviations 
 [5]: GU, GN, GNP, GHU, GHN, GHNP, ISRT, DLET, REPL, SNAP, CHKP, LOG. 
 
 A detailed definition of the syntax of each precompiler 
 statement and the semantic action taken in each case is shown in Figures 
 A through F. In all cases the keywords may appear in any order but only 
 once per statement. The notation conventions used in these figures to 
 describe the syntax are as follows: 
 
 1. Nonterminals are enclosed in angle brackets and explained below
 each use.
 
 2. Items enclosed in plain brackets are optional. 
 
 3. Items enclosed in brackets followed by a superscript "+" 
 are optional and may be repeated any number of times. 
 
 4. Parentheses are terminals and must be included where 
 indicated. 
 
 5. The bar separates a list from which one and only one item 
 must be chosen. 
 
 6. User variables, passwords, and SSA names follow standard 
 PL/I conventions for symbol formation. 
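
 The rule that keywords may appear in any order but only once per
 statement can be sketched as below. This is an illustrative Python
 fragment, not the precompiler's parser: it handles only keywords that
 carry a parenthesized operand, and the function name
 parse_keyword_operands is hypothetical. The keyword set is the reserved
 word list given earlier in this chapter.

```python
import re

# Reserved words as listed in the text.
KEYWORDS = {
    "ASIS", "BASED", "BIN", "CHAR", "DATABASE", "DEC", "FIELDS", "FIXED",
    "FLOAT", "KEYFDBKLEN", "NAME", "PCB", "PROCOPT", "SEGMENT", "SOURCE",
    "SSA", "SYSTEM", "WITH",
}

# One KEYWORD (operand) pair; blanks around the parts are transparent.
_PAIR = re.compile(r"\s*([A-Z]+)\s*\(([^()]*)\)")


def parse_keyword_operands(text: str) -> dict[str, str]:
    """Parse 'KEYWORD(operand) ...;' pairs in any order, rejecting repeats."""
    text = text.rstrip()
    if text.endswith(";"):
        text = text[:-1]
    result: dict[str, str] = {}
    pos = 0
    while text[pos:].strip():
        match = _PAIR.match(text, pos)
        if match is None:
            raise ValueError(f"invalid syntax at column {pos}")
        keyword, operand = match.group(1), match.group(2).strip()
        if keyword not in KEYWORDS:
            raise ValueError(f"unknown keyword {keyword}")
        if keyword in result:
            raise ValueError(f"keyword {keyword} appears more than once")
        result[keyword] = operand
        pos = match.end()
    return result
```

 For example, the operand lists "NAME(PCBONE) BASED(PTR1) KEYFDBKLEN(6);"
 and "KEYFDBKLEN(6) NAME (PCBONE) BASED (PTR1);" yield the same dictionary.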
 
 
 SYNTAX: 
 
 .DECLARE PCB NAME (<id>) BASED (<id>) KEYFDBKLEN (<num>) ; 
 
 where 
 
 <id> is a user variable 
 
 <num> is an unsigned decimal integer 
 
 SEMANTIC ACTION: 
 
 1. establish this as the current PCB 
 
 2. allocate an internal PCB entry and save the pertinent 
 information 
 
 3. output the PL/I structure to map this PCB 
 
 ERROR CONDITIONS: 
 
 1. invalid syntax
 
 2. PCB already known 
 
 Fig. A.--PCB declarative 
 
 
 SYNTAX: 
 
 .DECLARE DATABASE NAME (<id>) SYSTEM (<id>[,<pswd>]) 
 [PCB (<id>)]; 
 
 where 
 
 <id> is a user variable 
 
 <pswd> is the password associated with the system 
 
 SEMANTIC ACTION: 
 
 1. associate the data base with the indicated PCB, or 
 if the PCB is not specified, with the current PCB 
 
 2. verify data with dictionary 
 
 3. if PCB is specified, make it the current PCB 
 
 ERROR CONDITIONS: 
 
 1. invalid syntax 
 
 2. data base not known to the dictionary 
 
 3. program not allowed to access this data base 
 
 4. system not known to the dictionary, or if passworded, 
 the password given does not match 
 
 5. PCB not known, or if no PCB specified, no current PCB 
 
 6. PCB already associated with a data base 
 
 Fig. B.--Data base declarative 
 
 
 SYNTAX: 
 
 .DECLARE SEGMENT NAME (<id>) [ASIS] [PCB (<id>)]
 [SOURCE (<id>)] [PROCOPT (<procopt>)]
 [[WITH] FIELDS <field-declarative>[,<field-declarative>]+];
 
 where 
 
 <id> is a user variable 
 
 <procopt> is a valid IMS processing option [5] 
 
 <field-declarative> is defined in Figure D 
 
 note 
 
 all keywords must precede the field declaratives, if any 
 
 SEMANTIC ACTION: 
 
 1. allocate an internal segment entry and save the pertinent 
 information 
 
 2. identify the source segment, either explicitly or implicitly 
 
 3. if PCB is specified, establish it as the current PCB 
 
 4. if ASIS is specified, generate the field entries for this 
 segment as it is defined to the dictionary 
 
 5. check security, i.e. the program's access to this segment 
 
 6. output the PL/I structure to map this segment 
 
 ERROR CONDITIONS: 
 
 1. invalid syntax
 
 2. source segment not known to the dictionary 
 
 3. logical segment already declared 
 
 4. invalid processing option 
 
 5. PCB not known, or if no PCB is specified, no current PCB 
 
 6. source segment not in data base 
 
 7. program not allowed access to this segment 
 
 8. in the ASIS case, the program is not allowed to access 
 one or more fields in the source segment 
 
 Fig. C.--Segment declarative
 
 
 SYNTAX: 
 
 <id> CHAR | DEC | BIN | ZONED (<len>[,<num>]) [FIXED | FLOAT]
 
 where 
 
 <id> is a user type symbol which is a field name 
 <len> an unsigned integer representing the total field length 
 <num> an unsigned integer representing the number of decimal 
 places 
 
 SEMANTIC ACTION: 
 
 1. allocate an internal field entry and save the pertinent 
 information 
 
 2. check security 
 
 3. include this field in the structure mapping the current segment 
 
 4. keep track of maximum segment size 
 
 ERROR CONDITIONS: 
 
 1. invalid syntax
 
 2. invalid scale, base or precision combination 
 
 3. field not known to the dictionary 
 
 4. program not allowed the requested level of access to the field 
 
 5. field is not in the source segment being defined 
 
 Fig. D.--Field declarative
 
 
 SYNTAX: 
 
 .<func> <seg> [SSA (<id>[,<id>])];
 
 where 
 
 <func> is one of GU, GN, GNP, GHU, GHN, GHNP, ISRT, DLET, or REPL
 <seg> is a user type symbol which is a segment name 
 <id> is a user variable 
 
 SEMANTIC ACTION: 
 
 1. output CALL and preliminary set up statements 
 
 2. keep track of the maximum number of SSAs in any one call 
 
 ERROR CONDITIONS: 
 
 1. invalid syntax
 
 2. segment not known 
 
 3. invalid use of segment 
 
 Fig. E.--Segment manipulation
 
 SYNTAX: 
 
 .<func> <area> PCB (<id>) ; 
 
 where 
 
 <func> is either SNAP, CHKP or LOG 
 
 <area> is a user variable 
 
 <id> is a user variable 
 
 SEMANTIC ACTION: 
 
 1. output CALL and preliminary set up statements 
 
 ERROR CONDITIONS: 
 
 1. invalid syntax
 
 2. PCB is not known 
 
 Fig. F.--Nonsegment manipulation 
 
 
 CHAPTER III 
 PL/I SOURCE OUTPUT 
 
 The source code produced is a complete PL/I program ready to 
 be compiled. All the precompiler statements are included in it as 
 comments, followed by the appropriate generated code. The expanded
 code requires at least forty-five characters per line. If the margin
 length as defined by the compiler option MARGINS is less than forty-five
 characters, then precompilation is abandoned.
 
 The ".DECLARE PCB" statement is expanded into the PCB mask 
 necessary to map the control blocks passed to each program by IMS. A 
 detailed description of each element in the structure can be found in 
 [5]. Figure G shows the precompiler statement converted to a comment 
 followed by the expanded PCB mask. The ".DECLARE DATABASE" statement 
 does not result in any PL/I code but is included as a comment as also 
 illustrated in Figure G. 
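
 As a rough illustration of the PCB expansion, the following Python sketch
 builds the text of a PCB mask from the three operands of ".DECLARE PCB".
 It is not the precompiler's generator. The element names and attributes
 are taken from a reading of the sample output in Figures G and H; the
 function name pcb_mask and the indentation are assumptions.

```python
def pcb_mask(name: str, based: str, keyfdbklen: int) -> str:
    """Return PL/I text declaring the pointer and the based PCB structure."""
    elements = [
        "DBD_NAME CHAR(8)",
        "SEG_LEVEL CHAR(2)",
        "STATUS_CODE CHAR(2)",
        "PROC_OPTIONS CHAR(4)",
        "RESDLI FIXED BIN(31)",
        "SEG_NAME CHAR(8)",
        "LEN_KFDBK FIXED BIN(31)",
        "NUM_SENSEGS FIXED BIN(31)",
        # Only this element varies: its length is the KEYFDBKLEN operand.
        f"KEY_FDBK_AREA CHAR({keyfdbklen})",
    ]
    lines = [f"DCL {based} POINTER;", f"DCL 1 {name} BASED({based}),"]
    for index, element in enumerate(elements):
        terminator = ";" if index == len(elements) - 1 else ","
        lines.append(f"      5 {element}{terminator}")
    return "\n".join(lines)
```

 Calling pcb_mask("PCBONE", "PTR1", 6) produces a structure of the same
 shape as the mask in the sample output, ending in KEY_FDBK_AREA CHAR(6).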
 
 The ".DECLARE SEGMENT" precompiler statement is expanded into
 an unaligned structure that maps the logical segment as defined in the
 program or the real segment if the ASIS option is taken. Figure G
 shows how the precompiler statement is turned into a comment and
 followed by the appropriate structure for use as an I/O area.

 The data manipulation precompiler statements are treated much
 the same as the declaratives. They are included as a comment within
 the generated program, followed by the necessary PL/I source code to
 interface with the execution monitor. If the precompiler request is
 coded as the only entry in an IF-THEN or IF-ELSE clause, then a DO
 group is created containing the generated code. This maintains the
 intended program structure.

 SAMPDATA: PROCEDURE OPTIONS(MAIN);
    DECLARE REGULAR_DATA CHAR(9) INIT('.DECLARE*'),
            SOMEBITS BIT(5);
 /**********************************************************************
    .DECLARE PCB NAME(PCBONE) BASED(PTR1) KEYFDBKLEN(6);
 **********************************************************************/
    DCL PTR1 POINTER;
    DCL 1 PCBONE BASED(PTR1),
          5 DBD_NAME CHAR(8),
          5 SEG_LEVEL CHAR(2),
          5 STATUS_CODE CHAR(2),
          5 PROC_OPTIONS CHAR(4),
          5 RESDLI FIXED BIN(31),
          5 SEG_NAME CHAR(8),
          5 LEN_KFDBK FIXED BIN(31),
          5 NUM_SENSEGS FIXED BIN(31),
          5 KEY_FDBK_AREA CHAR(6);

 /**********************************************************************
    .DECLARE DATABASE NAME(SAMPDATA) SYSTEM(SAMPDATA,TSTPSWD);
 **********************************************************************/

 /**********************************************************************
    .DECLARE SEGMENT NAME(LOGSEG1) SOURCE(SAMPSEG) PCB(PCBONE)
             PROCOPT(AP) WITH FIELDS
             SAMPFLD1 CHAR(5),
             SAMPFLD2 ZONED(8,2),
             SAMPFLD3 FIXED DEC(10,3),
             SAMPFLD4 FIXED BIN(31),
             SAMPFLD5 FLOAT DEC(6);
 **********************************************************************/
    DCL 1 LOGSEG1 UNALIGNED,
          5 SAMPFLD1 CHAR(5),
          5 SAMPFLD2 PIC '999999V9T',
          5 SAMPFLD3 FIXED DEC(10,3),
          5 SAMPFLD4 FIXED BIN(31),
          5 SAMPFLD5 FLOAT DEC(6);
    DCL FUNCTION CHAR(4),
        PARMCOUNT FIXED BIN(31);

    SOMEBITS = '01010'B;
 END SAMPDATA;

 Fig. G.--PCB, data base, and segment sample output
 
 Figure H shows how the data manipulation statements are
 handled. The code produced sets a variable to the proper number of
 parameters in the CALL to follow and finally invokes the execution
 monitor with the proper parameters. The two variables FUNCTION and PARMCOUNT
 are declared and maintained by the precompiler and therefore do not
 require programmer concern. Standard IMS Segment Search Arguments (SSA)
 [5] are used and when included in a precompiler request statement, are
 passed to IMS by the execution monitor.
 
 
 SAMPDATA: PROCEDURE OPTIONS(MAIN);
    DECLARE SSA1 CHAR(8) INIT('SGEFPAMS'),
            SSA2 CHAR(8) INIT('SGEFPANM'),
            LOGRECORD CHAR(12);
 /**********************************************************************
    .DECLARE PCB NAME(PCBONE) BASED(PTR1) KEYFDBKLEN(8);
 **********************************************************************/
    DCL PTR1 POINTER;
    DCL 1 PCBONE BASED(PTR1),
          5 DBD_NAME CHAR(8),
          5 SEG_LEVEL CHAR(2),
          5 STATUS_CODE CHAR(2),
          5 PROC_OPTIONS CHAR(4),
          5 RESDLI FIXED BIN(31),
          5 SEG_NAME CHAR(8),
          5 LEN_KFDBK FIXED BIN(31),
          5 NUM_SENSEGS FIXED BIN(31),
          5 KEY_FDBK_AREA CHAR(8);

 /**********************************************************************
    .DECLARE DATABASE NAME(SAMPDATA) SYSTEM(SAMPDATA,TSTPSWD);
 **********************************************************************/

 /**********************************************************************
    .DECLARE SEGMENT NAME(LOGSEG1) SOURCE(SAMPSEG) PCB(PCBONE)
             PROCOPT(A) WITH FIELDS
             SAMPFLD FIXED DEC(6,3);
 **********************************************************************/
    DCL 1 LOGSEG1 UNALIGNED,
          5 SAMPFLD FIXED DEC(6,3);
    DCL FUNCTION CHAR(4),
        PARMCOUNT FIXED BIN(31);

    /**** PROCESSING FOLLOWS ****/
    LOGRECORD = 'SAMPLE LOG';
    /**** SEGMENT MANIPULATION STATEMENT FOLLOWS ****/
 /**********************************************************************
    .GU LOGSEG1 SSA(SSA1,SSA2);
 **********************************************************************/
    FUNCTION = 'GU  '; PARMCOUNT = 3+2;
    CALL RNPTDLI (PARMCOUNT,
                  FUNCTION,
                  PCBONE,
                  LOGSEG1,
                  SSA1,
                  SSA2);

    /**** NON-SEGMENT MANIPULATION STATEMENT FOLLOWS ****/
 /**********************************************************************
    .LOG LOGRECORD PCB(PCBONE);
 **********************************************************************/
    FUNCTION = 'LOG '; PARMCOUNT = 3;
    CALL RNPTDLI (PARMCOUNT,
                  FUNCTION,
                  PCBONE,
                  LOGRECORD);

 END SAMPDATA;
 
 Fig. H.--Data manipulation sample output 
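
 The expansion pattern visible in Figure H can be summarized in a short
 Python sketch. It is an illustration only, not the precompiler's code
 generator: the function name expand_request and its parameters are
 hypothetical, the CALL is emitted on one line rather than one argument
 per line, and the DO group wrapping follows the description given for
 requests coded inside an IF clause.

```python
def expand_request(func, pcb, area, ssas=(), in_if_clause=False):
    """Return the PL/I lines generated for one data manipulation request."""
    # Three fixed parameters (function, PCB, I/O area) plus one per SSA.
    count = "3" if not ssas else f"3+{len(ssas)}"
    args = ", ".join(["PARMCOUNT", "FUNCTION", pcb, area, *ssas])
    body = [
        # FUNCTION is CHAR(4), so the abbreviation is padded with blanks.
        f"FUNCTION = '{func:<4}'; PARMCOUNT = {count};",
        f"CALL RNPTDLI ({args});",
    ]
    if in_if_clause:
        # Two statements replace one, so a DO group keeps them under the IF.
        body = ["DO;"] + ["   " + line for line in body] + ["END;"]
    return body
```

 For the ".GU LOGSEG1 SSA(SSA1,SSA2);" request of Figure H, the sketch
 yields the assignment with PARMCOUNT = 3+2 followed by a CALL naming
 PCBONE, LOGSEG1, SSA1, and SSA2.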
 
 
 CHAPTER IV 
 THE COMMUNICATION MODULE 
 
 After the input program has been processed, a communication
 section in the form of an object module is produced. This module
 contains both executable code and a tabular description of each data
 base, each logical segment, and each field declared within the program.
 The application program object module produced by the regular PL/I
 compiler is linked with this module to become the complete executable
 application program. At run time, the execution monitor will load the
 application program and modify some code within the communication module
 to allow it to be the entry point to which IMS will transfer control.
 In addition, some address references will be linked such that the
 application program can communicate with the execution monitor. With
 the description of the program's data requirements contained within the
 communication module, and the actual segment descriptions from the
 dictionary, the execution monitor is able to determine the necessary
 mappings and data conversions.
 
 The tabular section of the communication module is composed of
 three subsections: the data base section, the segment section, and the
 field section. The data base section contains one entry for each data
 base declared within the program. Similarly, the segment and field
 sections contain one entry for each segment or field respectively.
 These three subsections are preceded by a fixed length area containing
 the executable code and some control information. The layout of each
 subsection is described in the following tables.
 
 TABLE 1
 FIXED LENGTH AREA IN COMMUNICATION MODULE

 Decimal        Field
 Displacement   Size    Data Format   Content

   0            108     Code          Executable code which contains the
                                      program entry point (RNPENTRY) and the
                                      interface point (RNPTDLI) back to the
                                      execution monitor
 108              2     Binary        The maximum number of SSA's used in
                                      any CALL to RNPTDLI within the program
 110              2     Binary        The CSECT size less the 114 bytes in
                                      the fixed length area, but at least as
                                      large as the largest segment
 112              2     Binary        The number of data bases declared in
                                      the program
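
 The layout of Table 1 can be expressed compactly with Python's struct
 module. The sketch below is illustrative and not part of the thesis
 software. It assumes big-endian byte order and treats the three 2-byte
 binary fields as unsigned, neither of which is stated in the table; the
 function and key names are hypothetical.

```python
import struct

# 108 bytes of code followed by three halfwords: 114 bytes in all.
# Assumption: big-endian, unsigned halfwords.
FIXED_AREA = struct.Struct(">108sHHH")


def parse_fixed_area(module: bytes) -> dict:
    """Decode the fixed length area at the start of a communication module."""
    code, max_ssas, table_size, data_bases = FIXED_AREA.unpack_from(module, 0)
    return {
        "code": code,                    # displacement 0, 108 bytes
        "max_ssas": max_ssas,            # displacement 108
        "table_size": table_size,        # displacement 110
        "data_base_count": data_bases,   # displacement 112
    }
```

 The computed structure size of 114 bytes agrees with the figure quoted in
 the table for the fixed length area.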
 
 TABLE 2
 DATA BASE ENTRY IN COMMUNICATION MODULE

 Decimal        Field
 Displacement   Size    Data Format   Content

   0              8     Character     The data base name
   8              2     Binary        The number of segments in the data
                                      base
  10                    Binary        The offset to the first segment entry
                                      for the segments in this data base
                                      relative to the beginning of this
                                      entry
 
 TABLE 3
 SEGMENT ENTRY IN COMMUNICATION MODULE

 Decimal        Field
 Displacement   Size    Data Format   Content

   0              8     Character     The logical segment name used in the
                                      program for this segment
   8              8     Character     The real segment which is the source
                                      for this logical segment
  16              2     Binary        The number of fields in this segment
  18              2     Binary        The offset to the first field entry
                                      for the fields in this segment
                                      relative to the beginning of this
                                      entry
 
 TABLE 4
 FIELD ENTRY IN COMMUNICATION MODULE

 Decimal        Field
 Displacement   Size    Data Format   Content

   0              8     Character     Field name
   8              2     Binary        The field length in bytes minus one
  10              1     Bit string    An indication as to the data type of
                                      this field. Possible codes:
                                        Bit 0=1      FLOAT
                                        Bit 1=1      FIXED
                                        Bit 2=1      CHAR
                                        Bit 3=1      PACKED
                                        Bit 4=1      ZONED
                                        Bit 5=1      /SX
                                        Bit 6=1      /CK
                                        Bit 7=1      XDFIELD
                                        Bits 0-7=1   HEX
  11              1     Binary        Field scale factor
  12              2     Binary        Field position as an offset into
                                      logical segment
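
 A field entry as laid out in Table 4 could be built and decoded along the
 following lines. This Python sketch is illustrative only. Several points
 are assumptions not stated in the table: big-endian byte order, bit 0
 being the leftmost (most significant) bit of the type byte, a signed
 scale factor, EBCDIC (code page 037) blank-padded names, and the use of
 FIXED together with PACKED for a FIXED DEC field.

```python
import struct

# name(8) length-1(2) type(1) scale(1) position(2) = 14 bytes.
FIELD_ENTRY = struct.Struct(">8sHBbH")

# Assumption: bit 0 is the most significant bit of the type byte.
TYPE_BITS = {
    "FLOAT": 0x80, "FIXED": 0x40, "CHAR": 0x20, "PACKED": 0x10,
    "ZONED": 0x08, "/SX": 0x04, "/CK": 0x02, "XDFIELD": 0x01,
    "HEX": 0xFF,  # all eight bits set
}


def pack_field_entry(name, length, types, scale, position) -> bytes:
    """Build one 14-byte field entry; the stored length is length minus one."""
    type_byte = 0
    for type_name in types:
        type_byte |= TYPE_BITS[type_name]
    raw_name = name.ljust(8).encode("cp037")
    return FIELD_ENTRY.pack(raw_name, length - 1, type_byte, scale, position)


def unpack_field_entry(entry: bytes) -> dict:
    """Decode a field entry back into name, length, type byte, scale, offset."""
    raw_name, length_m1, type_byte, scale, position = FIELD_ENTRY.unpack(entry)
    return {
        "name": raw_name.decode("cp037").rstrip(),
        "length": length_m1 + 1,
        "type_byte": type_byte,
        "scale": scale,
        "position": position,
    }
```

 Storing the length minus one lets a 2-byte field describe lengths from 1
 upward without wasting the zero value, which is a common convention on
 this class of machine.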
 
 
 
 CHAPTER V 
 SUMMARY 
 
 The precompiler subsystem of the Data Base Dictionary System
 can be briefly summarized as follows. As an extension to PL/I it allows
 several features of the Data Base Dictionary System to be implemented.
 Since programs are processed before they are actually compiled and the
 dictionary is available at this precompile time, security and access
 control are enforced and programming simplification is provided for.
 Using a set of precompiler statements, a program declares its intentions
 regarding what data is to be processed and how. Authority
 and continuity are checked. Requests for data are coded using another
 set of precompiler statements.
 
 The concept of the logical segment is perhaps the most important 
 technique employed. A logical segment is a subset of a real segment 
 of data as defined to IMS. Any of the data elements within the source 
 segment may be requested in any order and in any format. Conversions 
 and mappings will be done by the execution monitor. Chapter I examines 
 the advantages of this approach. 
 
 Physically then, the precompiler reads in a program which
 includes the special statements, processes the program accessing the
 dictionary as needed, and optionally produces a listing of the input, a
 listing of the expanded program produced, the expanded program for
 input to the PL/I compiler, the expanded program for punching, a
 communication module for interface with the execution monitor, and
 always a burst page and statistics. Chapter X gives a more detailed
 description of the processing options available. Although this
 precompilation process requires an extra step in the translation from source
 program to executable code, the benefits gained are worth the additional
 overhead.
 
 
 PART II 
 PRECOMPILER INTERNALS AND THE OPERATING ENVIRONMENT 
 
 
 CHAPTER VI 
 
 LEXICAL ANALYSIS 
 
 Lexical analysis may be defined as the scanning of the
 characters in a source program from left to right isolating tokens or symbols.
 A scanner is used in this precompiler to perform lexical analysis as
 well as to determine the type of token isolated. When the syntactic
 and semantic routines invoke the scanner, the next token is found and
 its type made available. To perform the analysis, each character is
 translated into a lexical class as defined in Table 5.
 
 TABLE 5
 LEXICAL CLASS ASSIGNMENTS

 Class           Members

 Blanks          Blanks
 Letters         A thru Z and #, $, @
 Underscore      _
 Quote           '
 Digits          0 thru 9
 Delimiters      . + - = % & ; ( ) , : < > | ¬
 Double          ".DECLARE" or a .<func> as previously defined
 Slash           /
 Star            *
 Bad Character   Anything else
 
 While scanning the input program quoted strings are treated as
 a single token without regard to the characters within that string.
 Except for terminating tokens, card boundaries and comments are ignored.
 Quoted strings, however, may extend onto multiple cards. When a token
 has been isolated, its class is determined by searching a table of
 possible token types. If the token in hand is not in the table, then it is
 an "undefined" token and is so classed. The possible classes of token
 types are undefined tokens, numbers, strings, IMS functions, delimiters,
 and reserved words. For ease of reference, the scanner not only indicates
 which token class is isolated but also, when applicable, which IMS
 function, delimiter or reserved word. If while scanning the program the end
 of the source is found, an indication of such is given by the scanner so
 that the syntactic routines can take appropriate action.
 
 Figure I shows a flow of the lexical analysis process. When 
 the lexical analysis routine is invoked, processing starts at the "ENTER" 
 node. Each "NEXT CHAR" box represents moving to the character at the 
 right of the current position. From each box extends one or more flow 
 lines indicating the action taken based on the particular lexical class 
 of the current character. Some lines are followed for several classes. 
 Those lines with no class indicated show action taken for lexical classes 
 not explicitly covered by other lines. "RETURN" means that a token 
 has been isolated and typed and control has passed to the invoking 
 routine. The reader should recall that card boundaries do terminate 
 tokens (except quoted strings) but are transparent otherwise. 
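The scanning loop described above can be sketched as follows. This is a minimal illustration in Python, not the thesis's PL/I scanner: the names `lexical_class` and `scan` are hypothetical, the "Double" class of Table 5 and card boundaries are left out, doubled quotes inside strings are not handled, and reading the last delimiter of Table 5 as the PL/I "not" sign is an assumption.

```python
import string

LETTERS = set(string.ascii_uppercase) | set("#$@")
DIGITS = set(string.digits)
# Delimiters of Table 5; the final "not" sign is an assumed reading of the scan.
DELIMITERS = set(".+-=%&;(),:<>|¬")

def lexical_class(ch):
    """Map one character to a lexical class name, loosely following Table 5."""
    if ch == " ":
        return "BLANK"
    if ch in LETTERS:
        return "LETTER"
    if ch == "_":
        return "UNDERSCORE"
    if ch == "'":
        return "QUOTE"
    if ch in DIGITS:
        return "DIGIT"
    if ch == "/":
        return "SLASH"
    if ch == "*":
        return "STAR"
    if ch in DELIMITERS:
        return "DELIMITER"
    return "BAD"

def scan(text):
    """Yield (token_type, token_text) pairs; blanks and /* ... */ comments are skipped."""
    i, n = 0, len(text)
    while i < n:
        cls = lexical_class(text[i])
        if cls == "BLANK":
            i += 1
        elif cls == "SLASH" and text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end < 0 else end + 2
        elif cls == "LETTER":
            j = i + 1
            while j < n and lexical_class(text[j]) in ("LETTER", "DIGIT", "UNDERSCORE"):
                j += 1
            yield ("WORD", text[i:j])
            i = j
        elif cls == "DIGIT":
            j = i + 1
            while j < n and lexical_class(text[j]) == "DIGIT":
                j += 1
            yield ("NUMBER", text[i:j])
            i = j
        elif cls == "QUOTE":
            j = text.find("'", i + 1)
            j = n if j < 0 else j + 1
            yield ("STRING", text[i:j])  # the whole quoted string is one token
            i = j
        else:
            kind = "DELIMITER" if cls in ("DELIMITER", "SLASH", "STAR") else "BAD"
            yield (kind, text[i])
            i += 1
    yield ("END", "")  # tells the caller that the end of the source was reached
```

A real scanner would further split the `WORD` type into reserved words, IMS functions, and undefined tokens by a table lookup, as the text describes.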
 
28 
 
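 [Figure I: flowchart of the lexical analysis routine. Processing starts
 at ENTER, moves through NEXT CHAR boxes along flow lines labeled with
 the lexical classes BLANK, STAR, SLASH, LETTER, DIGIT, UNDERSCORE,
 DELIMITER and QUOTE, and ends at RETURN.]

 Fig. I. --Lexical analysis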
 
29 
 
 CHAPTER VII 
 SYNTACTIC AND SEMANTIC ANALYSIS 
 
 The syntactic analysis or parsing of the input source program 
 is performed at two levels, the outermost of which performs two functions.
 First, the source program is parsed until the program name is found. 
 Since this is a PL/I program, its name should be the token preceding 
 the first colon, i.e., the label on the first external procedure. Once 
 the program name is found, the dictionary is checked to verify that the 
 program is defined and that it is written in PL/I. As with IMS alone, 
 programs must be defined before they can be used. If a discrepancy 
 exists, an appropriate error message is produced and precompilation is 
 abandoned. 
 
 The second function performed is the search for a precompiler 
 statement. Precompiler statements are divided into their two semantic 
 classes. Once one of the tokens identifying a precompiler statement has 
 been found, control is passed to one of two routines, one for the declar- 
 ative and the other for the data manipulation statements. One of these 
 two routines then performs the second level of syntactic analysis. 
 
 The declarative routine initially ensures that the precompiler 
 statement about to be parsed starts a new statement in the input program. 
 Parsing then is accomplished by moving through a series of parse tables 
 that are linked together in such a way that syntactical analysis and 
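 semantic processing are performed quite simply. Figure J shows the
 structure of these tables.

 +-------------------+
 | NUMBER OF ENTRIES |
 +-------------+--------------------+------------+
 | TOKEN VALUE | PROCESSING ROUTINE | NEXT TABLE |
 +-------------+--------------------+------------+
 |     ...     |        ...         |    ...     |
 +-------------+--------------------+------------+
 |     -1      | BAD SYNTAX ROUTINE | NEXT TABLE |
 +-------------+--------------------+------------+

 Fig. J. --Model syntax table

 After the initial table is established, the parser repeats the
 following: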
 
 1. The lexical analysis routine is called to get the next 
 token. 
 
 2. The current syntax table is searched for the entry that 
 corresponds to the current token. 
 
 3. The indicated routine is called to take semantic action 
 based on the token in hand. 
 
 4. The table indicated as the next table becomes the current 
 table. 
 
 This process is continued until a statement terminator, the semicolon, 
 is found or the end of the input program is reached. The semantic 
 
 routines invoked process a particular keyword and its operands, if any. 
 Processing includes accessing the dictionary for verification and 
 security functions as well as maintenance of the internal structures. 
 A full description of these structures is found in Chapter VIII of this 
 document. 
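The four-step loop over linked syntax tables can be sketched as below. This is only an illustration under stated assumptions: the table names, the keywords `SEGMENT` and `FIELD`, the token kinds, and the three routines are hypothetical and are not the precompiler's actual grammar; the `-1` key stands in for the final entry of the model syntax table, which selects the bad syntax routine.

```python
OTHERWISE = -1  # plays the role of the -1 entry in the model syntax table

def keyword(tok, log):
    log.append(("keyword", tok))

def operand(tok, log):
    log.append(("operand", tok))

def bad_syntax(tok, log):
    log.append(("error", tok))

# Each table maps a token value (or token kind) to (processing routine, next table).
TABLES = {
    "expect_keyword": {
        "SEGMENT": (keyword, "expect_name"),
        "FIELD": (keyword, "expect_name"),
        OTHERWISE: (bad_syntax, "expect_keyword"),
    },
    "expect_name": {
        "NAME": (operand, "expect_keyword"),
        OTHERWISE: (bad_syntax, "expect_keyword"),
    },
}

def parse_statement(tokens, tables=TABLES, start="expect_keyword"):
    """Run the table-driven loop until the semicolon or the end of the tokens."""
    log, current = [], start
    for kind, text in tokens:          # step 1: get the next token
        if text == ";":
            break
        table = tables[current]        # step 2: search the current table
        entry = table.get(text) or table.get(kind) or table[OTHERWISE]
        routine, next_table = entry
        routine(text, log)             # step 3: take the semantic action
        current = next_table           # step 4: move to the next table
    return log
```

Keeping the grammar in data rather than in control flow is what lets one small loop serve every declarative statement form.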
 
 When the outermost level of the syntactic parser finds a data 
 manipulation statement, the second inner level routine is invoked. Since 
 the semantic action required for this type of statement is much less 
 than that required for declarative statements, a finite state automaton 
 approach was used to parse them. This technique affords good syntax 
 analysis while supporting the limited semantic processing required. 
 No dictionary access is needed. If the request is a data base manipu- 
 lation function then the I/O area given must be a logical segment that 
 has been previously defined. If, however, the function is SNAP, CHKP 
 or LOG then the PCB given must have been previously defined. 
 
 Figure K graphically illustrates the nine-state automaton and
 the movement through the states for different types of expected input. 
 Parsing begins at START after the I/O area has been identified. If an 
 unexpected token is found, then the processing of this statement is 
 terminated. Four cases of missing right parentheses are shown with 
 dotted lines. In these cases the missing token is assumed. 
 
 The outer level routine continues to identify precompiler
 statements and to invoke the appropriate second level routine, so that
 syntactic and semantic analysis proceeds until the end of the input
 program is reached. When errors are detected, error messages are
 produced and the remainder of the current statement is bypassed.
 
33 
 
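 [Figure K: state diagram of the nine-state automaton used to parse data
 manipulation statements; the drawing and its caption are not
 recoverable from the scan.]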
 
34 
 
 CHAPTER VIII 
 INTERNAL STRUCTURES 
 
 A set of internal tables is created and maintained throughout 
 the precompilation process. These tables contain the information nec- 
 essary to verify the correctness of and the continuity between the 
 entities declared by the program. In addition, they accumulate data 
 used to generate the communication module. There are four table types. 
 The first is a header record that contains counters and other static 
 variables as well as the heads of the linked lists connecting the other
 table types. The second, third and fourth table types represent each
 data base, segment, and field declared, respectively. From the header
 record, all the data base table occurrences are linked on a list. All 
 segment table occurrences are linked on a second list. Each data base 
 table contains the head of a list of segment tables that represent the 
 segments within that data base. In like manner, each segment table 
 contains the head of a list of field tables for the fields contained 
 within that segment. 
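The network of tables just described can be sketched with four record types and their links. This is a hedged illustration only: the class and attribute names are hypothetical, new records are pushed on the front of each list (the thesis does not state the insertion order), and Python object references stand in for the 4-byte pointers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FieldRec:
    name: str
    next_in_segment: Optional["FieldRec"] = None

@dataclass
class SegmentRec:
    name: str
    n_fields: int = 0
    first_field: Optional[FieldRec] = None
    next_in_db: Optional["SegmentRec"] = None    # list owned by the data base
    next_overall: Optional["SegmentRec"] = None  # list owned by the header

@dataclass
class DataBaseRec:
    name: str
    n_segments: int = 0
    first_segment: Optional[SegmentRec] = None
    next_db: Optional["DataBaseRec"] = None

@dataclass
class Header:
    n_dbs: int = 0
    n_segments: int = 0
    n_fields: int = 0
    first_db: Optional[DataBaseRec] = None
    first_segment: Optional[SegmentRec] = None

def add_db(hdr, name):
    db = DataBaseRec(name, next_db=hdr.first_db)
    hdr.first_db = db
    hdr.n_dbs += 1
    return db

def add_segment(hdr, db, name):
    # A segment is linked on two lists: its data base's and the header's.
    seg = SegmentRec(name, next_in_db=db.first_segment, next_overall=hdr.first_segment)
    db.first_segment = seg
    hdr.first_segment = seg
    db.n_segments += 1
    hdr.n_segments += 1
    return seg

def add_field(hdr, seg, name):
    fld = FieldRec(name, next_in_segment=seg.first_field)
    seg.first_field = fld
    seg.n_fields += 1
    hdr.n_fields += 1
    return fld

def walk(head, link):
    """Collect the names along one linked list, following attribute `link`."""
    names = []
    while head is not None:
        names.append(head.name)
        head = getattr(head, link)
    return names
```

The second, header-owned segment list is what allows a segment to be found by name without first knowing its data base.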
 
 This network of interrelated tables is built from the precom- 
 piler declarative statements and information from the dictionary. As 
 each new statement is being processed, the current environment depicted 
 by these internal tables is checked to see if the new entity fits in. 
 If it does then the necessary tables are created and/or maintained. 
 When processing the data manipulation statements, the tables are checked 
 
 to ensure the feasibility of the request in hand. When the entire input 
 source program has been processed, the tables are used to create the 
 communication module. It in turn is linked with the object module from 
 the PL/I compiler to form the complete application program. The layout 
 of each of the four internal records is shown in Tables 6 through 9. 
 
 TABLE 6
 INTERNAL HEADER RECORD

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              2       Binary        Number of data bases
                                      declared
 2              2       Binary        Number of segments
                                      declared
 4              2       Binary        Number of fields
                                      declared
 6              2       Binary        Maximum number of SSAs
                                      used in any CALL
                                      statement
 8              2       Binary        The size of the largest
                                      segment
 10             4       Pointer       Head of the linked list
                                      of data base records
 14             4       Pointer       Head of the linked list
                                      of segment records
 
36 
 
 TABLE 7
 INTERNAL DATA BASE RECORD

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              2       Binary        The eventual location of
                                      the corresponding data
                                      base entry in the
                                      communication module
 2              4       Pointer       Link to next data base
                                      record
 6              8       Character     Data base name
 14             8       Character     System name
 22             8       Character     PCB name
 30             2       Binary        Number of segments in
                                      this data base
 32             4       Pointer       Pointer to first segment
                                      record for this data
                                      base
 
37 
 
 TABLE 8
 INTERNAL SEGMENT RECORD

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              2       Binary        The eventual location of
                                      the corresponding seg-
                                      ment entry in the
                                      communication module
 2              4       Pointer       The link to the next
                                      segment record from the
                                      header record
 6              8       Character     Logical segment name
 14             8       Character     Source segment name
 22             8       Character     PCB name
 30             4       Character     The PROCOPT for this
                                      segment
 34             2       Binary        Number of fields in this
                                      segment
 36             4       Pointer       Link to next segment in
                                      its data base
 40             4       Pointer       Link to first field in
                                      this segment
 
38 
 
 TABLE 9
 INTERNAL FIELD RECORD

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              2       Binary        The eventual location
                                      of the corresponding
                                      field entry in the
                                      communication module
 2              4       Pointer       Link to the next field
                                      in its segment
 6              8       Character     Field name
 14             2       Binary        Field length minus one
 16             2       Binary        Total number of digits
                                      or characters in this
                                      field
 18             2       Binary        Offset to this field
                                      within the logical
                                      segment
 20             2       Binary        Number of decimal places
                                      for numeric fields
 22             1       Bit           Field type indicator
 
39 
 
 CHAPTER IX 
 
 PRECOMPILER OPTIONS 
 
 Run-time options are passed to the precompiler by means of a 
 parameter string specified on the EXEC statement of the invoking JCL. 
 Standard OS parameter conventions apply. Each option has a default, 
 as indicated in Table 10; options may be specified in any order separated 
 by commas. If an option appears more than once, the last specification 
 (scanning left to right) is used. Each option keyword may be abbreviated 
 with any number of characters up to its complete spelling. For the 
 options which may be prefixed by "NO," abbreviations still apply with 
 or without the prefix. A description of each option along with any 
 unique specification rules is shown in Table 10. 
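The option rules above (any order, comma separated, last specification wins, abbreviations, optional "NO" prefix) can be sketched as a small parser. This is an illustration under assumptions, not the precompiler's code: the function names are hypothetical, an abbreviation is accepted only when it is a unique prefix, and how the real precompiler resolves ambiguous abbreviations is not stated in the text. The defaults are those given in Table 10.

```python
import re

DEFAULTS = {"DECK": False, "GEN": True, "INSOURCE": False, "LIST": False,
            "NUMBER": False, "MARGINS": (2, 72, 0), "SEQUENCE": (73, 80)}
FLAGS = ("DECK", "GEN", "INSOURCE", "LIST", "NUMBER")   # may take a NO prefix
VALUED = ("MARGINS", "SEQUENCE")                        # take a value list

def split_options(parm):
    """Split the parameter string on commas that are not inside parentheses."""
    parts, depth, cur = [], 0, ""
    for ch in parm:
        if ch == "," and depth == 0:
            parts.append(cur)
            cur = ""
        else:
            depth += (ch == "(") - (ch == ")")
            cur += ch
    parts.append(cur)
    return [p.strip().upper() for p in parts if p.strip()]

def match(word, names):
    """Return the keyword that `word` abbreviates, if exactly one does."""
    hits = [n for n in names if n.startswith(word)]
    return hits[0] if word and len(hits) == 1 else None

def parse_options(parm):
    opts = dict(DEFAULTS)
    for item in split_options(parm):       # left to right, so the last one wins
        m = re.fullmatch(r"([A-Z]+)\(([\d,\s]*)\)", item)
        if m:
            name = match(m.group(1), VALUED)
            if name is None:
                raise ValueError(item)
            opts[name] = tuple(int(v) for v in m.group(2).split(","))
            continue
        name = match(item, FLAGS)
        if name:
            opts[name] = True
            continue
        if item.startswith("NO"):
            name = match(item[2:], FLAGS)
            if name:
                opts[name] = False
                continue
        raise ValueError(item)
    return opts
```

For example, `parse_options("LIST,NOL,L")` leaves `LIST` switched on, because the last of the three specifications is the one kept.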
 
40 
 
 TABLE 10
 PRECOMPILER OPTIONS

 Keyword               Meaning

 DECK/NODECK           This option indicates whether a card image
                       version of the PL/I code produced is to be
                       written to file SYSPNCH for punching.
                       Default is NODECK.

 GEN/NOGEN             This option indicates whether the PL/I code
                       produced is to be written to file SYSLIN
                       and the communication CSECT is to be written
                       to file SYSOUT. Default is GEN.

 INSOURCE/NOINSOURCE   This option indicates whether the input file
                       is to be listed on file SYSPRINT. Default
                       is NOINSOURCE.

 LIST/NOLIST           This option indicates whether the PL/I code
                       produced is to be listed on file SYSLIST.
                       Default is NOLIST.

 MARGINS(a,b,c)        This option indicates the source margins
                       applicable to the input. All values must
                       be between 0 and 80 inclusive. Only data
                       within the source margin is processed.
                       a - The left margin. Default is 2.
                       b - The right margin. Default is 72.
                       c - The carriage control character
                           position, used when printing the
                           insource. If 0, then single
                           spacing is used. Default is 0.

 NUMBER/NONUMBER       This option indicates whether the PL/I code
                       produced should be renumbered, starting with
                       10 and incrementing by 10 in the sequence
                       area defined by the SEQUENCE option.
                       Default is NONUMBER.

 SEQUENCE(x,y)         This option indicates the position of the
                       sequence field in the input record.
                       x - The left margin of the sequence
                           field. Default is 73.
                       y - The right margin of the sequence
                           field. Default is 80.
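The effect of the MARGINS, NUMBER and SEQUENCE options on an 80-column card image can be sketched as follows. The function names are hypothetical, and writing the sequence number right-justified with leading zeros is an assumption; the text fixes only the starting value of 10, the increment of 10, and the default column positions.

```python
def source_text(card, left=2, right=72):
    """Return the part of an 80-column card inside the source margins.

    Columns are 1-based and inclusive, as in MARGINS(a,b,c).
    """
    return card.ljust(80)[left - 1:right]

def renumber(cards, seq_left=73, seq_right=80, start=10, step=10):
    """Rewrite the sequence field of each card: 10, 20, 30, ..."""
    width = seq_right - seq_left + 1
    out = []
    for i, card in enumerate(cards):
        card = card.ljust(80)[:80]
        seq = str(start + i * step).rjust(width, "0")  # zero fill is assumed
        out.append(card[:seq_left - 1] + seq + card[seq_right:])
    return out
```

With the default margins, column 1 is excluded from the processed text.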
 
41 
 
 CHAPTER X 
 PRECOMPILER OPERATING ENVIRONMENT 
 
 The precompiler was written in PL/I and compiled using IBM's 
 PL/I Optimizing Compiler. It is intended to be executed on IBM 370 
 hardware with access to the dictionary system of Appendix A. Five 
 input files and five output files are used by the precompiler in its 
 processing. Four of the input files are the dictionary's four data 
 sets. These data sets are VSAM files and access to them is through 
 special dictionary service modules. A full description of each of these 
 four dictionary system files can be found in Appendix A. The following 
 table describes the files used by the precompiler. Given are the in- 
 ternal PL/I file names, the associated DDNAME and compiler options, the 
 formats of each file with characteristics when different from the PL/I 
 default, and the usage for each file. 
 
 A set of PL/I preprocessor macros is used to generate the
 three VSAM control blocks within the precompiler program. These blocks
 are the Access-Method Control Block (ACB), the Request Parameter List
 (RPL), and the Exit List (EXLST). With these control blocks and an
 external assembler routine, the precompiler has full access to the NODE
 data set. Access to the LAT table, HOJ table and the EDGE data sets is
 through a set of I/O routines, one for each data set. These assembler
 routines are tailored for the type of requests that are made against
 their particular data set. All access to the other files used by the
 precompiler is through standard PL/I I/O.
 
42 
 
 TABLE 11
 INPUT AND OUTPUT FILES

 File Name  DDNAME    Compiler  Format           Usage
                      Option

 SYSIN      SYSIN     N/A       STREAM,          The input source
                                INPUT            program to be
                                                 precompiled

 SYSPRINT   SYSPRINT  INSOURCE  PRINT,           The listing file
                                LINESIZE(130),   that contains the
                                VBA,             header page, insource
                                LRECL(135),      listing and error
                                BLKSIZE(139)     messages

 SYSLIST    SYSLIST   LIST      PRINT            The listing of the
                                                 output source pro-
                                                 gram generated

 SYSLIN     SYSLIN    GEN       RECORD,          The output source
                                OUTPUT, FB,      program generated
                                LRECL(80),
                                BLKSIZE(1680)

 SYSPNCH    SYSPUNCH  DECK      RECORD,          The to-be-punched
                                OUTPUT, F,       form of the source
                                LRECL(80)        program generated

 SYSOUT     SYSOUT    N/A       RECORD,          The communication
                                OUTPUT, FB,      module in object
                                LRECL(80),       form
                                BLKSIZE(1680)

 N/A        LATTABLE  N/A       VSAM, ESDS       The dictionary LAT
                                                 data set

 N/A        NODE      N/A       VSAM, KSDS       The dictionary NODE
                                                 data set

 N/A        EDGE      N/A       VSAM, ESDS       The dictionary EDGE
                                                 data set

 N/A        RNPDDHOJ  N/A       VSAM, ESDS       The dictionary HOJ
                                                 data set
 
43 
 
 CHAPTER XI 
 TESTING AND VERIFICATION 
 
 The data base precompiler was implemented in a structured,
 top-down fashion. Therefore, by using "null" routines where routines
 were not yet implemented, each section being programmed could be tested.
 With this technique the parameter parsing section was written and tested
 first. Following that, the lexical analysis section was programmed and
 tested, then the syntactic and semantic routines.
 
 Testing the semantic routines was the most difficult part. 
 Since the Data Base Dictionary System described in Appendix A was not 
 fully implemented, exhaustive system testing was impossible. A sample 
 data definition language and dictionary maintenance subsystem was not 
 developed at all. In light of this, only a small set of test data was 
 loaded into test dictionary data sets to allow the precompiler to test 
 its access and use of the dictionary. Although not a thorough test, 
 this does show the feasibility of such a precompiler as part of a dic- 
 tionary system. 
 
44 
 
 APPENDIX A 
 
 INTRODUCTION 
 
 Appendix A is a description of the Data Base Dictionary Sys- 
 tem. Section I gives an overview of VSAM, IBM's Virtual Storage Access 
 Method. Only terminology and concepts necessary to the reader's under- 
 standing of the following system description have been included. Section 
 II is a system overview that discusses the dictionary itself as well as 
 the role played by the precompiler and execution monitor. The third and 
 final section gives detailed record layouts of the data records in each 
 of the four data sets composing the dictionary. 
 
45 
 
 SECTION I 
 VSAM OVERVIEW 
 
 Because our Data Base Dictionary System was implemented on IBM 
 computing hardware, using IBM software for support, it was necessary to 
 choose an access method from those available with current IBM operating 
 systems. Our requirements were quite varied. In addition to direct 
 access by pointer within and between data sets, we needed direct and 
 sequential access by key value. VSAM (Virtual Storage Access Method) 
 was chosen because it supported all our processing needs. The following 
 is a short overview of VSAM and the terminology used when describing 
 our use of it. 
 
 VSAM offers two types of data sets, key-sequenced data sets 
 (KSDS) and entry-sequenced data sets (ESDS). The primary difference be- 
 tween the two is the order in which records are stored within them. 
 In a KSDS the records are stored in sequence by the value of a specified 
 key field from each record. Sequential and direct access is possible 
 via this key field. In an ESDS, records are stored without regard to 
 data within the records. The sequence of an ESDS is determined by the 
 order in which records were stored. Physical sequential access is allowed 
 as well as direct access by relative byte within the data set. 
 
 Both ESDS and KSDS are actually stored and retrieved in units 
 called control intervals. The total space of a data set is considered 
 to be divided into a continuous set of these control intervals; hence 
 
 a data record stored within a control interval can be addressed by its 
 Relative Byte Address (RBA), i.e., offset, in bytes, from the beginning 
 of the data set. We have used these RBA's for our direct pointer imple- 
 mentation both within a data set and between data sets. A complete 
 description of the VSAM access method can be found in appropriate IBM 
 documentation and publications [6]. 
 
47 
 
 SECTION II 
 SYSTEM OVERVIEW 
 
 The Data Base Dictionary System was designed to be an adminis- 
 trative aid as well as the source of information used to allow and con- 
 trol access to data in a particular environment. Five different levels 
 of data are recognized as separate entities by the dictionary. These 
 entities are: fields, segments, data bases, programs, and systems. 
 For each of these entities the dictionary maintains information on its 
 characteristics, usage and relationship with other entities. Each entity 
 type is represented as a node in the graphical diagram of the dictionary 
 system (Figure L), and the five nodes have been labeled N1 through N5.
 All node data, however, is kept in one VSAM key-sequenced data set 
 (KSDS). In addition to the static information about each entity, inter- 
 node relationships are maintained to build levels of data, that is, 
 several fields make up a segment, several segments make up a data base, 
 several data bases may be used by one program, and several programs may 
 belong to one system. 
 
 Certain types of information have meaning only as they relate 
 one entity to another; for example, a field's location is significant 
 only as that field relates to a particular segment. This type of 
 relational information is called "edge" data. The four different types
 of edge data are represented in Figure L by the labels E1 through E4,
 and are stored in a VSAM entry-sequenced data set (ESDS).

 [Figure L: graph of the five node types (Systems, Programs, Data Bases,
 Segments, Fields) joined by the four edge types (System-Program,
 Program-Data Base, Data Base-Segment, Segment-Field).]

 Fig. L. --Logical view of dictionary system

 An example showing how the segment-field edge data exists is given in
 Figure M.
 A segment points into the edge data set to the head of a linked list 
 connecting all the edge entries for all the fields in that segment. 
 In like manner, a field points into the edge data set to the head of a 
 list linking all occurrences of that field in the several segments in 
 which it might exist. Each edge entry points to both of the node en- 
 tries it relates. 
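The two chains threaded through each segment-field edge entry can be sketched as below. This is a simplified model, not the dictionary's record format: list indexes stand in for RBAs, `None` stands in for a null pointer, the class and method names are hypothetical, and new edges are pushed on the front of both chains.

```python
class EdgeSet:
    """Edge entries that each sit on two chains: one per segment, one per field."""

    def __init__(self):
        # Each entry: (segment, field, next edge for segment, next edge for field)
        self.edges = []
        self.seg_head = {}   # segment name -> index of its first edge
        self.fld_head = {}   # field name   -> index of its first edge

    def relate(self, segment, field):
        idx = len(self.edges)
        self.edges.append((segment, field,
                           self.seg_head.get(segment),
                           self.fld_head.get(field)))
        self.seg_head[segment] = idx
        self.fld_head[field] = idx
        return idx

    def fields_of(self, segment):
        """Follow the segment's chain: all fields in that segment."""
        out, idx = [], self.seg_head.get(segment)
        while idx is not None:
            _, fld, next_for_segment, _ = self.edges[idx]
            out.append(fld)
            idx = next_for_segment
        return out

    def segments_with(self, field):
        """Follow the field's chain: all segments in which the field occurs."""
        out, idx = [], self.fld_head.get(field)
        while idx is not None:
            seg, _, _, next_for_field = self.edges[idx]
            out.append(seg)
            idx = next_for_field
        return out
```

Because every edge entry names both of its nodes, either chain can be walked without consulting the other.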
 
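 [Figure M: Segment A and Field 1 each point into the edge data set. One
 chain links all fields in Segment A, a second chain links all
 occurrences of Field 1, and a free list head anchors the unused edge
 entries.]

 Fig. M. --Example of segment-field edge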
 
 Infrequently needed or variable-length information for any of 
 the nodes or edges is kept in the HOJ table; this is a VSAM ESDS. This 
 device was chosen to improve efficiency from both the storage and the 
 
 processing point of view. The HOJ table allows the data sets containing 
 all of the information pertinent to the node or edge to consist of fixed- 
 length records. Figure N gives a representational view of the HOJ table. 
 A variable number of fixed-length records make up one entry of information. 
 These records are linked, and the node or edge entry referencing the 
 information in the HOJ table contains a pointer to the head of this list. 
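Storing one variable-length entry as a chain of fixed-length records, with unused records kept on a free list, can be sketched as follows. The class name, the payload size, and the in-memory list are assumptions made for illustration; the real HOJ table is a VSAM ESDS addressed by RBA.

```python
class ChainedStore:
    """Fixed-length slots linked into chains; unused slots sit on a free list."""

    def __init__(self, slots, payload=8):
        self.payload = payload
        # Each slot is [data, next]; initially every slot is on the free list.
        self.slots = [[b"", i + 1] for i in range(slots)]
        self.slots[-1][1] = None
        self.free = 0

    def _take(self):
        if self.free is None:
            raise MemoryError("no free records")
        idx = self.free
        self.free = self.slots[idx][1]
        return idx

    def put(self, data):
        """Store `data` as a chain of records and return the head index."""
        step = self.payload
        chunks = [data[i:i + step] for i in range(0, len(data), step)] or [b""]
        head = prev = None
        for chunk in chunks:
            idx = self._take()
            self.slots[idx] = [chunk, None]
            if prev is None:
                head = idx
            else:
                self.slots[prev][1] = idx
            prev = idx
        return head

    def get(self, head):
        """Reassemble one entry by following its chain."""
        out, idx = b"", head
        while idx is not None:
            chunk, idx = self.slots[idx]
            out += chunk
        return out

    def release(self, head):
        """Return every record of one entry to the free list."""
        idx = head
        while idx is not None:
            nxt = self.slots[idx][1]
            self.slots[idx] = [b"", self.free]
            self.free = idx
            idx = nxt
```

The node or edge entry then needs to hold only one fixed-size pointer, the head of the chain, however long the text is.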
 
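 [Figure N: pointers from NODE and EDGE entries lead to the heads of
 chains of linked fixed-length HOJ records; a free list head anchors the
 unused records.]

 Fig. N. --Logical view of HOJ table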
 
 The dictionary uses relative byte addresses (RBA) as direct
 pointers from one entry to another, the latter being either in the same
 data set as the former or in one of the other three data sets composing
 the dictionary. In a VSAM KSDS, the RBA of existing records can change
 as records are added, changed, or deleted. In order to minimize the
 effect of this relocation of records in the node data set, an indirect
 pointer scheme is used. A separate data set, the LAT table, is used to
 implement these indirect pointers. Figure O shows how a node, edge, or
 HOJ entry points into the LAT table, which in turn points at the target
 node entry. As RBA's change in the node data set, the corresponding
 LAT table entry is updated. With this technique, the many pointers that
 reference a particular entry can be maintained by updating only one
 indirect pointer.
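The indirection can be sketched in a few lines. The class and method names are hypothetical, and list indexes stand in for the RBAs of LAT entries.

```python
class LatTable:
    """Indirect pointers: referrers hold a LAT index, the LAT holds the node RBA."""

    def __init__(self):
        self.entries = []          # LAT index -> current RBA of the node record

    def register(self, node_rba):
        self.entries.append(node_rba)
        return len(self.entries) - 1

    def resolve(self, lat_index):
        return self.entries[lat_index]

    def relocate(self, lat_index, new_rba):
        # One update here repairs every pointer that goes through this entry.
        self.entries[lat_index] = new_rba
```

Every edge, HOJ or node entry that refers to the node stores the same LAT index, so a move of the node record in the KSDS costs a single update.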
 
 [Figure O: Node, Edge and HOJ entries point into the LAT table, whose
 entries in turn point at the target Node entry.]

 Fig. O. --Logical view of LAT table
 
 In order to offer control and several other features, the dic- 
 tionary system has two major subsystems. These are a PL/I precompiler 
 and an execution monitor. Both of these subsystems access the dictionary 
 
 data for the information needed to provide various services outlined 
 below. 
 
 1. Security enforcement to the field level. 
 
 2. A shorthand for some of the control blocks and call 
 statements. 
 
 3. The definition of logical segments of data. 
 
 4. An interface module for communication with the execution 
 monitor. 
 
 The execution monitor acts as an interface between the appli- 
 cation program and the data base software, thereby allowing several 
 features not inherent in the data base software to be available. These 
 include: 
 
 1. Translation between user defined logical segments and 
 real data segments. 
 
 2. Data editing. 
 
 3. Data compression and "invisible" fields with default 
 values. 
 
 4. Derivable fields. 
 
 The concept of a logical segment defined by an application pro-
 gram proves useful in several ways. It first helps clean the user's code
 by deleting filler fields in input/output data structures. It frees the 
 user from being tied to data of specific characteristics, and finally, 
 it allows a program to be desensitized to data at the field level. 
 
53 
 
 SECTION III 
 
 DETAILED DESCRIPTION OF THE DICTIONARY DATA SETS 
 
 This section provides a detailed record layout of the entire 
 Data Base Dictionary System. As discussed above, the system comprises 
 four separate data sets linked by RBA pointers: the node, edge, LAT, 
 and HOJ data sets. 
 
 NODE RECORDS 
 
 The node data set is a VSAM/KSDS file whose key is composed 
 of a one-byte type identifier and an eight-byte name, for a total key 
 length of nine bytes. All node records are thirty-eight bytes long. 
 The control interval size is 512 bytes. There are five different types 
 of node records. The fields making up the various node records are 
 explained below in Tables 12 through 16. 
 
54 
 
 TABLE 12
 SYSTEM NODE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Character     "Y" to identify system
                                      node
 1              8       Character     System name
 9              3       Binary        RBA of LAT entry for
                                      this node record
 12             3       Binary        RBA of HOJ entry for
                                      the text string de-
                                      scribing this system
 15             3       Binary        RBA of the first system/
                                      program edge entry for
                                      this system
 18             8       Character     Password for this system
 27             1       Binary        System type. Possible
                                      codes:
                                      Bit 0=1  system
                                      Bit 1=1  transaction
                                      Bit 2=1  job stream
55 
 
 TABLE 13
 PROGRAM NODE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Character     "P" to identify program
                                      node
 1              8       Character     Program name
 9              3       Binary        RBA of LAT entry for
                                      this node record
 12             3       Binary        RBA of HOJ entry for
                                      the text string de-
                                      scribing this program
 15             3       Binary        RBA of the first system/
                                      program edge entry for
                                      this program
 18             3       Binary        RBA of the first program/
                                      data base edge entry for
                                      this program
 21             3       Binary        Program input/output
                                      area size
 24             3       Binary        Program segment search
                                      area size
 27             1       Binary        Program type. Possible
                                      codes:
                                      Bits 0-1=00  PL/I
                                               01  assembler
                                               10  COBOL
                                      Bit 2   =0   CMPAT=NO
                                               1   CMPAT=YES
 28             3       Binary        Maximum enqueue calls
                                      allowed at any one time
 
56 
 
 TABLE 14
 DATA BASE NODE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Character     "D" to identify data
                                      base node
 1              8       Character     Data base name
 9              3       Binary        RBA of LAT entry for this
                                      node record
 12             3       Binary        RBA of HOJ entry for text
                                      string describing this
                                      data base
 15             3       Binary        RBA of the first program/
                                      data base edge entry for
                                      this data base
 18             3       Binary        RBA of the first data
                                      base/segment edge entry
                                      for this data base
 21             3       Binary        RBA of the shared
                                      secondary index head
                                      node LAT entry
 24             3       Binary        RBA of the LAT entry of
                                      the next shared secondary
                                      index data base node
 27             1       Binary        Data base type. Possible
                                      codes:
                                      Bit 0=1  HSAM
                                      Bit 1=1  SHSAM
                                      Bit 2=1  HISAM
                                      Bit 3=1  SHISAM
                                      Bit 4=1  HDAM
                                      Bit 5=1  HIDAM
                                      Bit 6=1  INDEX
                                      Bit 7=1  LOGICAL
 28             1       Binary        Physical access method.
                                      Possible codes:
                                      Bit 0=1  ISAM
                                      Bit 1=1  VSAM
                                      Bit 2=1  OSAM
                                      Bit 3=0  NOPROT
                                            1  PROT
 29             3       Binary        RBA of the HOJ entry de-
                                      scribing the randomizing
                                      module for this data base
 32             3       Binary        RBA of the first HOJ entry
                                      giving data set group
                                      information for this data
                                      base
 
 
 
 TABLE 15
 SEGMENT NODE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Character     "S" to identify segment
                                      node
 1              8       Character     Segment name
 9              3       Binary        RBA of LAT entry for the
                                      record
 12             3       Binary        RBA of HOJ entry for the
                                      text string describing
                                      this segment
 15             3       Binary        RBA of the first data
                                      base/segment edge entry
                                      for this segment
 18             3       Binary        RBA of the first segment/
                                      field edge for this
                                      segment
 21             3       Binary        RBA of the LAT entry for
                                      the physical source seg-
                                      ment for this segment
 24             3       Binary        RBA of the LAT entry for
                                      the physical sibling
                                      segment for this segment
 27             1       Binary        Segment type. Possible
                                      codes:
                                      Bit 0=0  Non-compressible
                                            1  Compressible
                                      Bit 1    Indicates how the
                                               pointer segment par-
                                               ticipates in the
                                               concatenated segment
                                               being defined
                                           =0  Physically
                                           =1  Virtually
                                      Bit 2    Indicates how the
                                               segment pointed at
                                               participates in the
                                               concatenated segment
                                               being defined
                                           =0  Physically
                                           =1  Virtually
                                      Bit 3=1  Key of segment being
                                               pointed at is stored
                                               in this segment
 28             2       Binary        Maximum length of the
                                      segment
 30             2       Binary        Minimum length of the
                                      segment
 32             3       Binary        RBA of edge entry for
                                      logical source segment
 35             3       Binary        RBA of edge entry for
                                      destination source
                                      segment
 
59 
 
 TABLE 16
 FIELD NODE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Character     "F" to identify field
                                      node
 1              8       Character     Field name
 9              3       Binary        RBA of LAT entry for
                                      this node record
 12             3       Binary        RBA of HOJ entry for the
                                      text string describing
                                      this field
 15             3       Binary        RBA of the first segment/
                                      field edge for this field
 18             3       Binary        Pointer to field edit
                                      information in HOJ
 21             3       Binary        Indirect RBA of the
                                      generic parent field
                                      node for this field
 24             3       Binary        Indirect RBA of the next
                                      generic sibling field
                                      node for this field
 27             1       Binary        Field type. Possible
                                      codes:
                                      Bit 0=1     FLOAT
                                      Bit 1=1     FIXED
                                      Bit 2=1     CHAR
                                      Bit 3=1     PACKED
                                      Bit 4=1     ZONED
                                      Bit 5=1     /SX
                                      Bit 6=1     /CK
                                      Bit 7=1     XDFIELD
                                      Bits 0-7=1  HEX
 28             2       Binary        Field length
 30             1       Binary        Decimal places in this
                                      field
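The layout of Table 16 can be checked by packing and unpacking one field node record. This is a sketch under several assumptions: the function names are hypothetical, names are encoded in ASCII rather than the EBCDIC the original system would use, bit 0 is taken to be the most significant bit of the type byte (the usual IBM numbering), and the bytes after displacement 30 are zero filled up to the thirty-eight byte record length.

```python
import struct

NODE_LEN = 38
FIXED = 0x40  # "Bit 1=1", assuming bit 0 is the most significant bit

def pack_field_node(name, lat, text, first_edge, edit, parent, sibling,
                    ftype, length, decimals):
    """Build one field node record following the displacements of Table 16."""
    pointers = (lat, text, first_edge, edit, parent, sibling)
    rec = (b"F"                                              # 0: record type
           + name.encode("ascii").ljust(8)[:8]               # 1: field name
           + b"".join(p.to_bytes(3, "big") for p in pointers)  # 9..26: six RBAs
           + struct.pack(">BHB", ftype, length, decimals))   # 27, 28, 30
    return rec.ljust(NODE_LEN, b"\x00")

def unpack_field_node(rec):
    pointers = [int.from_bytes(rec[o:o + 3], "big") for o in range(9, 27, 3)]
    ftype, length, decimals = struct.unpack(">BHB", rec[27:31])
    return {"name": rec[1:9].decode("ascii").rstrip(), "pointers": pointers,
            "type": ftype, "length": length, "decimals": decimals}
```

The first nine bytes of the packed record, the type identifier followed by the name, are exactly the KSDS key described at the start of this section.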
 
60 
 
 EDGE RECORDS 
 
 The edge data set is a VSAM/ESDS file whose records contain 
 thirty-eight bytes of data within control intervals of 512 bytes each. 
 All access is by direct RBA to the desired record. Initially this data 
 set is completely filled and all the records are linked on a "free list" 
 from which records are made available as they are needed. The different 
 record types and the fields that comprise them are described in Tables 
 17 through 22 below. 
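How the RBA of the n-th fixed-length record follows from the record and control interval sizes can be sketched as below. The function names are hypothetical, and the figure of 10 bytes of control information per control interval (a 4-byte control interval definition field plus two 3-byte record definition fields) is an assumption about VSAM that should be checked against the IBM documentation.

```python
def records_per_interval(ci_size=512, record_len=38, control_bytes=10):
    """Number of fixed-length records that fit in one control interval."""
    return (ci_size - control_bytes) // record_len

def record_rba(n, ci_size=512, record_len=38, control_bytes=10):
    """RBA of the n-th record (n counted from 0) in a completely filled data set."""
    per_ci = records_per_interval(ci_size, record_len, control_bytes)
    ci, slot = divmod(n, per_ci)
    return ci * ci_size + slot * record_len
```

Under these assumptions thirteen edge records fit in each 512-byte control interval, so record 13 starts the second interval at RBA 512.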
 
 TABLE 17
 SYSTEM/PROGRAM EDGE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of the next system/
                                      program edge for this
                                      system
 3              3       Binary        RBA of the next system/
                                      program edge for this
                                      program
 6              3       Binary        RBA of the LAT entry for
                                      the system node partici-
                                      pating in this relation-
                                      ship
 9              3       Binary        RBA of the LAT entry for
                                      the program node partici-
                                      pating in this relation-
                                      ship
 
61 
 
 TABLE 18
 PROGRAM/DATA BASE EDGE-FIRST PCB ENTRY

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of the next program/data base
                                      edge for this program, i.e., next
                                      PCB for this program
 3              3       Binary        RBA of the next program/data base
                                      edge for this data base
 6              3       Binary        RBA of the LAT entry for the program
                                      node participating in this
                                      relationship
 9              3       Binary        RBA of the LAT entry for the data
                                      base node participating in this
                                      relationship
 12             1       Binary        PCB type. Possible codes:
                                        Bit 0=0        single positioning
                                              1        multiple positioning
                                        Bits 1-3=000   processing option G
                                                 001   processing option I
                                                 010   processing option R
                                                 011   processing option D
                                                 100   processing option A
                                                 101   processing option L
                                        Bit 4=1        processing option E
                                        Bit 5=1        processing option S
                                        Bit 6=1        processing option P
                                        Bit 7=1        processing option
 13             2       Binary        This is the length of the longest
                                      concatenated key in this PCB
 15             3       Binary        If a secondary processing sequence
                                      is used, this is the RBA of the LAT
                                      entry for the secondary index data
                                      base
 18             10      Binary        First SENSEG entry, see TABLE 20
 28             8                     Unused
 36             3       Binary        RBA of the next edge record for
                                      this PCB
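
 As an illustration of how the PCB type byte at displacement 12 might be
 interpreted, the following sketch decodes it into its parts. It assumes
 the usual IBM bit numbering, in which bit 0 is the most significant bit
 of the byte; the function names are hypothetical. Bit 7 is left out
 because its option letter is not legible in the table.

```python
# Codes for bits 1-3 of the PCB type byte, as listed in Table 18.
PROCOPT_BY_CODE = {
    0b000: "G", 0b001: "I", 0b010: "R",
    0b011: "D", 0b100: "A", 0b101: "L",
}


def bit(byte, n):
    """Return bit n of a byte, counting bit 0 as the most significant bit
    (an assumption based on IBM convention)."""
    return (byte >> (7 - n)) & 1


def decode_pcb_type(byte):
    """Split the PCB type byte into positioning, base option, and flags."""
    code = (byte >> 4) & 0b111          # bits 1-3
    return {
        "positioning": "multiple" if bit(byte, 0) else "single",
        "procopt": PROCOPT_BY_CODE.get(code),   # None for unlisted codes
        "E": bool(bit(byte, 4)),
        "S": bool(bit(byte, 5)),
        "P": bool(bit(byte, 6)),
    }
```

 The same helper applies to the other bit-coded bytes in these tables,
 provided the same bit numbering holds.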
 
 TABLE 19
 ADDITIONAL PCB ENTRIES

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              10      Binary        A SENSEG entry, TABLE 20
 10             10      Binary        A SENSEG entry, TABLE 20
 20             10      Binary        A SENSEG entry, TABLE 20
 30             6       -             Unused
 36             3       Binary        RBA of the next edge record for
                                      this PCB
 
 There will be one SENSEG for each sensitive segment in the PCB being 
 described. The format of a SENSEG entry is shown in Table 20. 
 
 
 TABLE 20
 SENSEG EDGE ENTRY

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of data base/segment edge for
                                      this segment
 3              3       Binary        RBA of data base/segment edge for
                                      the parent segment of this segment
                                      in this hierarchy
 6              1       Binary        Processing options for this segment.
                                        Bit 0          unused
                                        Bits 1-3=000   G
                                                 001   I
                                                 010   R
                                                 011   D
                                                 100   A
                                                 101   L
                                        Bit 4=1        E
                                        Bit 5=1        S
                                        Bit 6=1        P
                                        Bit 7=1        K
 
 
 TABLE 21
 DATA BASE/SEGMENT EDGE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of the next data base/segment
                                      edge for this data base
 3              3       Binary        RBA of the next data base/segment
                                      edge for this segment
 6              3       Binary        RBA of the LAT entry for the data
                                      base participating in this
                                      relationship
 9              3       Binary        RBA of the LAT entry for the segment
                                      participating in this relationship
 12             1       Binary        Pointers used:
                                        Bit 0=0        SNGL
                                              1        DBLE
                                        Bit 1=0        VIRTUAL
                                              1        PHYSICAL
                                        Bits 2-4=001   HIER
                                                 010   HIERBWD
                                                 011   TWIN
                                                 100   TWINBWD
                                                 101   NOTWIN
                                                 110   LTWIN
                                                 111   LTWINBWD
                                        Bit 5=1        LPARENT
                                        Bit 6=1        counter present
                                        Bit 7=1        paired
 13             3       Binary        RBA of the LAT entry for the parent
                                      of this segment in this data base
                                      hierarchy
 16             3       Binary        RBA of the HOJ entry containing
                                      LCHILD data for this segment
 19             3       Binary        RBA of the HOJ entry for the data
                                      set group in which this segment
                                      belongs
 22             3       Binary        RBA of the HOJ entry containing the
                                      logical parent data, if applicable
 25             4       Binary        Frequency of occurrence of this
                                      segment in hundredths
 29             1       Character     The insert rule for this segment
 30             1       Character     The delete rule for this segment
 31             1       Character     The replace rule for this segment
 32             1       Character     The nonunique sequence where rule
                                      for this segment
 
 
 TABLE 22
 SEGMENT/FIELD EDGE

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of the next segment/field edge
                                      for this segment
 3              3       Binary        RBA of the next segment/field edge
                                      for this field
 6              3       Binary        RBA of the LAT entry for the segment
                                      node participating in this
                                      relationship
 9              3       Binary        RBA of the LAT entry for the field
                                      node participating in this
                                      relationship
 12             1       Binary        Field type. Possible codes:
                                        Bits 0-1=10   unique key field
                                                 11   multiple-valued key
                                                      field
                                                 00   not key field
 13             3       Binary        RBA of HOJ security records for
                                      this field
 16             3       Binary        RBA of HOJ XDFLD records, if
                                      applicable
 19             2       Binary        The relative field position of this
                                      field in this segment
 21             3       Binary        RBA of HOJ default value or
                                      derivable field data, if applicable
 24             1       Binary        Secondary index XDFLD constant value
 25             1       Binary        Secondary index NULLVAL value
 26             8       Character     Name of secondary index exit routine
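
 To show how the displacements in Table 22 map onto a record image, the
 following sketch unpacks a segment/field edge into named values. It is
 illustrative only. The function and key names are hypothetical, and two
 things are assumed rather than taken from the tables: that binary values
 are stored high-order byte first, and that character data is EBCDIC
 (decoded here with Python's cp037 codec).

```python
KEY_KINDS = {0b10: "unique key", 0b11: "multiple-valued key", 0b00: "not key"}


def rba(buf, offset):
    """Read a three-byte binary RBA stored high-order byte first (assumed)."""
    return int.from_bytes(buf[offset:offset + 3], "big")


def parse_segment_field_edge(rec):
    """Unpack a segment/field edge record using the Table 22 displacements."""
    return {
        "next_for_segment": rba(rec, 0),
        "next_for_field": rba(rec, 3),
        "segment_lat": rba(rec, 6),
        "field_lat": rba(rec, 9),
        # Bits 0-1 are taken to be the two high-order bits of byte 12.
        "key_kind": KEY_KINDS.get(rec[12] >> 6),
        "security_hoj": rba(rec, 13),
        "xdfld_hoj": rba(rec, 16),
        "position": int.from_bytes(rec[19:21], "big"),
        "default_hoj": rba(rec, 21),
        "xdfld_constant": rec[24],
        "nullval": rec[25],
        # EBCDIC is an assumption about how character fields are stored.
        "exit_routine": rec[26:34].decode("cp037").rstrip(),
    }
```

 The other edge layouts could be unpacked the same way by substituting
 their own displacements.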
 
 
 HOJ RECORDS 
 
 The HOJ data set is a VSAM/ESDS file that is used as a secondary 
 storage area for infrequently needed or variable-length data. The 
 thirty-eight byte logical records are stored in 512 byte control intervals. 
 All five of the node entities make use of the HOJ table as well as several 
 of the edge entries. Textual descriptions, edit information, and default 
 values are examples of the types of data stored in the HOJ table. Each 
 logical record has room for thirty-four bytes of data and four bytes of 
 control information. A free list, whose head is the first HOJ record, 
 is maintained in order to link all unused records together. Table 23 
 below shows the layout of a HOJ record. 
 
 TABLE 23
 SAMPLE HOJ RECORD

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Binary        Length of data portion
 1              3       Binary        Next record RBA
 4              34      Mixed         Data (variable)
 
 The data portion of each HOJ record differs depending on the type
 of record. Much of the information kept here is used to generate
 IMS control blocks; [7] can be consulted for the meaning of many of
 the fields. The different usages and layouts of the data portion of a
 HOJ entry, which may span more than one HOJ record, are shown in Tables
 24-28.
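
 Since a HOJ entry may occupy several linked records, reading one back
 means following the "next record RBA" field. A minimal sketch under
 stated assumptions follows: the records are held in a dictionary keyed
 by RBA, binary values are stored high-order byte first, and a next-RBA
 of zero ends the chain. The end-of-chain value and the names pack_hoj
 and read_hoj_entry are assumptions, not part of the described system.

```python
HOJ_RECORD_SIZE = 38   # four control bytes plus thirty-four data bytes
HOJ_DATA_SIZE = 34


def pack_hoj(data, next_rba):
    """Build one HOJ record image following the Table 23 layout."""
    if len(data) > HOJ_DATA_SIZE:
        raise ValueError("data portion exceeds 34 bytes")
    return (bytes([len(data)])
            + next_rba.to_bytes(3, "big")
            + data.ljust(HOJ_DATA_SIZE, b"\x00"))


def read_hoj_entry(records, rba, end_of_chain=0):
    """Concatenate the data portions of a chain of HOJ records.

    `records` maps RBA -> 38-byte record; `end_of_chain` is an assumed
    terminator value for the next-record RBA.
    """
    parts = []
    while rba != end_of_chain:
        rec = records[rba]
        length = rec[0]                       # displacement 0: data length
        parts.append(rec[4:4 + length])       # displacement 4: data
        rba = int.from_bytes(rec[1:4], "big")  # displacement 1: next RBA
    return b"".join(parts)
```

 For example, the 48 bytes of XDFIELD data in Table 26 would need two
 records under this scheme, the first full and the second holding the
 remaining 14 bytes.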
 
 
 TABLE 24
 RANDOMIZING MODULE HOJ DATA

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              8       Character     Module name
 8              4       Binary        Number of root anchor points
 12             4       Binary        Maximum relative block number
 16             4       Binary        Maximum number of bytes of a data
                                      base record stored in the root
                                      addressable area
 
 TABLE 25
 EDIT/VERIFICATION HOJ DATA

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0                      Binary        Type of verification. Possible codes:
                                        Bit 1=1   range of possible values
                                        Bit 2=1   list of possible values
 1                      Binary        Number of values
 3                                    The two values or the list of
                                      possible field values. The format
                                      and length match the field
                                      characteristics
 
 
 TABLE 26
 XDFIELD HOJ DATA

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of source segment
 3              3       Binary        RBA of search field one
 6              3       Binary        RBA of search field two
 9              3       Binary        RBA of search field three
 12             3       Binary        RBA of search field four
 15             3       Binary        RBA of search field five
 18             3       Binary        RBA of subsequence field one
 21             3       Binary        RBA of subsequence field two
 24             3       Binary        RBA of subsequence field three
 27             3       Binary        RBA of subsequence field four
 30             3       Binary        RBA of subsequence field five
 33             3       Binary        RBA of duplicate data field one
 36             3       Binary        RBA of duplicate data field two
 39             3       Binary        RBA of duplicate data field three
 42             3       Binary        RBA of duplicate data field four
 45             3       Binary        RBA of duplicate data field five
 
 
 TABLE 27
 DATA SET GROUP HOJ DATA

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              6       Character     Data set group name
 6              8       Character     DDNAME one
 14             8       Character     DDNAME two or overflow DDNAME
 22             2       Binary        Block factor one
 24             2       Binary        Block factor two
 26             2       Binary        Size/record length one
 28             2       Binary        Size/record length two
 30             1       Binary        Scan limit
 31             1       Binary        Free block frequency factor
 32             1       Binary        Free space percentage factor
 33             1       Binary        Model and device type.
                                      Possible codes:
                                        Bit 0=1   2314
                                        Bit 1=1   2305
                                        Bit 2=1   2319
                                        Bit 3=1   3330
                                        Bit 4=1   3340
                                        Bit 5=1   2400
                                        Bit 6=1   3400
                                        Bit 7=0   2305 model 1 or
                                                  3330 model 1
                                              1   2305 model 2 or
                                                  3330 model 11
 
 
 TABLE 28
 LOGICAL CHILD HOJ DATA

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              3       Binary        RBA of data base/segment edge for
                                      the logical child segment
 3              1       Binary        Pointer type. Possible codes:
                                        Bit 0=1   SNGL
                                        Bit 1=1   DBLE
                                        Bit 2=1   NONE
                                        Bit 3=1   INDX
                                        Bit 4=1   SYMB
 4              3       Binary        RBA of data base/segment edge for
                                      paired segment
 7              3       Binary        RBA of segment/field edge for the
                                      index field
 10             1       Character     Insert rules, either "F", "H", "L"
 
 In addition to the above uses of the HOJ records, there are 
 three other uses that do not lend themselves to tabular description. 
 They are textual descriptions, security records, and the default value/ 
 derivable field records. In the textual records the character string 
 description is packed into as few records as possible. Security records 
 relate to a segment/field edge entry and are made up of a series of four 
 byte entries. Each entry gives the RBA of a program node corresponding 
 to a program that has access to the field and an indication as to the 
 type of access. If the record is a default value entry, then it contains 
 the default value of a field packed into as few HOJ records as possible.
 For derivable fields, the module name and the RBA pointer to the
 argument(s) passed to that module are stored.
 
 LAT RECORDS 
 
 The LAT data set is a VSAM/ESDS file which has a record length 
 of four bytes, containing only a type byte and the RBA of a node record. 
 This allows modification of the placement of the node records without 
 changing all pointers to these entries, since all pointers point through 
 the LAT. Only this single RBA need be updated. Table 29 below describes 
 the layout of the LAT record. 
 
 TABLE 29
 LAT RECORD

 Decimal Dis-   Field   Data Format   Content
 placement      Size

 0              1       Binary        Type of entry pointed at.
                                      Possible codes:
                                        Bit 1=1      field
                                        Bit 2=1      segment
                                        Bit 3=1      data base
                                        Bit 4=1      program
                                        Bit 5=1      system
                                        Bit 6=1      generic head
                                        Bits 0-7=0   free
 1              3       Binary        RBA of entry pointed at
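
 The benefit of pointing through the LAT can be shown with a short
 sketch. It assumes that bit 0 is the most significant bit of the type
 byte and that the RBA is stored high-order byte first; the names
 make_lat, entry_kind, and resolve are hypothetical, and the LAT is
 modeled as a dictionary keyed by LAT RBA.

```python
# Bit positions of the type byte, as listed in Table 29.
LAT_TYPES = {1: "field", 2: "segment", 3: "data base",
             4: "program", 5: "system", 6: "generic head"}


def make_lat(kind_bit, node_rba):
    """Build a four-byte LAT record: type byte, then a three-byte RBA."""
    return bytes([0x80 >> kind_bit]) + node_rba.to_bytes(3, "big")


def entry_kind(entry):
    """Name the kind of node a LAT record points at, or 'free'."""
    if entry[0] == 0:
        return "free"
    for n, name in LAT_TYPES.items():
        if entry[0] & (0x80 >> n):
            return name
    return None


def resolve(lat, lat_rba):
    """Follow a LAT RBA (as stored in an edge) to the node record's RBA."""
    return int.from_bytes(lat[lat_rba][1:4], "big")


# An edge stores only the LAT RBA (0 here), never the node RBA itself.
lat = {0: make_lat(2, 512)}
before = resolve(lat, 0)
lat[0] = make_lat(2, 1024)   # the node record moves; only the LAT changes
after = resolve(lat, 0)
```

 Every edge that holds LAT RBA 0 now reaches the relocated node without
 itself being rewritten.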
 
 
 REFERENCES 
 
 1. Cohen, Leo J. Data Base Management Systems: A Critical and
    Comparative Analysis. Performance Development Corporation, Trenton,
    New Jersey, 1973.

 2. Nerad, Richard A. "Data Administration as the Nerve Center of a
    Company's Computer Activity," Data Management, vol. 11, no. 10,
    October 1973, 26-31.

 3. "The Data Dictionary/Directory Function," EDP Analyzer, vol. 12,
    no. 11, November 1974, 1-13.

 4. Information Management System Virtual Storage (IMS/VS), General
    Information GH20-1260, IBM Corp., White Plains, New York, March
    1974.

 5. Information Management System Virtual Storage (IMS/VS), Application
    Programming Reference Manual SH20-9026, IBM Corp., White Plains,
    New York, August 1974.

 6. OS/VS Virtual Storage Access Method (VSAM), Programmer's Guide
    GC26-3818, IBM Corp., White Plains, New York, May 1973.

 7. Information Management System Virtual Storage (IMS/VS), Utilities
    Reference Manual SH20-9029, IBM Corp., White Plains, New York,
    August 1974.
 
 BIBLIOGRAPHIC DATA SHEET

 1. Report No.: UIUCDCS-R-76-798

 3. Recipient's Accession No.:

 4. Title and Subtitle: The Precompiler Component of a Data Base
    Dictionary System

 5. Report Date: May 1976

 6.

 7. Author(s): Michael Jason Huggins

 8. Performing Organization Rept. No.: UIUCDCS-R-76-798

 9. Performing Organization Name and Address:
    Department of Computer Science
    University of Illinois at Urbana-Champaign
    Urbana, Illinois 61801

 10. Project/Task/Work Unit No.:

 11. Contract/Grant No.:

 12. Sponsoring Organization Name and Address:
    Department of Computer Science
    University of Illinois at Urbana-Champaign
    Urbana, Illinois 61801

 13. Type of Report & Period Covered: Master of Science Thesis

 14.

 15. Supplementary Notes:

 16. Abstracts

 With the advent of large, general purpose data base systems, several desirable
 information processing theories have now been implemented. These include advances
 in the areas of data independence, data sharing, data security, and control. While
 facilities to take advantage of these concepts have been implemented to varying
 degrees, much of the control needed to administer their use is not inherent in
 the data base software itself. To meet this need, the role of data base administration
 has emerged. While data base administration is finding its place in data processing
 structures, much work is being done to provide it with the tools needed to manage
 and control the data.

 17. Key Words and Document Analysis. 17a. Descriptors

    Data Dictionary
    Precompiler

 17b. Identifiers/Open-Ended Terms

 17c. COSATI Field/Group

 18. Availability Statement: Release Unlimited

 19. Security Class (This Report): UNCLASSIFIED

 20. Security Class (This Page): UNCLASSIFIED

 21. No. of Pages: 79

 22. Price:

 FORM NTIS-35 (10-70)                                  USCOMM-DC 40329-P71
 