Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/storageallocatio297mura 
 

 Report No. 297

 January 13, 1969

 STORAGE ALLOCATION ALGORITHMS
 IN THE TRANQUIL COMPILER*

 by

 Yoichi Muraoka
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 61801

 * This work was supported in part by the Advanced Research Projects
 Agency as administered by the Rome Air Development Center under
 Contract No. US AF 30(602)4144 and submitted in partial fulfillment
 of the requirements for the degree of Master of Science in Computer
 Science, February, 1969.
 
 ACKNOWLEDGEMENT 
 
 The author would like to express his most sincere gratitude 
 to Professor Robert S. Northcote, Department of Computer Science of 
 the University of Illinois, who graciously offered many suggestions 
 and comments. Furthermore, without his criticism of the presenta- 
 tion, this thesis would probably be totally unreadable. 
 
 The author is also indebted to Professor David J. Kuck, 
 who originated TRANQUIL and provided helpful criticism and sugges- 
 tions throughout this work. 
 
 Thanks are also extended to Mrs. Patricia Stippes, who 
 typed the manuscript in its final form. 
 
 
 
 ABSTRACT 
 
 TRANQUIL is a language for describing algorithms in terms 
 of parallel constructs. Its compiler is now being implemented for 
 the parallel array computer ILLIAC IV. This paper discusses a 
 particular part of the implementation; namely, the problem of 
 storage allocation for arrays. 
 
 
TABLE OF CONTENTS

 1. INTRODUCTION

 2. DATA STRUCTURES IN TRANQUIL

 2.1 Data Declarations

 2.2 Mapping Functions

 3. IMPLEMENTATION

 3.1 Introduction

 3.2 Array Partitioning

 3.3 Address Calculation

 3.4 Storage Allocation

 4. FURTHER DISCUSSION

 5. CONCLUSION

 APPENDIX

 A. SYNTAX AND SEMANTICS SPECIFICATION OF TRANQUIL DECLARATIONS

 1. Declarations

 2. Variable Declaration

 3. Array Declaration

 4. PEM Reserve Declaration

 5. PEM Assignment Declaration

 B. TABLES

 C. ARRAY PARTITIONING AND PACKING FLOWCHARTS

 D. STORAGE ALLOCATION PACKAGES

 LIST OF REFERENCES
 
 
 LIST OF FIGURES

 Figure

 1. The Standard Storage Schemes

 2. Partitioning for Array A [1:3, 1:300, 1:300]

 3. Block Packing for Array A [1:3, 1:300, 1:300]

 4. Use of PE Memory for the Blocks Shown in Figures 2 and 3

 5. BASETB Format 1

 6. BASETB Format 2

 7. Examples of BASETB Entries

 8. One Subarray Generated from Array A [#1:10, ##1:10, *1:3, **1:5]

 9. SLIST

 A1. Subarrays for an Array A [#5, ##4, *32, **64]

 B1. Entries and Linkage of Tables for A[1:M_1, 1:M_2, ..., 1:M_n]

 C1. Pass 2 Program Block Entry and Block Exit Flowcharts for Array Declarations

 C2. Array Partitioning Flowchart

 C3. Residual Block Packing Flowchart

 D1. Table and List Entry Formats

 D2. Example of an Entry for I4MEMORY

 D3. Example of an Entry for VLIST
 

 
1. INTRODUCTION
 
 Familiarity with the structure of the ILLIAC IV computer
 [1] and the TRANQUIL language [2] will, in general, be assumed
 throughout this paper. However, a few of the characteristics of
 ILLIAC IV which are important to the development of this paper will
 now be given.

 The most important feature of ILLIAC IV is that many
 simple identical processing elements (PEs) obey instructions which
 are decoded by one common control unit (CU). Each PE can receive 
 common data which is broadcast from its CU, but it also has access 
 to data in its own 2K word memory (PEM). Thus, although executing 
 identical decoded control signals from a CU, every PE can operate 
 upon different data. 
 
 Each PE has the option of playing either an active or a 
 passive role during an instruction cycle on the basis of its own 
 state, which is determined by mode control. Also, each PE is 
 connected to its four nearest neighbors, thus permitting the routing 
 of data from one PE to another. 
 
 If ILLIAC IV is to be used efficiently, it is essential 
 that data be stored "evenly" in PEM's so that, on receiving control 
 signals from a CU, as many PE's as possible can operate on their 
 own data, i.e., the data which are stored in their own PEM. 
 Further, if it should be necessary for some PE to use data in
 another PEM, the routing distance should be as small as possible.
 
TRANQUIL, as a language, has been designed to assist in the 
 programming of primarily algebraic numerical computations. Among 
 many existing computer languages, ALGOL, FORTRAN, PL/I, and APL [3]
 also fall into this category. The prime concern here is to decide
 what kind of data types (information structures other than simple
 variables) should be included in the language. According to
 Knuth [4], information structures can be categorized as follows:

 (i) linear list
 (ii) tree
 (iii) multilinked structure.

 In the languages mentioned above, however, not all of these structures
 are provided explicitly. Arrays are usually the only information
 structure which is provided (although multilinked structures are
 allowed in PL/I). Other information structures such as the tree
 structure are implemented by utilizing arrays [3]. TRANQUIL also
 has arrays as the only structured data type in the language. The 
 other types of data structures were not included because ILLIAC IV 
 has been designed as an array machine and computations on arrays are 
 its primary function. 
 
 In the following chapters, the problem of storage allocation
 in the TRANQUIL compiler is discussed: namely, how mapping
 functions for arrays are specified and how the TRANQUIL compiler
 handles them.
 
2. DATA STRUCTURES IN TRANQUIL 
 
 2.1 Data Declarations 
 
 The data structures which are recognized in TRANQUIL are 
 simple variables, arrays and sets. (For further discussion on sets, 
 the reader is referred to [8].) All data structures, as well as 
 labels, switches, and procedures must be declared in some block head 
 as in ALGOL. The data type attributes are INTEGER , REAL , COMPLEX 
 and BOOLEAN . Certain precision attributes also may be specified. 
 Array declarations must include attributes which specify type (as 
 above) and storage scheme (i.e., mapping function), in addition to 
 size specifications which follow the same format as those in PL/I; e.g.,
 
 REAL SKEWED ARRAY A [1:50, 1:50]. 
 
 As in ALGOL, the size of an array may be specified dynamically in 
 inner blocks. 
 
 A complete syntax specification and informal semantics for
 declarations are given in Appendix A.
 
 2.2 Mapping Functions 
 
 As was implicitly mentioned in the introduction, the 
 efficiency of a program is highly dependent on the choice of mapping 
 functions for arrays. Hence, TRANQUIL should provide users with a 
 means of specifying mapping functions which are both simple to use 
 and yet capable of specifying quite complicated storage schemes. 
 
Several conventions for the specification of mapping functions are 
 provided in TRANQUIL. These include both predefined standard storage 
 schemes and a mechanism to enable users to specify their own scheme. 
 Before investigating these schemes we discuss the partitioning of 
 arrays, and the use of the PE memories as a two-dimensional array. 
 ILLIAC IV memory may be regarded as a 2048 X 256 array:
 (number of words in a PEM) X (number of PEM's). Hence, it is
 desirable to reduce all n-dimensional arrays (n > 2) to a set of 
 2-dimensional subarrays. Rows of these subarrays are usually stored 
 in a row of ILLIAC IV memory: i.e., across PEM's. As an example, if 
 the 3-dimensional array A is declared: 
 
 REAL STRAIGHT ARRAY A [1:3, 1:100, 1:200], 
 
 then the compiler will treat this as a set of three subarrays each 
 of size 100 X 200. Note that the size of the subarrays to be formed 
 is, in general, determined by the last two dimensions. 
 
 In some cases it may be desirable to change this rule; 
 i.e., form subarrays from dimensions other than the last two. To 
 satisfy this requirement the following convention has been adopted. 
 One asterisk placed in front of any dimension size specification in 
 an array declaration forces that dimension to be stored in a column 
 (i.e., one PEM) of ILLIAC IV memory, while two asterisks indicate 
 row storage (i.e., across PEM's). Thus, 
 
 REAL STRAIGHT ARRAY A [**1:100, *1:200, 1:3]

forces the compiler to form three subarrays of size 200 X 100.
 Furthermore, 
 
 REAL ARRAY A [*1:50] 
 
 causes the vector A to be stored in one PEM. 
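
The subarray rule described above can be sketched in Python (not TRANQUIL). The function name `subarray_shape` and the encoding of a declaration as a list of (marker, size) pairs are hypothetical and are introduced only for illustration; the sketch covers the cases mentioned in the text and nothing more.

```python
from math import prod

def subarray_shape(dims):
    """Return (count, rows, cols) for a declaration.

    dims is a list of (marker, size) pairs. The marker is "" (unmarked),
    "*" (stored within one PEM, i.e. down a memory column) or
    "**" (stored across PEMs, i.e. along a memory row).
    This encoding is an assumption made for the sketch.
    """
    down = [k for k, (m, _) in enumerate(dims) if m == "*"]
    across = [k for k, (m, _) in enumerate(dims) if m == "**"]
    if not down and not across:
        if len(dims) < 2:
            raise ValueError("an unmarked array needs two dimensions here")
        # Default rule: the last two dimensions form the subarray.
        down, across = [len(dims) - 2], [len(dims) - 1]
    if len(down) > 1 or len(across) > 1:
        raise ValueError("at most one * and one ** in this sketch")
    rows = dims[down[0]][1] if down else 1
    cols = dims[across[0]][1] if across else 1
    used = set(down + across)
    # Every remaining dimension multiplies the number of subarrays.
    count = prod(size for k, (_, size) in enumerate(dims) if k not in used)
    return count, rows, cols

print(subarray_shape([("", 3), ("", 100), ("", 200)]))     # A [1:3, 1:100, 1:200]
print(subarray_shape([("**", 100), ("*", 200), ("", 3)]))  # A [**1:100, *1:200, 1:3]
print(subarray_shape([("*", 50)]))                         # A [*1:50]
```

Under these assumptions the three declarations from the text give three 100 X 200 subarrays, three 200 X 100 subarrays, and a single 50 X 1 column held in one PEM.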
 
 Figure 1. The Standard Storage Schemes
 
 The standard, more commonly used, methods of storage are: 
 
 STRAIGHT , SKEWED and CHECKER . 
 
 These schemes are illustrated by the examples in Figure 1 where an 
 8x8 matrix A is stored in eight PEM's of ILLIAC IV. STRAIGHT is 
 the simplest storage form for a matrix and leads to the simplest 
 program, but the drawback is that a column is contained entirely in 
 one PEM, thus prohibiting simultaneous access to all elements of a 
 column. SKEWED allocation, on the other hand, distributes columns, as 
 well as rows, across PEM's allowing access to an entire column in 
 parallel. Since rows and columns are equally accessible in this
 scheme, matrices can be PACKED by appropriate transposition, thus 
 minimizing wastage of memory space, as shown in Figure 1. The 
 CHECKER allocation scheme has been developed specially for storing 
 mesh type data for elliptic partial differential equations. This 
 scheme allows each PE fast access to the four nearest neighboring 
 mesh points. Further discussion on storage schemes for matrix type 
 operations is found in [5]. The CHECKER scheme is discussed in [6].
 Mapping functions are applied to the aforementioned subarrays. Thus, 
 for example, skewing is done on each of three subarrays of size 
 200 X 100 belonging to the array A which is declared 
 
 REAL SKEWED ARRAY A [**1:100, *1:200, 1:3]. 
 
 It is feasible to include new mapping functions in this list, but it
 is anticipated that most users' needs will be satisfied by this set
 of standard mapping functions. 
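
The difference between STRAIGHT and SKEWED storage can be illustrated with a small Python sketch for the 8 X 8 case of Figure 1. It assumes that element A[i, j] goes to PEM (j - 1) mod 8 under STRAIGHT and to PEM (i + j - 2) mod 8 under SKEWED, i.e., the relative-address rules of Section 3.3 with 256 replaced by 8; the function names are hypothetical.

```python
N = 8  # an 8 x 8 matrix stored in eight PEMs, as in Figure 1

def pem_straight(i, j):
    # STRAIGHT: the column index alone selects the PEM.
    return (j - 1) % N

def pem_skewed(i, j):
    # SKEWED: each row is shifted one PEM further than the row above.
    return (i + j - 2) % N

def pems_for_column(pem_of, j):
    # Set of PEMs that hold the elements of column j.
    return {pem_of(i, j) for i in range(1, N + 1)}

def pems_for_row(pem_of, i):
    # Set of PEMs that hold the elements of row i.
    return {pem_of(i, j) for j in range(1, N + 1)}

print(len(pems_for_column(pem_straight, 3)))  # PEMs touched by column 3, STRAIGHT
print(len(pems_for_column(pem_skewed, 3)))    # PEMs touched by column 3, SKEWED
print(len(pems_for_row(pem_skewed, 3)))       # PEMs touched by row 3, SKEWED
```

With these rules a STRAIGHT column sits in a single PEM, while a SKEWED row or column is spread over all eight PEMs and can therefore be fetched in parallel.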
 
 Finally, the user who wishes to specify his own mapping
 function may make use of a PE memory assignment statement. For
 example:

 PEMEMORY PB [1:10, 1:256];
 PEM FOR (I, J) SIM ([1, 2, ..., 10] X [1, 2, ..., 256]) DO
 PB [I, J] ← B [I, MOD (256, I + J - 1)];
 REAL ARRAY (PB) B [1:10, 1:256];
 
 establishes virtual space of size 10 X 256 in PE memory, and then 
 stores a 10 X 256 array B there in skewed form. Thus, instead of 
 making up the aforementioned subarrays out of an array declaration, 
 space reserved in PE memory may be used. In the program, the pro- 
 grammer refers to an element of memory space via the assigned array 
 name B and its subscripts, as usual. 
 
 It should be noted that storage mapping functions cannot
 be specified dynamically. Should remapping of data be required, an
 explicit assignment statement may be used; e.g., to change the data
 in an array A from skewed to straight storage, an assignment statement

 B ← A

 is used, where B has been declared to be a straight array.
 
3. IMPLEMENTATION
 
 3.1 Introduction 
 
 The TRANQUIL compiler currently has two passes. In pass 1, 
 on recognizing declarations, the compiler enters the necessary 
 descriptive information; e.g., in the case of array declarations, 
 the attributes, type of mapping function, and size and dimension 
 information for the array, into a table (IDTAB). Segmentation of 
 data, i.e., partitioning of arrays, and storage allocation are taken 
 care of in pass 2. Descriptions and formats of the tables used are 
 given in Appendix B. 
 
 Many computational problems lead to programs which
 require more working storage than is available. This is
 particularly true for programs which will be run on ILLIAC IV,
 because the size of each PEM is relatively small compared with the
 computational speed of a PE. For example, multiplication of two
 256 X 256 matrices takes only 70 msec on ILLIAC IV, but almost half
 the memory is needed to accommodate three of these arrays. Thus
 a strategy for segmenting arrays, and controlling the overlaying of 
 segments, must be devised for the TRANQUIL compiler. The segmentation 
 of object programs will not be considered here. Our main concern is 
 how to deal with large arrays. 
 
 
 3.2 Array Partitioning 
 
 According to Randell and Kuehner [7], two characteristics 
 believed to be most useful for revealing the functional capability
 and underlying mechanisms of current hardware-assisted dynamic 
 storage allocation systems are related to the concepts of name space 
 and predictive information. A discussion of these characteristics 
 will now be given. 
 
 (i) Name Space 
 
 On conventional computers, memory is always treated as a 
 linear space. Array elements must be stored as a linear array or 
 vector; e.g., matrices are stored in order by rows, and a row of an 
 array frequently forms a segment [7]. A typical size for a segment
 is 1024 words (B5500), and this limitation is reflected in the fact 
 that the maximum size vector that can be declared is 1024 words. 
 
 ILLIAC IV memory, however, may be regarded as a
 two-dimensional space, i.e., an array with 2048 rows (number of words in
 a PEM) X 256 columns (number of PEM's). This suggests the use of a
 two-dimensional segment, to be referred to as a block. Explicitly, all
 arrays will be partitioned into two-dimensional blocks, the maximum
 size of which is 256 X 256 words. This partitioning is done on the
 subarrays which were mentioned in Chapter 2.
 
 [Figure: the subarrays A [1, *, *], A [2, *, *] and A [3, *, *], each divided into numbered blocks; the element A[2,200,100] is marked.]

 Figure 2. Partitioning for Array A [1:3, 1:300, 1:300]
 
 Figure 2 illustrates the partitioning of a three-dimensional array
 into 3 subarrays, each with 4 blocks for a total of 12 blocks.
 
 The reasons for using two-dimensional segments are as 
 follows : 
 
 (a) It is necessary to have large segments in order to reduce
 the number of I/O requests. It is especially necessary
 that all PEM's be interrupted evenly when I/O requests are
 being processed [1], i.e., a block should be formed so that
 it has elements across all PEM's (if possible) when it has
 been read into memory.
 
 (b) On the other hand, operations on arrays may always be
 performed in terms of subarrays; e.g., one can write

 \[
 \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
 +
 \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
 =
 \begin{pmatrix} A_{11} + B_{11} & A_{12} + B_{12} \\ A_{21} + B_{21} & A_{22} + B_{22} \end{pmatrix}
 \]

 for the addition of two matrices A and B provided that
 submatrices A_{ij} and B_{ij} have the same order.
 
 These two observations suggest the use of two-dimensional segments, 
 i.e., blocks. Another advantage of having two-dimensional segments 
 is that either a row or a column of a segment can be accessed, which 
 is an advantage over linear (row or column) storage. The partition- 
 ing of an array into blocks is independent of the mapping function 
 used; i.e., for a SKEWED array skewing is applied to each block 
 after partitioning. 
 
 
 One possible disadvantage which could be claimed against 
 block segmentation is that of difficulty in indexing across blocks. 
 This may, however, be remedied by again utilizing the submatrix 
 concept in matrix operations, e.g., instead of writing: 
 
 FOR (I) SIM ([1,2,...,512]) DO
 FOR (J) SIM ([1,2,...,512]) DO
 BEGIN
    C[I,J] ← 0;
    FOR (K) SEQ ([1,2,...,512]) DO
       C[I,J] ← C[I,J] + A[I,K] X B[K,J]
 END                                                  (1)

 write:

 FOR (M) SEQ ([1,2]) DO
 FOR (N) SEQ ([1,2]) DO
 BEGIN
    [M,N] C ← 0;
    FOR (L) SEQ ([1,2]) DO
       [M,N] C ← [M,N] C + [M,L] A X [L,N] B
 END                                                  (2)
 
 where [M,N] C is computed using code (1), but with indices now varying
 from 1 to 256 instead of 1 to 512, i.e.,

 FOR (I) SIM ([1,2,...,256]) DO
 FOR (J) SIM ([1,2,...,256]) DO
 BEGIN
    [M,N] C [I,J] ← 0;
    FOR (K) SEQ ([1,2,...,256]) DO
       [M,N] C [I,J] ← [M,N] C [I,J]
                       + [M,L] A [I,K] X [L,N] B [K,J]
 END
 
 It should be emphasized that the generation of code corresponding
 to (2) above from (1) is the compiler's responsibility, and programs
 written in TRANQUIL do not have to contain explicit provisions for
 segmentation. For the detailed algorithm for generating code (2)
 from code (1), the reader is referred to [8,9].

 The sizes of the blocks obtained by the array partitioning
 fall into four categories:
 
 (a) 256 X 256 
 
 (b) n X 256 n < 256 
 
 (c) 256 X m m < 256 
 
 (d) m X n m, n < 256 
 
 For the sake of discussion we call these SQUARE, HBLOCK, VBLOCK, and 
 SBLOCK, respectively. As was mentioned before, it is important to 
 form a block which has 256 columns, so that when I/O is being
 processed all PEM's will be interrupted evenly. Thus, small blocks
 belonging to the same array should, if possible, be packed together 
 to form a larger block. The details will be discussed later. 
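
The four block categories can be made concrete with a short Python sketch that cuts one subarray into blocks and names each one. The function names are hypothetical, and the row-by-row order in which blocks are listed is an assumption of the sketch rather than a statement about the compiler.

```python
BLOCK = 256  # maximum block edge, in words (rows) and PEMs (columns)

def classify(rows, cols):
    # Name a block by its size, following categories (a) to (d).
    if rows == BLOCK and cols == BLOCK:
        return "SQUARE"
    if cols == BLOCK:
        return "HBLOCK"   # n x 256, n < 256
    if rows == BLOCK:
        return "VBLOCK"   # 256 x m, m < 256
    return "SBLOCK"       # m x n, both < 256

def partition(rows, cols):
    # Cut a rows x cols subarray into blocks of at most 256 x 256.
    blocks = []
    for r0 in range(0, rows, BLOCK):
        for c0 in range(0, cols, BLOCK):
            h = min(BLOCK, rows - r0)
            w = min(BLOCK, cols - c0)
            blocks.append((h, w, classify(h, w)))
    return blocks

for block in partition(300, 300):
    print(block)
```

For a 300 X 300 subarray this yields one block of each kind: 256 X 256, 256 X 44, 44 X 256 and 44 X 44.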
 
 (ii) Predictive information 
 
 All array operations and data transfers between ILLIAC IV
 disk and PE memory are done in terms of these blocks. Blocking
 facilitates the use of arrays which are larger than the total PE
 memory. All data are normally stored on the 10^9 bit (approximately
 30 memory loads) ILLIAC IV disk, and blocks are only brought into
 PE memory (at a transfer rate of 5 X 10^8 bits/second or 8 ms/block)
 as required. The disk rotation time is 40 ms. The TRANQUIL compiler
 will automatically generate block transfer I/O requests, which are 
 handled by the operating system, to make it possible to write a 
 TRANQUIL program which includes no explicit I/O statements. All 
 data are initially stored on the disk in the same way they are used 
 on ILLIAC IV; e.g., if a skewed array is necessary on ILLIAC IV, 
 then the array is also stored in a skewed fashion on the disk before 
 the program which uses it is initiated by the operating system. 
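
The disk figures quoted above can be checked with a few lines of arithmetic. The sketch assumes a 64-bit word, which this section does not state, and takes the disk capacity as 10^9 bits and the transfer rate as 5 X 10^8 bits/second.

```python
WORD_BITS = 64            # assumption: 64-bit word, not stated in this section
PEM_WORDS = 2048          # words per PEM
NUM_PEMS = 256            # number of PEMs
DISK_BITS = 10**9         # disk capacity in bits
TRANSFER_RATE = 5 * 10**8 # disk transfer rate in bits per second

memory_bits = PEM_WORDS * NUM_PEMS * WORD_BITS
memory_loads = DISK_BITS / memory_bits     # how many full memories fit on the disk

block_bits = 256 * 256 * WORD_BITS         # one 256 x 256 block
block_ms = block_bits / TRANSFER_RATE * 1000

print(round(memory_loads, 1), "memory loads")
print(round(block_ms, 1), "ms per block")
```

Under these assumptions the disk holds roughly 30 memory loads and a full block takes a little over 8 ms to transfer, which is short compared with the 40 ms rotation time.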
 
 It is generally agreed that both preplanned and dynamic 
 storage allocation have advantages for certain types of problems. 
 Preplanned allocation should work best with more regularly predictable 
 problems, whereas dynamic storage allocation can best cope with 
 problems whose flow and storage requirements are highly data 
 dependent and therefore unpredictable. In the case of the TRANQUIL 
 compiler it has been further recognized that the methods are not 
 mutually exclusive and can work well in combination. A prescheduled
 sequence of I/O requests, for example, can be generated locally;
 i.e., in a TRANQUIL statement involving large arrays as in

 A ← B X C

 where A, B, and C are arrays of size 1024 X 1024.

 On the other hand, a dynamic storage allocation scheme is necessary
 globally; i.e., between TRANQUIL statements, and this should consider
 transfers of control. The first TRANQUIL compiler will adopt a
 simple strategy; i.e., a block on demand system utilizing the first-in
 last-out strategy.
 
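 The block number for an element A[i_1, i_2, ..., i_{n-1}, i_n] of
 an array A[1:M_1, 1:M_2, ..., 1:M_n] is given by

 \[
 \Bigl( \bigl( \cdots ((i_1 - 1) M_2 + (i_2 - 1)) M_3 + \cdots + (i_{n-2} - 1) \bigr)
 \left\lfloor \frac{M_{n-1} + 255}{256} \right\rfloor
 + \left\lfloor \frac{i_{n-1}}{256} \right\rfloor \Bigr)
 \left\lfloor \frac{M_n + 255}{256} \right\rfloor
 + \left\lfloor \frac{i_n}{256} \right\rfloor
 \]

 For example, the block number for the element A[2,200,100] in
 Figure 2 is

 \[
 \left( (2 - 1) \left\lfloor \frac{300 + 255}{256} \right\rfloor
 + \left\lfloor \frac{200}{256} \right\rfloor \right)
 \left\lfloor \frac{300 + 255}{256} \right\rfloor
 + \left\lfloor \frac{100}{256} \right\rfloor = 4
 \]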
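
The block-number computation can be sketched in Python as a nested (Horner-style) evaluation. The function name `block_number` is hypothetical. One deliberate assumption: the sketch locates the block of the last two subscripts with (i - 1) // 256, so that a subscript equal to a multiple of 256 stays in the block that contains it; for the worked example this gives the same result as the expression in the text.

```python
def block_number(index, bounds, block=256):
    """Block number of element A[index] in an array with upper bounds `bounds`.

    index and bounds are 1-based tuples of equal length n >= 2.
    """
    *outer_i, i_row, i_col = index
    *outer_m, m_row, m_col = bounds
    n = 0
    # The leading n-2 subscripts select the subarray.
    for i, m in zip(outer_i, outer_m):
        n = n * m + (i - 1)
    # Number of blocks along each edge of a subarray (rounded up).
    blocks_down = (m_row + block - 1) // block
    blocks_across = (m_col + block - 1) // block
    # Assumption: (i - 1) // block picks the block along each edge.
    n = n * blocks_down + (i_row - 1) // block
    n = n * blocks_across + (i_col - 1) // block
    return n

print(block_number((2, 200, 100), (3, 300, 300)))
print(block_number((3, 300, 300), (3, 300, 300)))
```

For the array of Figure 2 the element A[2,200,100] falls in block 4, and the last element A[3,300,300] falls in block 11, the last of the twelve blocks.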
 
 
 For each block thus established an entry is made in BASETB. This, 
 however, does not necessarily imply that a segment is established 
 for each block. The blocks smaller than 256 X 256, i.e., HBLOCKs, 
 VBLOCKs, and SBLOCKs may be packed together (they are then called 
 subblocks) to form a larger block (called a superblock) . It should 
 be noted that the terms HBLOCK, VBLOCK, SBLOCK and SQUARE only refer 
 to the size of a block as previously defined. Thus it is possible, 
 for example, to talk about a superblock which is a HBLOCK. 
 
 The entry in BASETB gives an absolute base address (the 
 address of the upper left-most element) of the block, if the block 
 is in memory, or the address relative to the base address of the 
 corresponding superblock and a pointer to the BASETB entry of that 
 superblock in the case of a subblock. The above addresses are each
 specified by a PEM word number (X) together with a PEM number (Y).
 Thus if (x, y) and (X, Y) are the relative address of a subblock and the
 base address of the corresponding superblock, respectively, then the base
 address of the subblock is (x + X, y + Y).
 
 As an example, in Figure 2 there are: 
 
 3 SQUARES (blocks 0, 4 and 8),
 3 HBLOCKs (blocks 2, 6 and 10),
 3 VBLOCKs (blocks 1, 5 and 9), and
 3 SBLOCKs (blocks 3, 7 and 11).

 Twelve entries are established in BASETB corresponding to these
 blocks. The 3 VBLOCKs are packed into a 256 X 144 superblock
 (Figure 3).
 
 [Figure: three panels, "Packing of VBLOCKs", "Packing of HBLOCKs" and "Packing of SBLOCKs".]

 Figure 3. Block Packing for Array A[1:3, 1:300, 1:300]
 
 The relative address, in this superblock, of block 10 is
 (0, 96). The three HBLOCKs are packed into a 132 X 256 superblock
 and the three SBLOCKs are stored in a 44 X 144 superblock as shown.

 Further, suppose these blocks are stored in PE memory as
 shown in Figure 4.
 
 [Figure: PE memory map with axes "PE Word No." (marked at 256, 512, 768, 1024 and 1156) and "PE No.".]

 Figure 4. Use of PE Memory for the Blocks Shown in
 Figures 2 and 3.
 
 Entries for BASETB are made after packing. 
 
 | P | X | Y | SIZEX | SIZEY |

 Figure 5. BASETB Format 1

 The BASETB entry format for a superblock or SQUARE is
 shown in Figure 5, where

 P indicates that this is an entry for a superblock or SQUARE;
 (X,Y) is the absolute base address of this block;
 SIZEX is the number of rows of this block;
 SIZEY is the number of columns of this block.

 | C | ORIGIN | COX | COY |

 Figure 6. BASETB Format 2

 Figure 6 shows the BASETB entry format for a subblock,
 where

 C indicates that this is an entry for a subblock;
 ORIGIN is the pointer to the BASETB entry of the corresponding superblock;
 (COX, COY) is the relative address of the subblock.
 
 For example, the BASETB entries for blocks 2 and 10 in
 Figure 4 are:

 2:  | P | 768 | 0 | 256 | 144 |

 10: | C | 2 | 0 | 96 |

 Figure 7. Examples of BASETB Entries

 Thus, the absolute base address for block 10 is given by (X_10, Y_10),
 where

 X_10 = BASETB [10] . COX + BASETB [BASETB [10] . ORIGIN] . X
      = 0 + 768 = 768

 and

 Y_10 = BASETB [10] . COY + BASETB [BASETB [10] . ORIGIN] . Y
      = 96 + 0 = 96
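
The two BASETB formats and the base-address lookup can be modelled with a minimal Python sketch. The class names `SuperEntry` and `SubEntry`, the function `absolute_base` and the use of a dictionary for the table are hypothetical; only the field names and the two example entries come from the text.

```python
from dataclasses import dataclass

@dataclass
class SuperEntry:
    """Format 1 (P): a superblock or SQUARE with an absolute base address."""
    x: int        # PEM word number of the base address
    y: int        # PEM number of the base address
    size_x: int   # number of rows
    size_y: int   # number of columns

@dataclass
class SubEntry:
    """Format 2 (C): a subblock addressed relative to its superblock."""
    origin: int   # BASETB index of the superblock entry
    cox: int      # relative PEM word number
    coy: int      # relative PEM number

def absolute_base(basetb, block):
    # Resolve a block number to an absolute (word, PEM) base address.
    entry = basetb[block]
    if isinstance(entry, SubEntry):
        parent = basetb[entry.origin]
        return (entry.cox + parent.x, entry.coy + parent.y)
    return (entry.x, entry.y)

# The two entries of Figure 7.
basetb = {2: SuperEntry(768, 0, 256, 144), 10: SubEntry(2, 0, 96)}
print(absolute_base(basetb, 10))
print(absolute_base(basetb, 2))
```

Block 10 resolves through its ORIGIN field to the entry for block 2, giving the base address (768, 96) computed above.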
 
 The procedure above generates, for example, 100 subarrays
 and 100 entries in BASETB for an array

 A[1:10, 1:10, 1:3, 1:5]

 in spite of the fact that the subarrays might be packed and generate
 only one superblock. An array

 A[#1:10, ##1:10, *1:3, **1:5],

 on the other hand, generates only 1 entry in BASETB. This is
 because * and ** together with # and ## force the compiler to
 generate a single subarray of size 30 X 50 (Figure 8). Since
 partitioning is done in terms of this subarray, only one block is
 generated, making only one entry in BASETB. Thus if *'s and #'s are
 used effectively, more efficient code can be generated.
 
 Figure 8. One Subarray Generated from Array
 A [#1:10, ##1:10, *1:3, **1:5]
 
 Should VBLOCKs or SBLOCKs result from partitioning, their
 sizes will be modified so that they have

 \[ 8 \left\lfloor \frac{n + 7}{8} \right\rfloor \]

 columns, where n is the number of columns of the original block.
 This is done in order that each block be stored beginning at the
 p-th PEM, where p mod 8 = 0, to make efficient use of the 8 word
 CU fetch capabilities.
 
 Blocks will be packed after the above mentioned modification 
 has been made. Packing is done in two stages. First, packing is 
 done for the same kind of blocks (e.g., VBLOCKs are packed by 
 themselves) belonging to the same array. For example, VBLOCKs are 
 arranged side by side until the number of columns of the resultant 
 superblock reaches 256. A similar procedure is used for packing 
 HBLOCKs, i.e., they are stacked so that the resultant superblock has 
 up to 256 rows. In the case of SBLOCK packing, the blocks are first
 treated as VBLOCKs (i.e., they are arranged side by side). Further,
 if the resultant superblock is a HBLOCK, then HBLOCK packing is done.
 It should be noted that even after the packing it is possible to have
 residual blocks which cannot be packed, or a superblock which is
 either a VBLOCK or an SBLOCK. These are the objects of further
 packing, which is discussed in the following section. HBLOCKs will
 not be considered in further packing and will be treated as SQUARES.
 
 A detailed flowchart for partitioning and packing is given 
 in Appendix C. 
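
The first packing stage for VBLOCKs can be sketched in Python as follows. The function names are hypothetical, and the simple left-to-right placement that starts a new superblock when the next block would exceed 256 columns is an assumption of the sketch, not the flowchart of Appendix C.

```python
def padded_columns(n):
    # Round a column count up to the next multiple of 8.
    return 8 * ((n + 7) // 8)

def pack_vblocks(widths, limit=256):
    """Place VBLOCKs side by side.

    widths lists the original column counts of the blocks.
    Returns a list of superblocks, each as (placements, total_columns),
    where placements holds (block_position, column_offset) pairs.
    """
    superblocks = []
    current, used = [], 0
    for k, n in enumerate(widths):
        w = padded_columns(n)
        if current and used + w > limit:
            # The next block does not fit: close this superblock.
            superblocks.append((current, used))
            current, used = [], 0
        current.append((k, used))
        used += w
    if current:
        superblocks.append((current, used))
    return superblocks

print(padded_columns(44))
print(pack_vblocks([44, 44, 44]))
```

Three 44-column VBLOCKs are widened to 48 columns each and land at column offsets 0, 48 and 96 of a single 144-column superblock, which agrees with the 256 X 144 superblock and the relative address (0, 96) in the example above.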
 
 
 3.3 Address Calculation 
 
 The effective address of any array element is established 
 by computing its block number, which was discussed in the previous 
 section, and a relative address within that block. The computation 
 of the relative address varies from one mapping function to another. 
 Here only standard mapping functions are discussed. 
 
 For an element A[i_1, i_2, ..., i_{n-1}, i_n] of an array
 A[1:M_1, ..., 1:M_n], the relative PE number and the relative PE
 word address of the element in the specified block are given as
 follows, where x will denote PEM word address and y will denote PEM
 number:

 (i) STRAIGHT array

 \[ x = (i_{n-1} - 1) \bmod 256, \qquad y = (i_n - 1) \bmod 256 \]

 (ii) SKEWED array

 \[ x = (i_{n-1} - 1) \bmod 256, \qquad y = (i_{n-1} + i_n - 2) \bmod 256 \]
 
 For example, consider again the array in Figure 2. The
 block number for the element A[2, 200, 100] was 4. If the array is
 skewed, the PE number relative to the base address of this block
 is (200 + 100 - 2) mod 256 = 42 and the relative address in this
 PEM is 199.
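
The two relative-address rules translate directly into a short Python sketch. The function name `relative_address` and the string argument naming the scheme are hypothetical.

```python
def relative_address(i_row, i_col, scheme, block=256):
    """Return (x, y): PEM word address and PEM number relative to the block base.

    i_row and i_col are the last two subscripts (1-based) of the element.
    """
    x = (i_row - 1) % block
    if scheme == "STRAIGHT":
        y = (i_col - 1) % block
    elif scheme == "SKEWED":
        # Each row is shifted one PEM further than the row above it.
        y = (i_row + i_col - 2) % block
    else:
        raise ValueError("only the STRAIGHT and SKEWED schemes are sketched")
    return x, y

print(relative_address(200, 100, "SKEWED"))
print(relative_address(200, 100, "STRAIGHT"))
```

For A[2, 200, 100] the skewed case gives word 199 in relative PEM 42, as in the example; the straight case gives word 199 in relative PEM 99.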
 
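 The equation in Section 3.2 can also be written as

 \[
 \bigl( ( \cdots ((i_1 - 1) M_2 + (i_2 - 1)) M_3 + \cdots + (i_{n-2} - 1) ) M'_{n-1}
 + \lfloor i_{n-1} / 256 \rfloor \bigr) M'_n + \lfloor i_n / 256 \rfloor
 \]

 where

 \[ M'_i = \left\lfloor \frac{M_i + 255}{256} \right\rfloor . \]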
 
 To obtain the block number using this form requires (n-2)
 subtractions, (n-1) multiplications and (n-1) additions. This value,
 in turn, is used to access BASETB to locate an absolute base address
 of the block. M_1 · M_2 ··· M_{n-2} · M'_{n-1} · M'_n words are required in
 BASETB for this array, besides n words in DOPETB, which contains the
 bounds for each subscript position. Upon finding the absolute base 
 address, the relative address of an element in the block is calculated 
 using one of the two sets of equations given above, if a standard 
 mapping function is specified. Both equations require one or two 
 subtractions or additions and a shift operation. In practice, a PEM 
 word address is calculated in the PE which requires it, and is used 
 as an index value for the PEX. Thus, the PE index value together 
 with the CU index value, which is an absolute base address of the 
 block, is used to locate the element of the array. In most cases 
 some or all of the elements in a row or column are used simulta- 
 neously. In the case of column operation, for example, each PE can 
 simultaneously compute the relative address (index value) which it 
 will require. 
 
 
 It should be noted that to compute the PEM number and the PEM word
 address of an element, no information on the array size is necessary;
 i.e., no reference to DOPETB is necessary. In a linear memory
 space, however, all dimensional information is required to locate
 the memory cell corresponding to an array element [10].
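
 The nested form of the block number lends itself to a Horner-style
 loop. The sketch below is a hedged illustration, not the compiler's
 code: the function name block_number is invented here, and it ASSUMES
 that r_{n-1} and r_n are the zero-based positions of the 256 X 256
 block inside its subarray, i.e. floor((i - 1) / 256), since their
 definition lies in Section 3.2. The bounds (3, 300, 300) are
 hypothetical; they are chosen only because they reproduce block
 number 4 for A[2, 200, 100].

```python
def block_number(subscripts, bounds, block_dim=256):
    """Horner-style evaluation of the block number for A[i_1, ..., i_n].

    subscripts: 1-based (i_1, ..., i_n); bounds: (M_1, ..., M_n), n >= 2.
    ASSUMPTION: r_j = (i_j - 1) // block_dim for the last two subscripts.
    """
    *lead_idx, i_row, i_col = subscripts
    *lead_dim, m_row, m_col = bounds
    blocks_down = (m_row + block_dim - 1) // block_dim    # M'_{n-1}
    blocks_across = (m_col + block_dim - 1) // block_dim  # M'_n
    number = 0
    for i, m in zip(lead_idx, lead_dim):
        number = number * m + (i - 1)
    number = number * blocks_down + (i_row - 1) // block_dim
    number = number * blocks_across + (i_col - 1) // block_dim
    return number


print(block_number((2, 200, 100), (3, 300, 300)))  # 4
```

 The loop uses one multiplication and one addition per subscript
 position after the first, in line with the operation counts quoted
 above.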
 
 3.4 Storage Allocation
 
 The storage allocation procedure is a separate package
 which is independent of the other parts of the compiler, such as the
 block packing procedure. The compiler can request any amount of
 memory space, i.e., a SQUARE, HBLOCK, VBLOCK or SBLOCK, and free it at
 any time. The storage allocation procedure keeps track of memory
 usage, and returns appropriate memory space on request, or frees it.

 In allocating memory space for a block, a linked space list
 called I4MEMORY, which keeps count only of the number of rows of
 memory which have been used, is utilized. If an HBLOCK of size
 m X 256 (or a SQUARE) is to be stored, the list is searched to locate
 the smallest space which has at least m (or 256) adjacent rows and
 the block is stored there. In the case of a VBLOCK (256 X m), 256 rows
 of PE memory may be allocated and a sublist called VLIST, which
 corresponds to a 256 X 256 block of storage, is established. A VLIST
 records the use of columns (PEMs) in a particular 256 X 256 block. In
 the case of an SBLOCK, again a 256 X 256 block may be allocated with
 an associated list SLIST. This block is divided into 4 X 8 subblocks.
 SLIST consists of a 64 X 32 bit boolean array (Figure 9) in which
 each bit represents the use, or otherwise, of one 4 X 8 subblock.
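
 The search for "the smallest space which has at least m adjacent
 rows" is a best-fit policy. A minimal sketch follows, assuming a
 plain Python list in place of the linked list; the class Extent and
 the function allocate_rows are hypothetical names, and the row
 numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    origin: int   # first PEM row of the extent
    size: int     # number of adjacent rows
    used: bool


def allocate_rows(extents, m):
    """Best fit: take m rows from the smallest free extent with size >= m.

    `extents` is a list ordered by origin; returns the origin of the
    allocated rows, or None if no free extent is large enough.
    """
    candidates = [e for e in extents if not e.used and e.size >= m]
    if not candidates:
        return None
    best = min(candidates, key=lambda e: e.size)
    pos = next(k for k, e in enumerate(extents) if e is best)
    origin = best.origin
    if best.size == m:
        best.used = True
    else:
        # Split: the front becomes the used block, the rest stays free.
        extents[pos:pos + 1] = [
            Extent(origin, m, True),
            Extent(origin + m, best.size - m, False),
        ]
    return origin


rows = [Extent(0, 256, True), Extent(256, 40, False),
        Extent(296, 80, True), Extent(376, 466, False)]
print(allocate_rows(rows, 32))   # 256
```

 Best fit keeps the large free extents intact for later SQUARE
 requests, at the cost of scanning every free extent.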
 
 Figure 9. SLIST (a 64 X 32 bit array; each bit corresponds to one
 4 X 8 word subblock of the PE memory block)
 
 The above is the overall picture of the storage allocation.
 However, in transferring data between the ILLIAC IV disk and PE memory,
 the initial PE memory address is restricted to certain PEM's and the
 block of data transferred can have only 16, 32, 64, 128 or 256
 columns. For example, a 64 column block transfer must start in one
 of PEM numbers 0, 64, 128 or 192. Thus, the sizes of VBLOCKs and
 SBLOCKs must be adjusted so that they have one of the above
 mentioned numbers of columns. This implies that a certain amount of
 storage is wasted. For instance, a VBLOCK of size 256 X 72 is made
 up to a block of size 256 X 128, wasting 256 X 56 words. To avoid
 this, further packing is introduced, which is applied before PE memory
 is assigned to the blocks (i.e., before memory space is requested
 from the storage allocator).
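
 The padding rule and its cost can be worked through in a short
 sketch. The helper names below are hypothetical. The list of
 permitted widths is taken from the text, whereas the rule that a
 transfer of width w may start at any multiple of w is an ASSUMPTION
 generalized from the single 64 column example.

```python
TRANSFER_WIDTHS = (16, 32, 64, 128, 256)  # column counts quoted above
ROWS = 256


def padded_columns(columns):
    """Smallest permitted transfer width that holds `columns` columns."""
    for width in TRANSFER_WIDTHS:
        if columns <= width:
            return width
    raise ValueError("a block cannot span more than 256 PEMs")


def wasted_words(columns, rows=ROWS):
    """Words lost when a rows x columns block is padded for transfer."""
    return rows * (padded_columns(columns) - columns)


def start_pems(width, pems=256):
    """ASSUMPTION: a width-w transfer may start at any multiple of w."""
    return list(range(0, pems, width))


print(padded_columns(72))   # 128
print(wasted_words(72))     # 14336, i.e. 256 X 56
print(start_pems(64))       # [0, 64, 128, 192]
```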
 
 
 First a bit array which is similar to SLIST is prepared.
 Upon entering a TRANQUIL program block, arrays are partitioned and
 packed as discussed in Section 3.2. If any residual VBLOCKs or
 SBLOCKs result, they are formed into a superblock and
 entries are made in the bit array. A superblock of size m X 256
 (SQUARE) is made whenever this bit array becomes full, or the
 program exits from the TRANQUIL block. This method may still
 introduce uneconomical packing. However, such wasted space cannot
 be used during that program block anyway.
 
 The strategy that has been chosen might involve considerable 
 bookkeeping, but it minimizes storage fragmentation and reduces the 
 frequency of storage allocation and I/O requests. Details of the 
 algorithm are given in Appendix C. 
 
 
 4. FURTHER DISCUSSION
 
 There are several parts of the TRANQUIL language still to 
 be implemented. One of these is the PEM storage allocation statement. 
 This will be implemented using a macro generator which is now being 
 written. For instance, consider the array B, the storage allocation 
 statement for which appears in Chapter 2. After this statement in a 
 TRANQUIL program, whenever the array B is used its subscript 
 expression is replaced by the expression appearing in the storage 
 allocation statement. 
 
 Data management, such as partitioning of arrays and packing
 of blocks, has been discussed and it has been shown that the effective
 address of an element of a partitioned array can be
 computed efficiently. However, no provision has yet been made for a
 proper way to communicate with the ILLIAC IV operating system.
 Obviously, for efficient transfer of data between the disk and the
 PEM the format (mapping and partitioning) of the data on the disk
 must be in the form required in PEM. Hence the TRANQUIL compiler
 must communicate the appropriate information to the operating system
 which, when it obtains file name, program name and external format
 information, will cause the data to be partitioned and mapped in the
 format required by the associated TRANQUIL program.
 
 Also relating to input/output is the strategy of better
 resource allocation. One direction which is now being studied is to
 make a model of a program and interchange statements so that the
 resultant program is computationally equivalent to the original one,
 yet faster in execution speed. The idea is to reorganize statements
 to minimize input/output of data blocks. Such a high level of
 optimization will be especially effective for lengthy production
 type programs.
 
 Finally, TRANQUIL is by no means a completely satisfactory 
 problem oriented language in the sense that users are not free from 
 the burden of choosing data representation, e.g., mapping functions. 
 What is needed is a language which Balzer calls a dataless programming
 language [11]. He states:

 "The independence (of a processing to data representation)
 will allow the programmer (1) to
 disregard, while specifying the program, the 
 details of data processing, memory space require- 
 ments, and matching of data representation to the 
 processing done on it; and (2) to handle them 
 instead, during the data declaration phase. .... 
 The problem of data representation can be left 
 until this programming has been completed; thus, 
 a more rational decision can be made concerning 
 an optimal representation. Because of this 
 separation, the programmer should be able to 
 think through his problem better." 
 
 
 5. CONCLUSION
 
 Array mapping functions and array partitioning for the 
 TRANQUIL language have been discussed in this paper. It is obvious 
 that for array type operations SKEWED storage will often provide the 
 best result in terms of computational speed because either a row or 
 a column of a SKEWED array can be accessed simultaneously by all PEs. 
 Also, since the use of a TWS system [12] makes the TRANQUIL compiler 
 more readily modifiable, it is feasible to incorporate new mapping 
 functions in TRANQUIL relatively easily. It should be emphasized 
 that the addressing mechanism for arrays is rather simple in spite 
 of the complicated partitioning. Actually, to locate an element in
 a block, only a few additions/subtractions and some shift operations
 are required. Thus, the partitioning mechanism has no adverse effect
 on the computational speed of compiled code.
 
 Finally, the partitioning and blocking of data, together
 with automatic (computer generated) input/output of data blocks
 to/from disk, enables the user to write programs using data arrays
 (up to 10^9 bits) which are larger than the ILLIAC IV memory.
 
 
 APPENDIX A 
 
 SYNTAX AND SEMANTICS SPECIFICATION 
 OF TRANQUIL DECLARATIONS 
 
 The syntax specification is written in Backus Naur Form
 [13]. The syntax for the following non-terminal symbols is as
 specified in the ALGOL report [14], or in Appendix B of [8], and they
 are used here without any further definition:
 
 <arithmetic expression> 
 <identifier> 
 <unsigned integer> 
 <set variable list> 
 <set name list> 
 <empty> 
 
 The semantics specification is given in English. If not explicitly
 given, the semantics is assumed to be identical to that given in
 [14] for the equivalent ALGOL construct.
 
 1. Declarations

 <declaration> ::= <variable declaration> | <array declaration> |
 <PEM reserve declaration> |
 <PEM assignment declaration>
 
 
 2. Variable Declaration 
 
 2.1 Syntax

 <variable declaration> ::= <attribute> <variable list>
 <attribute> ::= BOOLEAN | REAL | REALS | REALD | INTEGER | INTEGERS |
 INTEGERL | BYTE8 | BYTE16
 <variable list> ::= <variable list>, <variable> | <variable>
 <variable> ::= <identifier>
 
 2.2 Examples 
 INTEGER I, J 
 REAL X, Y 
 
 2.3 Semantics 
 
 Variable declarations serve to declare certain identifiers
 to represent simple variables of a given type. Each attribute
 corresponds to a specific word format in ILLIAC IV:

 BOOLEAN    64 bit word (only the least significant bit is meaningful)
 REAL       64 bit floating point
 REALS      32 bit floating point
 REALD      128 bit (double precision) floating point
 INTEGER    48 bit fixed point
 INTEGERS   24 bit fixed point
 INTEGERL   64 bit fixed point (no sign)
 BYTE8      8 bit fixed point (no sign)
 BYTE16     16 bit fixed point (no sign)
 COMPLEX    Simple variables further specified by this attribute
            have complex numbers as values.
 
 3. Array Declaration
 3.1 Syntax

 <array declaration> ::= <mapping function> ARRAY <array list> |
 <attribute> <mapping function> ARRAY <array list> |
 ARRAY (<PEM area>) <array list> |
 <attribute> ARRAY (<PEM area>) <array list>
 <mapping function> ::= STRAIGHT | SKEWED | SKEWED PACKED |
 CHECKER | <empty>
 <mapping procedure name> ::= <identifier>
 <array list> ::= <array segment> | <array list>, <array segment>
 <array segment> ::= <array identifier> [<bound list>] |
 <array identifier>, <array segment>
 <bound list> ::= <bound> | <bound>, <bound list>
 <bound> ::= <arrangement> <bound pair> | <arrangement> <limit>
 <arrangement> ::= * | ** | # | ## | <empty>
 <bound pair> ::= <lower bound> : <upper bound>
 <lower bound> ::= <arithmetic expression>
 <upper bound> ::= <arithmetic expression>
 <limit> ::= <arithmetic expression>
 <PEM area> ::= <identifier>
 
 
 3.2 Examples 
 
 REAL SKEWED ARRAY A[1:5, 1:10], B[5, 10]

 ARRAY AR, BR[**1:80, *1:256]

 INTEGER ARRAY AI, BI[#4, **64, *128]

 3.3 Semantics
 
 An array declaration declares one or several identifiers 
 to represent multidimensional arrays of subscripted variables and 
 gives the dimensions of the arrays, the bounds of the subscripts, 
 the types of the mapping functions, and the type of the variable. 
 
 3.3.1 Subscript Bounds

 As in PL/I, the subscript bounds can be given in either
 ALGOL form or FORTRAN form. Simultaneous use of both forms in one
 bound list is, however, forbidden; e.g., [1:5, 10] is illegal.
 
 3.3.2 Dimensions 
 
 Up to eight dimensions are allowed in arrays. 
 
 3.3.3 Arrangement (Also see Chapter 1.) 
 
 Generally arrays are stored in PE memory by subarrays,
 which are made up from the last two dimensions; e.g., three 64 X 128
 subarrays are formed for an array A[3, 64, 128]. Arrangement
 declarations serve to change this rule. One asterisk *, together with
 ** placed in front of a bound, indicates that a subarray is formed by
 those dimensions. For example, A[5, **256, *128] will generate five
 subarrays of size 128 X 256. Since subarrays are stored in PE memory
 as they are, i.e., an m X n subarray occupies m PEM words in n PEM's,
 the arrangement declaration may be used to obtain better memory
 usage. Also, * forces data corresponding to that subscript to be
 stored in one PEM, and ** indicates that the data for that subscript
 is stored across PEM's. Thus, a vector A[50] can be stored in
 one PEM by declaring it as A[*50], or across PEM's by A[**50]. One
 sharp # and two sharps ## similarly indicate how to arrange subarrays
 in PE memory. One sharp indicates the direction of increasing PEM
 word address and ## indicates across PEM's. For example, an array
 A[#5, ##4, *32, **64] will introduce twenty 32 X 64 subarrays,
 arranged in five rows of 4 subarrays, thus making up a 160 X 256
 block (Figure A1).
 
 Figure A1. Subarrays for an Array A[#5, ##4, *32, **64]
 (twenty 32 X 64 subarrays, A[1,1,*,*] through A[5,4,*,*], arranged
 in five rows of four to form a 160 X 256 block)
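
 The layout in Figure A1 can be expressed as a small calculation. This
 is a sketch under stated assumptions: the function names are
 hypothetical, subscripts are taken to be 1-based, and the placement
 rule (the # subscript steps down PEM word addresses, the ## subscript
 steps across PEMs) is read off the figure and the description of the
 sharp markers.

```python
def subarray_origin(p, q, words=32, pems=64):
    """Top-left (PEM word, PEM number) of subarray A[p, q, *, *].

    Models A[#5, ##4, *32, **64]: the # subscript p advances along PEM
    word addresses and the ## subscript q advances across PEMs.
    Subscripts are taken to be 1-based (an assumption of this sketch).
    """
    return (p - 1) * words, (q - 1) * pems


def block_shape(n_down=5, n_across=4, words=32, pems=64):
    """Overall (words, PEMs) extent of the arranged block."""
    return n_down * words, n_across * pems


print(block_shape())            # (160, 256)
print(subarray_origin(1, 1))    # (0, 0)
print(subarray_origin(5, 4))    # (128, 192)
```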
 
 
 The five combinations of arrangement markers given in 
 Table Al are legal, where the numbers indicate the number of 
 allowable appearances in a single declaration. 
 
 
                      *       **      #       ##
 1st combination   n times
 2nd                  1       1
 3rd                  1       1       1
 4th                  1       1               1
 5th                  1       1       1       1
 
 Table Al. Combinations of Arrangement Declaration Markers 
 
 
 3.3.4 Mapping Function 
 
 The default mapping function is SKEWED PACKED. In the case
 of a user-specified mapping function, a corresponding PEM assignment
 declaration must be in effect at the time the array declaration is
 processed.

 3.3.5 PEM Area
 
 If a PEM assignment declaration is used to define a special 
 mapping function, then a corresponding PEM area name must appear in 
 the array declaration. 
 
 4. PEM Reserve Declaration
 4.1 Syntax

 <PEM reserve declaration> ::= PEMEMORY <PEM area name>
 [<word size>, <PEM size>]
 <PEM area name> ::= <identifier>
 <word size> ::= <unsigned integer>
 <PEM size> ::= <unsigned integer>

 4.2 Example

 PEMEMORY PEMEM [10, 256]

 4.3 Semantics
 
 PEM reserve declarations serve to reserve a certain amount
 of virtual memory, allowing the programmer to store arrays there in
 any fashion. The integer used for PEM size should not be greater
 than 256. Should more space be needed, more than one area may
 be reserved. The declaration should appear before the area is used
 in a PEM assignment declaration. The reserved area will be released
 upon exit from the block in which it was declared, as usual.
 
 4.3.1 Word Size and PEM Size 
 
 Both word and PEM are understood to be numbered starting 
 from and increasing in increments of 1. 
 
 5. PEM Assignment Declaration
 5.1 Syntax

 <PEM assignment declaration> ::= PEM <PEM assignment block>
 <PEM assignment block> ::= <PEM assignment construction> |
 BEGIN <list of PEM assignments> END
 <list of PEM assignments> ::= <PEM assignment construction> |
 <PEM assignment construction> <list of PEM assignments>
 <PEM assignment construction> ::= <PEM assignment statement> |
 <set assignment statement> |
 <PEM for statement>
 <PEM for statement> ::= <PEM for clause> <PEM assignment block>
 <PEM for clause> ::= FOR (<set variable list>) SIM
 (<set name list>) DO
 <PEM assignment statement> ::= <PE memory> ← <array name>
 <PE memory> ::= <PE area> [<word index>, <PEM index>]
 <word index> ::= <unsigned integer> | <variable>
 <PEM index> ::= <unsigned integer> | <variable>
 <array name> ::= <array identifier> [<subscript list>]
 <subscript list> ::= <subscript> | <subscript list>, <subscript>
 <subscript> ::= <arithmetic expression>
 
 5.2 Examples 
 
 PEM FOR (I, J) SIM ([1,2,...,256] X [1,2,...,256]) DO
 PEMEM [I, J] ← ARAY [J, I]

 PEM BEGIN
 PEMEM [1,1] ← ARAY [1,1];
 PEMEM [256,1] ← ARAY [1,256];
 FOR (I) SIM ([2,3,...,255]) DO
 PEMEM [I,1] ← ARAY [1,I]
 END

 5.3 Semantics
 
 PEM assignment declarations serve to store arrays into a 
 reserved PEM area in the way specified, i.e., declare a new mapping 
 function. 
 
 5.3.1 PEM Area 
 
 PEM areas must be reserved before they are actually used 
 by PEM assignment declarations. 
 
 5.3.2 Variables 
 
 All variables appearing in PEM assignment declarations 
 besides PEM area names and array names need not be declared and they 
 are understood to be local to the declaration. 
 
 
 5.3.3 Sets 
 
 All sets appearing in a PEM assignment declaration may not 
 be dynamic; i.e., parameters can never be passed to this declaration 
 from outside. 
 
 5.3.4 PEM Assignment Statement
 PEM PEMEM [I, J] ← AR [K, L]
 causes the (K,L) element of an array AR to be stored in the I-th row
 and the J-th column of PEMEM.

 5.4 Further Example
 BEGIN

 PEMEMORY PEMEM [256,256];
 PEM

 FOR (I, J) SIM ([1,2,...,256] X [1,2,...,256]) DO
 PEMEM [I, J] ← ARAY [J, I];
 REAL ARRAY (PEMEM) ARAY [256,256];

 END
 
 In the above example a 256 X 256 array ARAY is stored in 
 PE memory in such a way that a column of ARAY is across PEMs and a 
 row of ARAY is in a single PEM. 
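
 The effect of this user-defined mapping can be simulated on a small
 array. The sketch below is illustrative only: assign_transposed is a
 hypothetical name, a 4 X 4 array stands in for the 256 X 256 one, and
 indices are 0-based here whereas the TRANQUIL example counts from 1.

```python
def assign_transposed(aray):
    """Model of: FOR (I, J) ... DO PEMEM[I, J] <- ARAY[J, I].

    pemem[i][j] is the word at PEM word address i of PEM j (0-based here,
    whereas the TRANQUIL example counts from 1).
    """
    n = len(aray)
    pemem = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            pemem[i][j] = aray[j][i]
    return pemem


n = 4  # small stand-in for 256
aray = [[(r, c) for c in range(n)] for r in range(n)]
pemem = assign_transposed(aray)

# Row 2 of ARAY ends up entirely inside PEM 2 (one word per address).
print([pemem[i][2] for i in range(n)] == aray[2])      # True
# Column 1 of ARAY is spread across the PEMs at word address 1.
print(pemem[1] == [aray[r][1] for r in range(n)])      # True
```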
 
 
 APPENDIX B 
 TABLES 
 
 The following is the list of tables used in the TRANQUIL
 compiler to take care of declarations and memory allocation.
 (i) Tables used in both Pass 1 and Pass 2
 
 IDTAB contains the information on each identifier 
 declared in a program; e.g., type, a pointer 
 to a corresponding DOPETB entry if an 
 identifier is an array. 
 
 DOPETB contains the information necessary to refer- 
 ence arrays; e.g., size of each dimension, the 
 number of dimensions. 
 
 (ii) Tables in Pass 2 
 
 BASETB contains the descriptor for each block resulting 
 from an array partitioning; e.g., size of a 
 block, base address for a block. 
 
 These tables are linked as shown in Figure B1. The number
 of blocks N in BASETB is determined by:

 N = M_1 X M_2 X ... X M_{n-2} X floor((M_{n-1} + 255) / 256)
 X floor((M_n + 255) / 256)
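
 As a quick check of the formula for N, the following sketch evaluates
 it for a few sets of bounds. The function name number_of_blocks and
 the sample bounds are hypothetical and are used only for
 illustration.

```python
from math import prod


def number_of_blocks(bounds, block_dim=256):
    """N = M_1 * ... * M_{n-2} * ceil(M_{n-1}/256) * ceil(M_n/256)."""
    *lead, m_row, m_col = bounds
    blocks_down = (m_row + block_dim - 1) // block_dim
    blocks_across = (m_col + block_dim - 1) // block_dim
    return prod(lead) * blocks_down * blocks_across


print(number_of_blocks((3, 300, 300)))   # 12
print(number_of_blocks((256, 256)))      # 1
print(number_of_blocks((5, 257, 512)))   # 20
```

 Adding 255 before the integer division is the usual way to round a
 quotient up, so a dimension of 257 already costs two blocks.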
 
 Figure B1. Entries and Linkage of Tables for
 A[1:M_1, 1:M_2, ..., 1:M_n]
 (the figure shows the IDTAB entry for A, the DOPETB entry holding
 n and M_1, ..., M_n, and the N BASETB entries with SIZE and BASE
 fields, e.g. 256 X 256 and 100; n = number of dimensions,
 N = number of blocks)
 
APPENDIX C 
 
 ARRAY PARTITIONING 
 
 AND 
 PACKING FLOWCHARTS 
 
 
 Figure C1 (Part 2). Pass 2 Program Block Entry and Block Exit
 Flowcharts for Array Declarations.
 (block exit: EXIT A PROGRAM BLOCK; on the Yes branch, MAKE IT A
 SEGMENT AND ALLOCATE MEMORY; on the No branch, end)
 
 A[1:M_1, 1:M_2, ..., 1:M_{n-1}, 1:M_n]
 |
 FORM SUBARRAYS OF SIZE M_{n-1} X M_n
 |
 PARTITION A SUBARRAY INTO 256 X 256 (OR SMALLER RESIDUAL) BLOCKS
 |
 ESTABLISH A PROPER NUMBER OF ENTRIES IN BASETB
 |
 PACK SBLOCKS

 Figure C2. Array Partitioning Flowchart
 
 ENTER WITH AN M X N RESIDUAL BLOCK TO BE PACKED
 IS THERE ANY FREE BLOCK IN WHICH PACKING HAS BEEN DONE? (Yes / No)
 IS THERE ANY FREE SPACE IN THE FREE BLOCK TO PLACE A RESIDUAL
 BLOCK? (Yes / No)
 PLACE IT IN THE FREE BLOCK
 CREATE A NEW 256 X 256 FREE BLOCK
 IS THERE STILL ENOUGH FREE SPACE IN THE FREE BLOCK? (Yes / No)

 Figure C3. Residual Block Packing Flowchart
 
 
 APPENDIX D 
 STORAGE ALLOCATION PACKAGES 
 
 There are six separate procedures for storage allocation. 
 Any one of them can be called at any time. To maintain a record of 
 PE memory usage, three lists or tables are used. 
 
 I4MEMORY is a dynamic linked list which records the use of
 PEM rows. Corresponding to any m X 256 PEM memory block assignment
 there is an element of the list which contains both the origin of
 this block in PEM, and the number of PEM rows used, i.e., m. The
 list contains a similar entry (except for 1 bit) for each block of
 available storage in PEM. Each element also has two pointers, one
 pointing to that element (free or used) which has the next higher
 origin, and the other pointing to the element of the same kind (free
 or used) having the next higher origin.
 
 To allocate memory space for a block of size 256 X m, the
 table VLIST is used. This is essentially a 16 element array, each of
 whose elements corresponds to usage of 16 adjacent PEMs. This
 implies that the number of PEMs (i.e., m) can only be a multiple of
 16 (actually only one of 16, 32, 64 or 128 is allowed due to the
 hardware specification). Upon encountering a request for memory
 space of size 256 X m, a check is made for the existence of a current
 VLIST table. If none exists, then a 256 X 256 block of PEM is
 reserved and a VLIST table is created. Then VLIST is searched until
 ⌈m/16⌉ adjacent free columns are found, the appropriate PEM is
 allocated, and the appropriate table entries are flagged.
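
 The VLIST search can be sketched as a scan for a run of free
 entries. This is a hedged illustration: the name allocate_columns is
 invented, the table is modelled as a plain list of booleans, and the
 sketch ASSUMES a simple first-fit scan without the starting-PEM
 alignment that disk transfers impose.

```python
GROUP = 16  # each VLIST element stands for 16 adjacent PEMs


def allocate_columns(vlist, m):
    """First-fit search of a 16-entry VLIST for ceil(m/16) free entries.

    vlist[k] is True when PEMs 16k .. 16k+15 are in use.  On success the
    entries are flagged and the first PEM number is returned; otherwise
    None.  ASSUMPTION: plain first fit, with no alignment constraint.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    need = -(-m // GROUP)  # ceil(m / 16)
    run = 0
    for k, used in enumerate(vlist):
        run = 0 if used else run + 1
        if run == need:
            start = k - need + 1
            for t in range(start, k + 1):
                vlist[t] = True
            return start * GROUP
    return None


vlist = [False] * 16
print(allocate_columns(vlist, 128))  # 0
print(allocate_columns(vlist, 64))   # 128
print(allocate_columns(vlist, 32))   # 192
print(allocate_columns(vlist, 64))   # None
```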
 
 
 The SMALLSPACE table is used to allocate space for smaller
 size blocks. This is a 64 X 16 bit boolean array in which each bit
 represents a 4 X 16 word block of PEM (1 = allocated, 0 = free).
 Thus SMALLSPACE represents a 256 X 256 PEM block. This implies
 that the valid memory block size which can be requested should have
 numbers of rows and columns which are multiples of 4 and 16,
 respectively. To find free space of size m X n, the first ⌈m/4⌉ rows
 of SMALLSPACE are anded together and searched from left to right
 (i.e., from 0-th PEM to 255-th PEM) for ⌈n/16⌉ consecutive 0's. If
 this is unsuccessful, then the same process is repeated on the next
 ⌈m/4⌉ rows and so on.

 Figures D1, D2 and D3 show the formats for the above
 tables and examples of table entries.
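
 The SMALLSPACE search can be sketched with integer bit masks. Several
 points are interpretations rather than facts from the text: the
 function name find_space is invented; the most significant bit is
 taken to be PEM group 0; the window of rows is advanced one bitmap
 row at a time; and the rows are combined with a bitwise OR, because
 with 1 = allocated a combined bit is 0 only when the position is free
 in every row of the window.

```python
COLS = 16  # bits per row; one bit covers 16 PEMs


def find_space(rows, m, n):
    """Find free space of m words by n PEMs in a SMALLSPACE-like bitmap.

    rows: list of 64 ints, 16 bits each; bit value 1 = allocated.  The
    most significant bit is taken to be PEM group 0 (an assumption).
    Returns (first_row, first_group) in bitmap units, or None.
    """
    h = -(-m // 4)    # ceil(m / 4) bitmap rows
    w = -(-n // 16)   # ceil(n / 16) bitmap columns
    for top in range(0, len(rows) - h + 1):
        busy = 0
        for r in rows[top:top + h]:
            busy |= r   # a column is usable only if free in every row
        run = 0
        for g in range(COLS):
            bit = (busy >> (COLS - 1 - g)) & 1
            run = 0 if bit else run + 1
            if run == w:
                return top, g - w + 1
    return None


rows = [0xFF00] + [0] * 63
print(find_space(rows, 4, 64))    # (0, 8)
print(find_space(rows, 8, 256))   # (1, 0)
```

 The returned pair is in bitmap units; multiplying the row by 4 and
 the group by 16 gives the PEM word address and the PEM number.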
 
 (a) I4MEMORY entry word: F or U flag (F = Free, U = Used), ORIGIN,
 POINTER 1, POINTER 2, SIZE, POINTER to BASETB

 (b) VLIST entry format: Used or Free flag, AVAILABLE SIZE,
 POINTER TO BASETB

 (c) Structure of VLIST: 16 words (1 word = 16 PE), with a
 POINTER TO I4MEMORY

 Figure D1. Table and List Entry Formats
 
 PE memory usage (row origins): 0 USED, 256 FREE, 296 USED, 376 FREE,
 842 USED, 1098 FREE (through row 2047)

 I4MEMORY

 Entry   F/U   ORIGIN   POINTER 1   POINTER 2   SIZE
 1       U     0        3           2           256
 2       F     256      4           3           40
 3       U     296      5           4           80
 4       F     376      6           5           466
 5       U     842      5           6           256
 6       F     1098     6           6           950

 Figure D2. Example of an Entry for I4MEMORY
 
 Figure D3. Example of an Entry for VLIST
 (an I4MEMORY word flagged U with SIZE = 256 heads a 16-word VLIST;
 its entries are flagged U or F, the used runs carry SIZE = 128,
 SIZE = 64 and SIZE = 32, and used entries point to BASETB)
 
 
 LIST OF REFERENCES 
 
 [1] Barnes, G. H., et al, "The ILLIAC IV Computer", IEEE Transactions
 on Computers, C-17, 8 (August, 1968), pp. 746-757.
 
 [2] Kuck, D. J., "ILLIAC IV Software and Application Programming", 
 IEEE Transactions on Computers , C-17 , 8 (August, 1968), 
 pp. 758-770. 
 
 [3] Iverson, K. E., "A Programming Language", John Wiley & Sons,
 Inc., New York (1962).

 [4] Knuth, D. E., "The Art of Computer Programming", Vol. 1,
 Addison-Wesley (1968).
 
 [5] Knowles, M., et al, "Matrix Operations on ILLIAC IV", Department 
 
 of Computer Science, University of Illinois, Urbana, Illinois, 
 ILLIAC IV Document No. 118 (March, 1967). 
 
 [6] Benokraitis, V., "Alternate Storage Methods for Two-Dimensional 
 Hydrodynamics Calculations", Department of Computer Science, 
 University of Illinois, Urbana, Illinois, ILLIAC IV 
 Document No. 190 (May, 1968). 
 
 [7] Randell, B. and Kuehner, J. C, "Dynamic Storage Allocation 
 Systems", Comm. ACM , 11, 5 (May, 1968), pp. 297-306. 
 
 [8] Wilhelmson, R. B., "Control Statement Syntax and Semantics of 
 a Language for Parallel Processors", (M.S. Thesis), 
 Department of Computer Science, University of Illinois, 
 Urbana, Illinois, (January, 1969). 
 
 [9] Budnik, Paul P., "TRANQUIL Arithmetic", (M.S. Thesis),
 Department of Computer Science, University of Illinois,
 Urbana, Illinois, (January, 1969).

 [10] Hellerman, H., "Addressing Multidimensional Arrays", Comm. ACM,
 5, 4 (April, 1962), pp. 205-207.
 
 [11] Balzer, R. M., "Dataless Programming", Proc. FJCC, (1967),
 pp. 535-543.
 
 [12] Northcote, R. S., "The Structure and Use of a Compiler-Compiler 
 System", Proc. Third Australian Computer Conference , 
 (May, 1966), pp. 339-3^- 
 
 [13] Backus, J. W., "The Syntax and Semantics of the Proposed 
 
 International Algebraic Language of the Zurich ACM-GAMM 
 Conference", Proc. Int. Conf. Inf. Proc , UNESCO , Paris, 
 France, (June, 1959).

 [14] Naur, P., et al., "Revised Report on the Algorithmic Language
 ALGOL 60", Comm. ACM, 6 (January, 1963), pp. 1-17.
 
 
UNCLASSIFIED

 Security Classification

 DOCUMENT CONTROL DATA - R&D

 1. ORIGINATING ACTIVITY (Corporate author): Department of Computer
 Science, University of Illinois, Urbana, Illinois 61801

 2a. REPORT SECURITY CLASSIFICATION: UNCLASSIFIED

 2b. GROUP:

 3. REPORT TITLE: STORAGE ALLOCATION ALGORITHMS IN THE TRANQUIL COMPILER

 4. DESCRIPTIVE NOTES (Type of report and inclusive dates): Research Report

 5. AUTHOR(S) (First name, middle initial, last name): Yoichi Muraoka

 6. REPORT DATE: January 13, 1969

 7a. TOTAL NO. OF PAGES: 61

 7b. NO. OF REFS: 14

 8a. CONTRACT OR GRANT NO.: USAF 30(602)4144

 8b. PROJECT NO.: 46-26-15-305

 9a. ORIGINATOR'S REPORT NUMBER(S): DCS Report No. 297

 9b. OTHER REPORT NO(S) (Any other numbers that may be assigned
 this report):

 10. DISTRIBUTION STATEMENT: Qualified requesters may obtain copies
 of this report from DCS.

 11. SUPPLEMENTARY NOTES: NONE

 12. SPONSORING MILITARY ACTIVITY: Rome Air Development Center,
 Griffiss Air Force Base, Rome, New York 13440

 13. ABSTRACT

 TRANQUIL is a language for describing algorithms in terms of
 parallel constructs. Its compiler is now being implemented for the parallel
 array computer ILLIAC IV. This paper discusses a particular part of the
 implementation; namely, the problem of storage allocation for arrays.

 14. KEY WORDS

 Data Declarations: INTEGER, REAL, COMPLEX and BOOLEAN

 UNCLASSIFIED

 Security Classification
 
