COO-1018-1183 
 
 REPORT NO. 333 
 
 DESIGN OF THE ARITHMETIC UNITS OF ILLIAC III: 
 USE OF REDUNDANCY AND HIGHER RADIX METHODS* 
 
 by 
 
 Daniel E. Atkins 
 
 May 1969 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 61801

 *To be presented at the Workshop on the Theory of Computer Arithmetic,
 Third Annual IEEE Computer Conference, Minneapolis, June 16, 1969.

 This work was supported in part by the U.S. Atomic Energy Commission
 under Contract No. USAEC AT(11-1)-1018 and in part by the National
 Science Foundation under Grant No. NSF-GP-4636.
 
ABSTRACT 
 
 In keeping with the experimental nature of the Illinois Pattern
 Recognition Computer (Illiac III), the arithmetic units are intended
 to be a practical testing ground for recent theoretical work in
 computer arithmetic. This paper describes the use of redundant number
 systems and the design of a structure with which multiplication and
 division are executed in radix 256. The heart of the unit is the
 stored-sign subtracter, a recently discovered member of the family of
 borrow-save subtracters and carry-save adders. A cascade of these
 subtracters, controlled by a multiplier recoder, provides
 multiplication. The same structure, controlled by a "model division"
 (a quotient recoder), performs division.
 
ACKNOWLEDGEMENT 
 
 The author wishes to acknowledge and thank Professor James E. 
 Robertson, Professor Bruce H. McCormick and Mrs. Tuh-Kai Koo for their 
 assistance in the design effort described in this paper. Mrs. Koo 
 wrote extensive simulation programs which were used to validate the 
 arithmetic algorithms. 
 
 
TABLE OF CONTENTS

 INTRODUCTION
    Adder-Subtracter
    Multiplication
    Division

 ADDER-SUBTRACTER
    Background
    Definition
    Properties
       Input-Output Compatibility
       Limited Borrow
       Unique Zero
       Negation
       Least Significant Digit
    Overflow Detection
    Truncation Error
    Sign Detection
    Assimilation
    Implementation

 MULTIPLICATION
    Background
    Recoding Scheme
    Multiplication Structure
    Brief Operation Description
    Truncation Error

 DIVISION
    Background
    Model Division
    Operational Description of Model Division
    Division Structure
    Brief Operational Description of Full Precision Division Scheme
    Truncation Error

 REFERENCES

 APPENDIX
    Proof of the Validity of the Correction Scheme for Bogus Overflow
    Brief Description of Illiac III Computer System
 
INTRODUCTION 
 
 In keeping with the experimental nature of the Illinois
 Pattern Recognition Computer (Illiac III), the arithmetic units are
 intended to be a practical testing ground for some recent theoretical
 work in computer arithmetic. The bulk of this work centers upon the
 use of redundant number systems and/or the use of higher radix methods.
 The design of the arithmetic units of Illiac III exhibits both
 techniques. They are of primary importance in the adder-subtracter
 structure, the multiplication structure, and the division structure.
 
 Adder-Subtracter 
 
 A key factor in the rapid execution of the iterative 
 sequences of multiplication and division is the operation time of the 
 adder-subtracter. The design used in Illiac III is a member of a 
 family of limited carry-borrow propagation adder-subtracters. The 
 necessity for propagation of carries or borrows is eliminated by permit- 
 ting the results of an operation to be represented in a redundant form. 
 Redundancy is achieved by using a signed-digit format. Associated with
 each digit is a magnitude of either 1 or 0, and a sign of either
 positive or negative. Changing a number in a signed-digit format to a
 conventional non-redundant representation requires a carry or borrow
 propagation, but only one such conversion is required per arithmetic
 operation and it may be accelerated by use of lookahead techniques.
 The adder-subtracter structure exhibits several other interesting pro- 
 perties not found in the conventional carry-save adder or borrow-save 
 subtracter. 
 
 Multiplication 
 
 In other than the adder-subtracter complex, high-speed
 operation is also obtained by extensive use of redundancy and by
 executing operations in radices greater than two. Multiplication, for
 example, is performed in radix 256, i.e. 8 bits of the multiplier are
 retired in one pass from the primary to the secondary rank of the
 accumulator. By recoding, redundancy is introduced in such a manner
 that all the required multiples of the multiplicand may be formed
 merely by shifting.
 
 Division 
 
 In division, redundancy is introduced into the representation
 of the quotient. As a consequence quotient digits may be determined from
 a relatively few high-order bits of the divisor and partial remainder;
 full precision comparison of the divisor and partial remainder is not
 required. The division algorithm makes efficient use of the large amount
 of hardware devoted to high speed multiplication and is also performed
 in radix 256. Eight bits of the quotient are generated in one pass from
 the primary to the secondary rank of the accumulator.

 Appendix

 The Appendix includes the proof of the validity of the bogus
 overflow correction scheme and an introduction to the entire Illiac III
 system.
 
ADDER-SUBTRACTER
 
 Background 
 
 It has long been realized that the execution of multiplication
 is substantially accelerated by the use of adders in which carry
 propagation is eliminated until a terminal step. Recently, Robertson
 [1] has noted that the traditional carry-save adder and borrow-save
 subtracter, derived by the modification of conventional adders or
 subtracters, are but two members of a larger family of limited
 carry-borrow propagation adder-subtracters. At least two of the designs
 obtained using his deterministic procedures appear to be new and of
 practical importance. They are the stored sign adder and the stored
 sign subtracter. The design properties of both are similar and in the
 final analysis both are actually capable of either addition or
 subtraction. The stored sign subtracter has been implemented in Illiac III.
 This device is also referred to as a signed-digit subtracter. The
 two names will be used interchangeably in this paper.
 
 Definition 
 
 A typical position of a signed-digit subtracter is shown in
 Figure 1. Each position is a three-input, two-output device together
 with an interpositional connection and a "NEG" control line. The
 symbol $Y_i$ represents the ith bit of the subtrahend (minuend - subtrahend
 = difference) in conventional binary form*. $S_i$ and $X_i$ together comprise
 the ith minuend digit in a redundant format. $X_i$ is interpreted as a
 magnitude, either 1 or 0, and $S_i$ as a sign; 0 is positive and 1 is
 negative. The digital values 1, 0, or $\bar{1}$ (overbar denotes negation) are
 thus represented as follows:

   S_i   X_i     Digital Value
    0     0          +0
    0     1          +1
    1     0          -0
    1     1          -1

 *The design described here employs one operand in conventional
 form and one in redundant form. Designs have been proposed in which
 both operands are represented redundantly. See Rohatsh [2] and Borovec [3].

 [Figure 1 - Typical Position of a Signed-Digit Subtracter. Inputs to
 position i: subtrahend bit $Y_i$, minuend digit ($S_i$, $X_i$), interpositional
 connection $C_i$, controls NEG and G. Outputs: difference digit ($T_i$, $Z_i$)
 and interpositional connection $C_{i-1}$.]

 $S_i$ = sign of minuend digit
 $X_i$ = magnitude of minuend digit
 $Y_i$ = subtrahend in conventional binary form
 $T_i$ = sign of difference digit
 $Z_i$ = magnitude of difference digit
 NEG = control to complement $T_i$
       If NEG = 1 then $T_i$ is complemented, else not
 G = gate on interpositional connections
 $C_i$ = interpositional connection

 $T_i = C_i \oplus \mathrm{NEG}$
 $Z_i = C_i \oplus X_i \oplus Y_i$
 $C_{i-1} = (S_i X_i \lor \bar{X}_i Y_i)\, G$
 $C_i = (S_{i+1} X_{i+1} \lor \bar{X}_{i+1} Y_{i+1})\, G$
 
 The logical equations for a stored sign adder may be derived by changing 
 the sign of all non-zero digits in a truth table for the equations 
 given in Figure 1. 
 
 The gate signal, G, shown in Figure 1 is not inherent in
 the logical design of a stored sign adder or subtracter but is necessary
 for a particular application in Illiac III. During the assimilation
 of a redundantly represented result into conventional form the
 requirement arises for the Z output to be identical to the X input. In general,
 the addition or subtraction of zero will not guarantee this. However,
 with G=0, all interpositional inputs, $C_i$, are 0, and thus $Z_i = X_i \oplus Y_i$;
 the subtracter will perform the exclusive OR function with G=0.
 Furthermore if all $Y_i$ are also 0, then $Z_i = X_i$. The signal G will always be 1
 whenever the device is actually being used for addition or subtraction.
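
 The equations of Figure 1 can be exercised in software. The following is
 a minimal sketch, not the Illiac III logic itself: it assumes digits are
 held in Python lists with the most significant position first, and it
 adds one extra position on the left to receive the outgoing
 interpositional signal. The names `sd_value` and `sd_subtract` are
 illustrative and do not appear in the report.

```python
def sd_value(signs, mags):
    """Integer value of a signed-digit number, most significant digit first."""
    value = 0
    for s, x in zip(signs, mags):
        value = 2 * value + (-x if s else x)
    return value


def sd_subtract(S, X, Y, neg=0, gate=1):
    """Apply the Figure 1 equations to every position.

    S, X: sign and magnitude bits of the minuend; Y: subtrahend bits.
    Returns (T, Z), the sign and magnitude bits of the difference, with
    one extra leading position (an assumption of this sketch).
    """
    n = len(X)
    # c_out[p] is produced by position p and consumed by the position
    # to its left: C = (S X or (not X) Y) G
    c_out = [((S[p] & X[p]) | ((1 - X[p]) & Y[p])) & gate for p in range(n)]
    # Extra leading position: its own S, X, Y inputs are all 0.
    T, Z = [c_out[0] ^ neg], [c_out[0]]
    for p in range(n):
        c_in = c_out[p + 1] if p + 1 < n else 0  # nothing enters on the right
        T.append(c_in ^ neg)                     # T = C xor NEG
        Z.append(c_in ^ X[p] ^ Y[p])             # Z = C xor X xor Y
    return T, Z


# Minuend digits 1, -1, 0, 1 (value 5); subtrahend 0110 (value 6).
T, Z = sd_subtract([0, 1, 0, 0], [1, 1, 0, 1], [0, 1, 1, 0])
difference = sd_value(T, Z)   # 5 - 6
```

 With `gate=0` and an all-zero subtrahend the magnitude outputs simply
 repeat the magnitude inputs, which is the behaviour the gate signal G is
 described as providing during assimilation.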
 
 Properties 
 
 Input-Output Compatibility - An important property of the
 subtracter in the execution of iterative operations such as
 multiplication is the fact that the output is in the same signed-digit format
 as the input. $Z_i$ is the magnitude and $T_i$ is the sign of the ith digit
 of the output.

 Limited Borrow - The introduction of redundancy in the output
 of the subtracter has permitted the length of the borrow propagation
 chain to be drastically limited. The interpositional connection, $C_i$,
 is a function of only the inputs to the adjacent position, i+1. It is
 not a propagating borrow.
 
 Unique Zero - Note that although the representation is
 redundant, the representation of zero is unique except for sign. A
 number in the signed-digit format is zero if and only if all magnitude
 bits are zero. For a signed-digit representation in radix r, the
 requirement for a unique representation of zero demands that the
 magnitude of allowed digit values not exceed r-1.
 
 Negation - Another property of this logical structure is the
 ability to algebraically negate a number in signed-digit format merely by
 logically complementing all the sign bits. There is no analogous
 property for the conventional carry-save adder or borrow-save subtracter.
 This feature of the signed-digit subtracter permits additions and
 subtractions in a cascade of such devices to be interleaved in any manner
 desired.
 
 In Figure 1, NEG is a control signal which when set to
 logical 1 complements $T_i$ and when set to logical 0 allows $T_i$ to pass
 unchanged. Now consider a subtracter consisting of adjacent,
 interconnected positions such as shown in Figure 1 and let

 Y = the algebraic value of the subtrahend in conventional form,

 X* = the algebraic value of the minuend in signed-digit form, and

 Z* = the algebraic value of the difference.

 With NEG = '0' the device is truly a subtracter and Z* = X*-Y. With
 NEG = '1' the output is negated and thus Z* = -(X*-Y). Now note that
 if complementing circuits are added to the $S_i$ input so that both X*
 and Z* may be independently negated it is possible to form Z* = -(-X*-Y)
 = X* + Y and the device is adding. For many applications the negating
 circuits for the sign bits, $S_i$, need not be included in the subtracter
 per se but rather the same result is achieved by gating the complement
 outputs of the register containing S, or when the subtracters are
 cascaded, by negating the output of the previous stage.
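
 The identity Z* = -(-X*-Y) = X* + Y can be checked with a small model.
 This is a sketch under the same assumptions as the equations of
 Figure 1 with G = 1: lists hold the most significant position first and
 one extra leading position receives the outgoing interpositional signal.
 The helper names `value`, `subtract` and `add` are hypothetical.

```python
def value(signs, mags):
    """Integer value of a signed-digit number, most significant digit first."""
    v = 0
    for s, x in zip(signs, mags):
        v = 2 * v + (-x if s else x)
    return v


def subtract(S, X, Y, neg=0):
    """Signed-digit subtracter with G = 1; returns (T, Z) with one extra digit."""
    n = len(X)
    # c[p] leaves position p toward the left; c[n] = 0 enters on the right.
    c = [(S[p] & X[p]) | ((1 - X[p]) & Y[p]) for p in range(n)] + [0]
    T = [c[0] ^ neg] + [c[p + 1] ^ neg for p in range(n)]
    Z = [c[0]] + [c[p + 1] ^ X[p] ^ Y[p] for p in range(n)]
    return T, Z


def add(S, X, Y):
    """Form X* + Y as -(-X* - Y) using only sign complementation."""
    not_S = [1 - s for s in S]            # negate the minuend: -X*
    return subtract(not_S, X, Y, neg=1)   # negate the output: -(-X* - Y)


S, X = [0, 1, 0, 1], [1, 1, 0, 1]   # digits 1, -1, 0, -1: 8 - 4 - 1 = 3
Y = [0, 1, 1, 0]                    # conventional binary 6
T, Z = add(S, X, Y)
total = value(T, Z)                 # 3 + 6
```

 No carry or borrow is inserted anywhere in `add`; the change from
 subtraction to addition is made entirely by complementing sign bits.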
 
 This ability to negate a result while it is still in a 
 redundant form also expedites the execution of floating point addition 
 and subtraction. In the floating point format adopted for Illiac III 
 the mantissa is considered to be positive, i.e. to be a magnitude. 
 The sign is given by a bit apart from the mantissa. In multiplication 
 and division the sign of the result is the exclusive OR of the signs of
 the operands. In addition and subtraction the sign determination is 
 more complicated: it depends upon the signs and the relative magnitude 
 of the operands. 
 
 Consider two operands with magnitudes A and B and with signs
 SIGNA and SIGNB, respectively. A logical one denotes a negative
 quantity; a logical zero denotes a positive quantity. The table below
 gives the sign of the result as a function of the sign of the operands
 and their relative magnitude:

                      SIGN(A+B)        SIGN(A-B)
  SIGNA  SIGNB       A>B    A<B       A>B    A<B
    0      0          0      0         0      1
    0      1          0      1         0      0
    1      0          1      0         1      1
    1      1          1      1         1      0
 
 If the exponents of the operands are different, then the relative
 magnitude is readily determined from the difference of the exponents.
 But if the exponents are equal and SIGNA ≠ SIGNB for addition, or SIGNA
 = SIGNB for subtraction, then the sign of the result cannot be determined
 prior to actually performing the operation.
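
 The sign table can be stated as a short rule. The sketch below is an
 illustration only; `result_sign` and its parameters are hypothetical
 names, and it assumes the relative magnitude of the operands is already
 known and that the magnitudes are unequal.

```python
def result_sign(sign_a, sign_b, a_greater, subtract=False):
    """Sign bit (1 = negative) of A+B or A-B for sign-magnitude operands.

    sign_a, sign_b: operand sign bits; a_greater: True when magnitude A
    exceeds magnitude B (magnitudes are assumed to differ).
    """
    if subtract:
        sign_b ^= 1          # A - B is treated as A + (-B)
    if sign_a == sign_b:
        return sign_a        # magnitudes add; the common sign is kept
    # Signs differ: the operand of larger magnitude decides the sign.
    return sign_a if a_greater else sign_b


# A negative, B positive, magnitude of A larger: A + B is negative.
example = result_sign(1, 0, True)
```

 When the exponents are equal and the effective signs differ, the
 argument `a_greater` is exactly the quantity that is not available in
 advance, which is why the text resorts to a trial conversion.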
 
 First consider the cases in which the sign of the result
 may be determined. If the sign is known to be negative the result
 is negated prior to the conversion to a conventional form. The ability 
 to negate the redundant form of the result permits this. In cases 
 in which the sign is not known prior to calculation, the sign of the 
 result is assumed to be positive, the operation is performed and then 
 converted into a conventional form. The high order bit of the conver- 
 ted result is the sign. If it is negative then the redundant result 
 (still present on the outputs of the subtracter) is negated and then 
 again converted to a conventional form. The necessity for two 
 conversions would be avoided if the sign of the result could be deter- 
 mined from the redundant form. However, as discussed in the next 
 section, sign determination is complicated by use of redundant notation. 
 The logic required is of the same order of complexity as that required 
 to convert the redundant result to a conventional form in which the sign 
 is apparent. 
 
 Least Significant Digit - A basic property of a stored sign
 adder or subtracter is that the position of the least significant digit
 need not be known. A conventional adder used only for addition does
 not require the insertion of a carry into the least significant
 digital position. Similarly, a subtracter used only for subtraction
 does not require borrow insertion. Hence, the combination
 adder-subtracter does not require insertion of a carry during addition or a
 borrow during subtraction.
 
 Since there is no requirement for a carry or borrow 
 insertion in the least significant position, a signed-digit subtrac- 
 ter of a given length may be partitioned into several subtracters of 
 smaller length. Furthermore, by suitably partitioning the NEG con- 
 trol signal, addition could be performed in some group, while subtrac- 
 tion occurs in others. This facility is of application in variable 
 length operand formats and for parallel vector arithmetic. Although 
 neither of these are available in the initial version of the Illiac 
 III arithmetic units the potential usefulness of vector operations 
 influenced the decision to implement a signed-digit subtracter. 
 Vector facilities could be included in a subsequent version of the 
 arithmetic unit without major modifications. A very limited use of 
 this facility is being made in performing integer division. To be 
 compatible with the floating point division algorithm an integer 
 divisor or dividend in two's complement negative form is converted to 
 sign-magnitude form during a preliminary step. In performing this 
 conversion a 64-bit signed-digit subtracter is used as two 32-bit
 subtracters.
 
 Overflow Detection 
 
 For redundant representations it is possible to derive
 sufficient but not necessary conditions for overflow detection. Let

 $$Z^* = Z_0^* + \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

 with the constraint $-1 < Z^* < 1$. Inspection of $Z_0^*$ and $Z_1^*$ gives rise
 to three possible range conditions: overflow, no overflow, or maybe
 overflow. The latter condition means that overflow may or may not
 occur on assimilation to conventional form. Table 1 defines the
 range of $Z^*$ for all possible combinations of $Z_0^*$ and $Z_1^*$. In Illiac
 III overflow is checked only after the result has been converted to
 conventional form. There are sufficient subtracter positions to the
 left of the radix point to insure that no high-order digit is lost.

 [Table 1 - Overflow Detection Conditions. The body of the table is
 illegible in the scanned source.]

 But in using redundant representations it is not
 obvious what constitutes a "sufficient" number of subtracter positions.
 The decision is complicated by the fact that although the algebraic
 value of a redundantly represented number may be within the range of,
 say, n non-redundant digits, the actual form of the redundantly
 represented number may require more than n digits. This point is
 illustrated by an example.
 
 Consider an 8-bit integer represented in a conventional
 binary format. If I denotes this integer then the allowable positive
 range of I is $0 \le I \le 255$. Conversely, given a conventionally
 represented binary integer, I, in the range $0 \le I \le 255$, an 8-bit register
 should be adequate to hold I. Now let I* be a signed-digit version of
 I. We must now assign two bits per digital position of our 8 digit
 register; one for the sign and one for the magnitude. The term "digit"
 will now refer to one of these sign-magnitude positions. The
 temptation is to reason as follows: Due to the range restrictions imposed
 on I, it may be stored in an 8-bit register. Every I* is equivalent
 in value to an I, therefore I* may be stored in an 8 digit register.
 This reasoning may well be incorrect as illustrated by the following
 specific example.

 Let I = $10000000.$ = 128

 and let I* = $1\bar{1}0000000.$ = 128

 Although both I and I* are equivalent in value and both are in the
 range 0 to 255, I* is in a form requiring 9 digits. This behavior
 gives rise to a condition we shall call bogus overflow. The essence
 of the problem is the fact that a signed-digit subtracter or adder
 will sometimes transform a digit pattern of $0\bar{1}$ into $\bar{1}1$ or a pattern of $01$
 into $1\bar{1}$.
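
 The example is easy to confirm numerically. The sketch below assumes a
 signed-digit integer is written as a list of digits -1, 0 or 1 with the
 most significant digit first; `sd_value` is an illustrative helper, not
 part of the report.

```python
def sd_value(digits):
    """Value of a signed-digit integer; digits are -1, 0 or 1, MSD first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


conventional = [1, 0, 0, 0, 0, 0, 0, 0]      # I  = 10000000.
redundant = [1, -1, 0, 0, 0, 0, 0, 0, 0]     # I* = 1(-1)0000000.

print(sd_value(conventional), sd_value(redundant), len(redundant))  # prints 128 128 9

# The two digit-pair transformations named in the text preserve value:
same_negative = sd_value([0, -1]) == sd_value([-1, 1])   # both equal -1
same_positive = sd_value([0, 1]) == sd_value([1, -1])    # both equal +1
```

 Both forms denote 128, yet the redundant one occupies nine digit
 positions, which is the bogus overflow condition.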
 
 One method of coping with bogus overflow is to provide
 auxiliary register positions. It may be shown that if $I_1^*$ and $I_2^*$
 are both represented within n digits or less, the sum or difference of
 $I_1^*$ and $I_2^*$ is representable within n+1 digits. Note however, that when
 repetitive additions or subtractions are performed (even addition or
 subtraction of zero) each operation may generate another non-zero digit
 to the left. Once bogus overflow begins it tends to propagate leftward
 unless corrected. The implementation of positions to store the bogus
 overflow not only adds hardware costs to the subtracters and registers,
 it also burdens the assimilation logic. Although the assimilated number
 will be contained in only n digits, the assimilation logic must propagate
 borrows across n+k digits of the redundant form of the number. The maximum
 value of k is the number of additions or subtractions which take place prior
 to assimilation. But fortunately a procedure is available to control bogus
 overflow. We shall first state the procedure and then prove that it is valid.

 Statement:

 Consider the high-order byte of an Illiac III signed-digit
 subtracter. The positions are numbered 1 through 8. The radix point is to
 the right of position 8. If the inputs to position 1 are such that $S_1 X_1 = 1$
 then a bogus overflow will occur. Without implementing the 0th position, it
 may be corrected by complementing the sign of the result of position 1, i.e.
 by replacing $T_1$ by $\bar{T}_1$.

 Proof:

 The proof is presented in the Appendix.
 
 Truncation Error 
 
 Let

 $$Z^* = \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

 The first column of Table 2 gives the possible digital values of
 $Z_i^*$ for the output of a signed-digit subtracter or adder, for the output of
 a conventional carry-save adder, and for the output of the conventional
 borrow-save subtracter, all of length n to the right of the radix point.
 
                              Possible Values     Error $\Delta$ due to truncation to
                              of $Z_i^*$             the right of position e

 Signed-Digit                 $\bar{1}$, 0, 1          $-\tau \le \Delta \le \tau$

 Conventional Carry-Save      0, 1, 2             $0 \le \Delta \le \tau$

 Conventional Borrow-Save     $\bar{1}$, 0, 1          $-\tau \le \Delta \le \tau$

 where $\tau = 2^{-e} - 2^{-n}$

 Table 2 - Comparison of Digital Values
 and Truncation Error
 
 For the signed-digit subtracter and the conventional borrow- 
 save subtracter the symmetry of the digital values gives rise to a 
 symmetric truncation error. As described in the next section, this 
 property tends to improve the precision of the results of floating 
 point multiplication. 
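
 The symmetric bound for the signed-digit case follows from a one-line
 geometric sum. The derivation below is a sketch that assumes truncation
 discards positions e+1 through n and that each discarded digit is
 $\bar{1}$, 0 or 1; the symbol $\Delta$ stands for the discarded value.

```latex
\Delta = \sum_{i=e+1}^{n} Z_i^{*}\, 2^{-i},
\qquad Z_i^{*} \in \{\bar{1}, 0, 1\}
\quad\Longrightarrow\quad
|\Delta| \;\le\; \sum_{i=e+1}^{n} 2^{-i} \;=\; 2^{-e} - 2^{-n}.
```

 The extremes are reached when every discarded digit is 1 or every
 discarded digit is $\bar{1}$, so the error is centered on zero.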
 
 Sign Detection 
 
 Let Z* be the algebraic value of a number in the signed-digit
 format, i.e.

 $$Z^* = \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

 where $Z_i^* \in \{\bar{1}, 0, 1\}$.

 The sign of Z* is the sign of the highest order, non-zero digit.
 Unlike the sign in a non-redundant system, the sign of a number in
 signed-digit format is not readily available.

 Let

 $$Z^* = \sum_{i=1}^{n} (1 - 2S_i)\, X_i \, 2^{-i}$$

 where $S_i$ and $X_i \in \{0, 1\}$.

 Now treating $S_i$ and $X_i$ as Boolean values, the sign of Z* is given
 by the following:

 $$\mathrm{SIGNZ} = S_1 X_1 \lor S_2 X_2 \bar{X}_1 \lor S_3 X_3 \bar{X}_1 \bar{X}_2 \lor \cdots \lor S_n X_n \bar{X}_1 \bar{X}_2 \cdots \bar{X}_{n-1}$$

 If SIGNZ = 0 then Z* is positive and if SIGNZ = 1 then Z* is
 negative. If Z* = 0 then SIGNZ = 0. The implementation of the
 equation for SIGNZ becomes very expensive for large n. In Illiac III
 sign determination is made only after a result has been assimilated
 into a non-redundant form.
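
 The SIGNZ expression amounts to finding the first non-zero magnitude bit
 from the left and reporting its sign. A minimal sketch, assuming lists
 of sign and magnitude bits with position 1 first; the function names
 are illustrative.

```python
def signz(S, X):
    """SIGNZ: OR over i of S_i X_i (not X_1) ... (not X_{i-1})."""
    higher_all_zero = 1      # 1 while every more significant X is 0
    result = 0
    for s, x in zip(S, X):
        result |= s & x & higher_all_zero
        higher_all_zero &= 1 - x
    return result


def sd_value(S, X):
    """Value scaled by 2**n, used only to check signz."""
    v = 0
    for s, x in zip(S, X):
        v = 2 * v + (-x if s else x)
    return v


# Digits 0, -1, 1, 1: the leading non-zero digit is negative.
example = signz([0, 1, 0, 0], [0, 1, 1, 1])
```

 In hardware the i-th term needs a gate with i+1 inputs, which is the
 growth in cost for large n that the text refers to.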
 
 Assimilation 
 
 Although arithmetic operations are computed in the
 redundant signed-digit format, the results are eventually converted into
 a conventional form, i.e., the sign bits are assimilated. A
 negative number will be represented in two's complement. The
 requirement is that the redundantly represented number,

 $$Z^* = Z_0^* + \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

 with $Z_i^* \in \{\bar{1}, 0, 1\}$, be converted to a conventional notation of
 the form

 $$A = -2A_{-1} + A_0 + \sum_{i=1}^{n} A_i \, 2^{-i}$$

 with $A_i \in \{0, 1\}$ such that A = Z*. $A_{-1}$ is the sign bit. This conversion requires
 a borrow propagation followed by an exclusive OR operation.

 Let $Z_i$ and $T_i$ be the magnitude and sign bit, respectively,
 of the digit $Z_i^*$. The propagation logic produces borrow bits, $B_i$,
 defined as follows:

 $$B_{i-1} = B_i \bar{Z}_i \lor T_i Z_i$$

 where i = n, n-1, ..., 0; $B_n = 0$. The assimilated result, A, is
 produced by evaluating

 $$A_i = Z_i \oplus B_i$$

 for i = 0, 1, ..., n. $A_{-1}$, the sign, equals $B_{-1}$. Note that the
 recursive definition of $B_i$ is essentially an evaluation of the
 following:

 (Borrow from i-1 position) = (Borrow from i position)($Z_i^* = 0$)
                              $\lor$ ($Z_i^* = \bar{1}$)
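
 The two assimilation steps can be traced with a short routine. This is
 a serial sketch of the recurrence only (the lookahead acceleration is not
 modelled), and it assumes the digit lists run from position 0 to
 position n; `assimilate` and `twos_complement_value` are hypothetical
 names.

```python
def assimilate(T, Z):
    """Convert signed-digit (T, Z), positions 0..n, to bits A_-1, A_0, ..., A_n."""
    borrow = 0                                    # B_n = 0
    bits = []
    for t, z in zip(reversed(T), reversed(Z)):    # i = n, n-1, ..., 0
        bits.append(z ^ borrow)                   # A_i = Z_i xor B_i
        borrow = (borrow & (1 - z)) | (t & z)     # B_{i-1} = B_i (not Z_i) or T_i Z_i
    bits.append(borrow)                           # sign bit A_-1 = B_-1
    return bits[::-1]


def twos_complement_value(bits):
    """Value scaled by 2**n; bits[0] is the sign bit A_-1."""
    v = -bits[0]
    for b in bits[1:]:
        v = 2 * v + b
    return v


# Digits 0, -1, 1, 0 (scaled value -4 + 2 = -2).
A = assimilate([0, 1, 0, 0], [0, 1, 1, 0])
```

 A sign bit paired with a zero magnitude (the "-0" digit) generates no
 borrow, because the generate term is $T_i Z_i$ rather than $T_i$ alone.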
 
 In actual practice a signed-digit subtracter may be
 used to perform the second step of assimilation, the formation
 of $Z_i \oplus B_i$. Recall from Figure 1 that

 $$Z_i = C_i \oplus X_i \oplus Y_i$$

 If the $X_i$ input is supplied with the magnitude bit $Z_i$ being
 assimilated, $Y_i = B_i$ and $C_i = 0$, then the $Z_i$ output equals $A_i$.
 Since $C_i = (S_{i+1} X_{i+1} \lor \bar{X}_{i+1} Y_{i+1})\, G$, $C_i$ may be forced to 0 by
 setting G=0.

 In Illiac III the formation of the $B_i$ bits has been
 accelerated by use of lookahead techniques. The $B_i$ bits for i
 equal 1 to 64 are formed in 10 collector delays.
 
 Implementation

 Figure 2 illustrates the logic of one position of the
 signed-digit subtracter. The logic symbols conform to MIL-STD-806B.
 The AND gates are implemented with diodes; the NOR gates are DTL.
 The operating sequence for two adjacent positions, i and i+1, is
 as follows:

 1. $\mathrm{COUT}_{i+1}$ and its complement are formed from $S_{i+1}$,
 $X_{i+1}$, $Y_{i+1}$ and G in one collector delay. Note that $\mathrm{COUT}_{i+1} = \mathrm{CIN}_i$.

 2. $T_i$ and $Z_i$ are formed in one additional collector delay.

 3. The complements of $T_i$ and $Z_i$ are formed in one
 collector delay. The complements are necessary as inputs to the
 next subtracter in the cascade.

 [Figure 2 - Logic of One Position of the Signed-Digit Subtracter]

 Using this logic parallel addition or subtraction takes
 place in three collector delays. A block diagram of the entire
 adder-subtracter complex is shown in the section describing
 multiplication. It consists of a cascade of four subtracters,
 each 64 positions wide.
 
MULTIPLICATION 
 
 Background 
 
 Multiplication in a digital arithmetic unit is generally 
 accomplished by over-and-over addition of multiples of the multi- 
 plicand with the contents of an accumulator. One way to accelerate 
 the execution of multiplication is to decrease the time required to 
 add the multiplicand to the partial product. The efficacy of a 
 reduced add time is the primary motivation for the use of a 
 borrow-save device such as the signed-digit subtracter. Another 
 technique for accelerating the execution of multiplication is to 
 accommodate more than one bit of the multiplier per iteration. 
 Such a scheme may be viewed as multiplication in radix r, where
 $r = 2^k$, with k equal to the number of bits inspected per iteration.
 
 While use of a higher radix has the advantage of reducing
 the number of iterations by a factor of k over the binary case,
 it has the disadvantage of requiring additional multiples of the
 multiplicand. For a non-redundant number system, multiplication
 in radix r requires the multiples 0, 1, 2, ..., (r-1) times the
 multiplicand. If, however, a redundant number system is adopted
 then the multiples 0, 1, 2, ..., (r-1) may be transformed into the
 multiples -r/2, -(r/2 - 1), ..., 0, 1, ..., r/2 (for even radices).
 In this new set of multiples, half of the members are merely the
 complement of the others. For the specific case r = 4, the set
 {0, 1, 2, 3} may be replaced by the set {$\bar{2}$, $\bar{1}$, 0, 1, 2}. Note
 that in fact we do have redundancy in the second set, since there
 are more than r (in this case five) digit symbols. The multiple of
 3 in the first set is awkward or costly to form, but in the second
 set all multiples may be formed by shifting and complementation.

 It is useful to view this transformation as a recoding
 of groups of k bits of the multiplier represented in conventional
 form into digits belonging to the redundant set in such a manner
 that algebraic equivalence is maintained. Additional information
 on the theory of multiplier recoding may be found in references
 [4] and [5]. Parts of these works are concerned with recodings
 which permit the probability of a 0 digit to be high. This
 property is important in an implementation in which an adder
 is bypassed if a multiple of 0 is selected. In Illiac III,
 however, this property is not stressed since the addition time
 is at least as fast as a bypass.
 
 Recoding Scheme 
 
 The recoding scheme adopted for Illiac III was suggested
 by Wallace [6]. It is first defined for radix 4 but will be
 extended to radix 256. The recoding actually requires the
 parallel inspection of three bits of the multiplier. If $X_i$ is
 the low-order bit of the multiplier, then the bits inspected are
 $X_{i-1}$, $X_i$, and $X_{i+1}$. The bit $X_{i+1}$ is an extra position at the
 right of the least significant bit of the multiplier. It is
 initially 0, but after the first right shift of the multiplier it
 will equal the previous $X_{i-1}$, which may not be 0. In a sense,
 $X_{i+1}$ is the indicator of what "mistake" was made on the previous
 cycle. The recoding is shown in Table 3. It will accommodate
 a negative number in two's complement representation.

   X_{i-1}   X_i   X_{i+1}     Recoded Digit / Multiple Selected
      0       0       0                  0
      0       0       1                 +1
      0       1       0                 +1
      0       1       1                 +2
      1       0       0                 -2
      1       0       1                 -1
      1       1       0                 -1
      1       1       1                  0

 TABLE 3 - Multiplier Recoding Scheme
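
 The rows of Table 3 follow the rule digit = $-2X_{i-1} + X_i + X_{i+1}$. The
 sketch below applies that rule to a whole multiplier. It assumes an even
 word length and a two's complement multiplier held in a Python integer;
 `recode_radix4` is an illustrative name, and the serial loop stands in
 for selections that the hardware makes simultaneously.

```python
def recode_radix4(multiplier, nbits):
    """Recode an nbits-wide two's complement multiplier (nbits even).

    Returns radix-4 digits in {-2, -1, 0, 1, 2}, least significant first.
    """
    bits = multiplier & ((1 << nbits) - 1)   # two's complement bit pattern
    digits = []
    extra = 0                                # X_{i+1}: initially 0
    for k in range(0, nbits, 2):
        x_i = (bits >> k) & 1                # low-order bit of the pair
        x_im1 = (bits >> (k + 1)) & 1        # high-order bit of the pair
        digits.append(-2 * x_im1 + x_i + extra)
        extra = x_im1                        # carried into the next group
    return digits


def radix4_value(digits):
    """Recombine the recoded digits into an integer."""
    return sum(d * 4 ** j for j, d in enumerate(digits))


digits = recode_radix4(-37, 8)
restored = radix4_value(digits)
```

 Every recoded digit selects 0, the multiplicand, or twice the
 multiplicand, with or without negation, so no multiple of 3 is required.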
 
 
The Wallace recoding scheme has the following 
 advantages : 
 
 1. It requires little logic. 
 
 2. All selections can be made simultaneously; the 
 recoding is not a serial process. 
 
 3. The multiples used can be obtained from the 
 multiplicand by the processes of complementation and displacement. 
 
 4. It applies without alteration to the leftmost
 digits of the multiplier.
 
 Multiplication has been further accelerated by cascading 
 four signed-digit subtracters between the primary and secondary 
 ranks of the accumulator. A radix 4 multiplication takes place
 at each subtracter: the result is a radix 256 multiplication for 
 a complete pass. Eight bits of the multiplier are retired per 
 iteration. The motivation for cascading subtracters is demonstrated 
 by the following: 
 
 Let

 $t_m$ = the time required to execute the iterative part
       of the multiplication,

 $t_a$ = the time required to add or subtract,

 $t_s$ = the summation of the following times:
       time to load the secondary accumulator,
       propagation time through the shift gates into
       the primary accumulator,
       time to load the primary accumulator,
       propagation time through the gates on the output
       of the primary accumulator,
       control overhead time;

 $n_a$ = the number of additions,

 $n_s$ = the number of shifts.

 Thus

 $$t_m = n_a t_a + n_s t_s$$

 If N is the number of bits in the multiplier, r' is the
 radix of the multiplication performed at each subtracter and K is
 the number of subtracters in cascade then,

 $$n_a = \frac{N}{\log_2 r'}, \qquad n_s = \frac{n_a}{K}, \qquad t_m = \frac{N}{\log_2 r'} \left( t_a + \frac{t_s}{K} \right)$$

 The radix of the multiplication from accumulator to
 accumulator is given by $r = 2^{K \log_2 r'}$. For the Illiac III implementation,
 $t_a$ = 3 delays, $t_s$ = 8 delays, N = 56, and r' = 4. The table below
 gives $t_m$ for K = 1 to 6.
 
 K     $t_m$ (collector delays)     Percent Decrease

 1              308                        0
 2              196                       36
 3              158                       49
 4              140                       55
 5              129                       58
 6              121                       61
 
 Increasing the number of subtracters decreases $t_m$, but
 by a decreasing amount. The 36% decrease in $t_m$ for doubling the
 number of subtracters is substantial. The 55% decrease for
 quadrupling the subtracters is less impressive but was nevertheless
 deemed justifiable in light of the anticipated high demand for
 multiplications. The following factors also contributed to this
 decision:

 1. A radix 256 structure is highly compatible with byte
 oriented data formats.

 2. Control complexity and overhead are decreased.

 3. The structure can be used to accelerate division
 and thus the cost is amortized across both operations.
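
 The timing relation for the iterative part of the multiplication can be checked numerically. The sketch below is only an illustration; the function and parameter names are hypothetical, and the constants are the Illiac III values quoted in the text (3 and 8 collector delays, N = 56, radix 4 per subtracter).

```python
import math


def iteration_time(n_bits, stage_radix, cascade_len, t_add, t_shift):
    """t_m = n_a * t_a + n_s * t_s, in collector delays."""
    n_add = n_bits / math.log2(stage_radix)   # additions: N / log2(r')
    n_shift = n_add / cascade_len             # shifts: n_a / K
    return n_add * t_add + n_shift * t_shift


baseline = iteration_time(56, 4, 1, 3, 8)
for k in range(1, 7):
    t_m = iteration_time(56, 4, k, 3, 8)
    decrease = 100 * (baseline - t_m) / baseline
    print(k, round(t_m, 1), round(decrease))
```

 The computed values agree with the tabulated ones to within one collector delay; the shift term shrinks as 1/K while the addition term is fixed, which is the source of the diminishing return.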
 
 
Multiplication Structure 
 
 Figure 3 is a block diagram of an Illiac III arithmetic 
 unit. The conventions used in this figure are as follows: 
 
 1) Functional sub-blocks are denoted by rectangles. 
 Inside each box is the name of the block followed by a 
 list of the names of signals which control it. 
 
 2) The lines between boxes denote data buses. 
 
 3) Selector signal names are of the form F X T, where F 
 is the name of the register from which the data is 
 transferred, and T is the name of the register to which 
 the data is transferred. 
 
 X = D if the transfer is direct, i.e. without shifting.
 X = Rn if the data is shifted n places to the right
 during the transfer.
 X = Ln if the data is shifted n places to the left
 during the transfer.

 4) A register name standing alone, for example, UQ, denotes
 the true output of all positions of the register. A
 subsection of a register is specified in the following form:
 <register name> np,
 where n is the number of the first byte (8 bits per byte)
 of the subsection and p is the number of the last byte of
 the subsection. Byte numbering is 0 through 7. Example:
 VDUH47 means V-BUS Direct to UH-Register, bytes 4 through 7.
 
 5) If R denotes the name of a register, then RSEL denotes 
 the output of the associated input selector. 
 
 6) If R denotes the name of a register, then LDR denotes 
 the signal which loads the output of the associated 
 selector into the register flip-flops. 
 
 7) All selectors, registers, subtracters and shift gates
 are 64 bits (8 bytes) wide, except for the M-Register
 which is 56 bits wide.

 8) The signed digit subtracters are denoted SDS1 through
 SDS4.
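
 The selector naming convention in item 3 is regular enough to be decoded mechanically. The following sketch is an illustration only: the list of destination names is an assumption collected from signal names that appear in the text, and the parser itself is hypothetical rather than part of the Illiac III design.

```python
import re

# Assumed destination names, collected from signal names used in the text.
DESTINATIONS = ("UH", "UQ", "UM", "US", "LM", "LS", "M", "Y1", "Y2", "Y3", "Y4")

SIGNAL = re.compile(
    r"^(?P<src>[A-Z]+?)(?P<kind>D|[RL]\d+)(?P<dst>"
    + "|".join(DESTINATIONS)
    + r")(?P<bytes>\d\d)?$"
)


def parse_selector(name):
    """Split a selector name of the form F X T [np] into its parts."""
    match = SIGNAL.match(name)
    if match is None:
        raise ValueError("not a selector signal name: " + name)
    kind = match.group("kind")
    shift = 0 if kind == "D" else int(kind[1:])
    direction = {"D": "direct", "R": "right", "L": "left"}[kind[0]]
    byte_range = match.group("bytes")
    return {
        "source": match.group("src"),
        "direction": direction,
        "shift": shift,
        "destination": match.group("dst"),
        "bytes": (int(byte_range[0]), int(byte_range[1])) if byte_range else None,
    }


print(parse_selector("VDUH47"))
print(parse_selector("ML7Y1"))
```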
 
 [Figure 3: block diagram of an Illiac III arithmetic unit]
 
One multiplication cycle consists of a sequence of four
 radix 4 multiplications; multiplication is thus performed radix 256.
 Nine bits of the multiplier stored in the UQ Register are recoded
 simultaneously to control the gates of the M Shift Array and the
 NEG signals of the signed-digit subtracters, which determine whether
 addition or subtraction is performed.

 The shifters are all logically identical; however,
 they are connected to the appropriate subtracter so that, with
 respect to the radix point of the subtracters (between the first
 and second bytes), the values of the multiples are as shown below:

 SDS No.     Multiples Selected

    1         0, ±128, ±64
    2         0, ±32, ±16
    3         0, ±8, ±4
    4         0, ±2, ±1

 The recoding is performed in three bit, overlapping
 groups according to the specifications in Table 3. Figure 4
 illustrates the low-order byte plus the extra right-most bit of the
 UQ register and the shift gates each group controls.
 
 UQ BIT NO.     57   58   59   60   61   62   63   64   65

 Bits 57-59 produce signals:   ML7Y1, ML6Y1
 Bits 59-61 produce signals:   ML5Y2, ML4Y2
 Bits 61-63 produce signals:   ML3Y3, ML2Y3
 Bits 63-65 produce signals:   ML1Y4, MDY4

 FIGURE 4. MULTIPLIER BIT, SHIFT GATE CORRESPONDENCE.
 
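The logic equations actually implemented in the Multiplier
 Recode box are shown below. The MYNEG (Multiply Negation) signals
 are used to set the NEG controls of the SDS to select whether the
 multiple is added or subtracted.

 ML7Y1 = $(\overline{UQ_{57}}\, UQ_{58}\, UQ_{59}) \lor (UQ_{57}\, \overline{UQ_{58}}\, \overline{UQ_{59}})$

 ML6Y1 = $UQ_{58} \oplus UQ_{59}$

 ML5Y2 = $(\overline{UQ_{59}}\, UQ_{60}\, UQ_{61}) \lor (UQ_{59}\, \overline{UQ_{60}}\, \overline{UQ_{61}})$

 ML4Y2 = $UQ_{60} \oplus UQ_{61}$

 ML3Y3 = $(\overline{UQ_{61}}\, UQ_{62}\, UQ_{63}) \lor (UQ_{61}\, \overline{UQ_{62}}\, \overline{UQ_{63}})$

 ML2Y3 = $UQ_{62} \oplus UQ_{63}$

 ML1Y4 = $(\overline{UQ_{63}}\, UQ_{64}\, UQ_{65}) \lor (UQ_{63}\, \overline{UQ_{64}}\, \overline{UQ_{65}})$

 MDY4 = $UQ_{64} \oplus UQ_{65}$

 NEG0 = $UQ_{57}$

 NEG1 = $UQ_{57} \oplus UQ_{59}$

 NEG2 = $UQ_{59} \oplus UQ_{61}$

 NEG3 = $UQ_{61} \oplus UQ_{63}$

 NEG4 = $UQ_{63}$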
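
 The shift-gate equations can be checked against the recoding table by enumeration. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the sign of the selected multiple follows the leftmost bit of each trio, as in Table 3. It does not model how the NEG signals are combined inside the subtracter cascade.

```python
def shift_gates(hi, mid, lo):
    """Gate selects for one trio of multiplier bits; hi is the leftmost bit.

    Returns (times_two, times_one): the "twice" gate (e.g. ML7Y1) and the
    "once" gate (e.g. ML6Y1) of the stage the trio controls.
    """
    times_two = (not hi and mid and lo) or (hi and not mid and not lo)
    times_one = mid != lo                     # exclusive OR of the low two bits
    return bool(times_two), bool(times_one)


def gate_digit(hi, mid, lo):
    """Signed digit implied by the gate selects and the sign bit."""
    times_two, times_one = shift_gates(hi, mid, lo)
    magnitude = 2 if times_two else (1 if times_one else 0)
    return -magnitude if hi else magnitude


def table_digit(hi, mid, lo):
    """Recoded digit as listed in the multiplier recoding table."""
    return -2 * hi + mid + lo


for bits in range(8):
    hi, mid, lo = (bits >> 2) & 1, (bits >> 1) & 1, bits & 1
    print(hi, mid, lo, shift_gates(hi, mid, lo), gate_digit(hi, mid, lo))
```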
 
 Brief Operational Description 
 
 The fractional part of the multiplier is loaded into
 the UQ-Register from the V-BUS. The fractional part of the
 multiplicand is loaded into the UH-Register from the V-BUS and
 then forwarded to the M-Register. Both fractions are 7 bytes
 (56 bits) long. The low-order byte of the UQ-Register, plus an
 additional position UQ65 (initially 0), drives the multiplier
 recoder. One multiplication loop consists of the following
 sequence of steps:
 
1) Recoder sets up shift gates and NEG signals. Contents 
 of US-UM (accumulated result in signed-digit format) 
 gated into subtracter cascade. 
 
 2) Output of subtracter cascade loaded into secondary 
 rank of accumulator, LS-LM. 
 
 3) Multiplier shifted right 8 bits. Secondary rank of 
 accumulator (LS-LM ) shifted right 8 bits into 
 primary rank ( US-UM). 
 
 This loop is executed seven times, once for each byte of
 the multiplier. At the end of seven loops, UQ65 may be 1. If so,
 then 1 times the multiplicand must be added to the partial product.
 This is accomplished during the assimilation pass, the steps of
 which are as follows:

 1) Turn off subtracters 2, 3, 4 by setting G2 = G3 = G4 = 0.
 Set PDY4 (Propagation Logic Direct to Y4 input on
 subtracter 4). Set MDY1 if UQ65 = 1. Set NEG0 =
 NEG1 = 1; other NEG signals set to 0.

 2) Gate US-UM into subtracter cascade. The T and Z
 outputs of signed-digit subtracter 1 (SDS1) drive
 the propagation logic. Meanwhile the Z bits from
 SDS1 propagate through SDS2 and SDS3. In SDS4 the
 output of the propagation logic and the Z bits are
 combined in an exclusive OR to produce the result in
 a conventional form. This assimilated result is
 stored in the LM-Register and then forwarded to the
 UQ-Register. The UQ-Register serves as an input-output
 buffer.
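
 The arithmetic of the loop and of the final correction can be illustrated with ordinary integers. The sketch below is a simplified model under stated assumptions: operands are treated as unsigned integers rather than 56-bit fractions, the accumulator keeps full precision instead of shifting right and truncating, and no signed-digit subtracters are modeled. The function name is hypothetical.

```python
def multiply_radix256(multiplicand, multiplier, n_bytes=7):
    """Byte-at-a-time multiplication with four radix 4 recoded digits per pass."""
    product = 0
    extra_bit = 0                                   # models UQ65, initially 0
    for k in range(n_bytes):
        byte = (multiplier >> (8 * k)) & 0xFF
        window = (byte << 1) | extra_bit            # nine bits recoded together
        pass_multiple = 0
        for stage in range(4):                      # stage 3 has weight 64
            trio = (window >> (2 * stage)) & 0b111
            digit = -2 * (trio >> 2) + ((trio >> 1) & 1) + (trio & 1)
            pass_multiple += digit * 4 ** stage
        product += (pass_multiple * multiplicand) << (8 * k)
        extra_bit = byte >> 7                       # becomes the extra bit of the next pass
    if extra_bit:                                   # final correction: add 1 x multiplicand
        product += multiplicand << (8 * n_bytes)
    return product


print(multiply_radix256(3, 5) == 15)
```

 The final correction is needed exactly when the highest-order bit of the multiplier is 1, because the recoding of the last byte then treats that byte as negative.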
 
 The range of a normalized, non-zero fraction, f, is given by $1/16 \le f < 1$.
 The product of two such fractions, $f_1$ and $f_2$, therefore lies in the range
 $1/256 \le f_1 f_2 < 1$. A product may require a terminal left shift of
 4 bits accompanied by a reduction of its exponent. If zeros were
 inserted in the low-order 4 bits during the shift then the precision
 of the result would be impaired. The value of these bits, although
 actually computed, would normally be lost in the last right shift
 from the LS-LM Registers to the US-UM Registers. Logic has been
 added which assimilates and stores the four digits before they are
 lost by shifting. The borrow from this assimilation is coupled
 into position 64 of the Propagation Logic. If a terminal shift is
 required these four bits, rather than zeros, are shifted into the
 low-order positions.
 
 Truncation Error 
 
 It is difficult to identify a normalized result while it
 is represented redundantly. For this reason the 4 extra low-order
 bits of the product are always assimilated but used only if the
 full-precision result requires a left shift for normalization.
 For purposes of error analysis we may assume that 60 rather than
 56 bits of the product are assimilated. The range of the truncation
 error, e, due to truncation after 60 signed-digits is given
 by $-2^{-60} < e < 2^{-60}$. In cases in which no left shift is required, the
 four low-order bits of the assimilated result are dropped and thus
 e is in the range $-2^{-60} < e < 2^{-56}$. If the left shift is required
 then e is in the range $-2^{-56} < e < 2^{-56}$. This latter range is
 worst case, but since in general the programmer will not know
 whether or not the shift has occurred, it must be taken as the
 best guaranteed precision.
 
 The entire arithmetic unit has been simulated using PL/1.
 The simulation of multiplication brought to light an interesting
 property of the signed-digit representation: the tendency to
 produce rounded results. The results of the simulator were
 compared with the results for the same operations performed by the
 arithmetic unit of the IBM/360/75. Frequently the result produced
 by the simulator was greater than that of the 360 by $2^{-56}$: a 1 in
 the least significant position.

 It was determined that the IBM/360 was producing the
 result by what is equivalent to truncating a double precision
 result in conventional form. No rounding was performed. In the
 Illiac III multiplication scheme, since the signed-digits may
 be positive or negative, either a positive or a negative truncation
 error is introduced by each right shift in the multiplication loop.
 In the cases observed these errors tended to cancel. The result
 produced is the same as would be produced by the IBM/360 if rounding
 occurred at the position of weight $2^{-56}$ based upon the value of
 the bit to the right. Subsequent work by Robertson [7], based
 upon work by Rohatsch [2], has shown that for the signed-digit
 format, the probability of obtaining a rounded result is 5/6.
 
DIVISION
 
 Background. 
 
 Robertson [8] has proposed a class of division techniques
 in which quotient digits are selected based upon an inspection of only
 a few high-order bits of the divisor and partial remainder. The quotient
 selection mechanism may be viewed as a model of the full precision division
 mechanism. The model division uses truncated versions of the divisor
 and partial remainders to produce quotient digits which are in turn used
 in forming the next full precision partial remainder. The division procedure
 used in the model need bear no relationship to a conventional
 division procedure, in particular, to the full precision procedure. The
 procedure for the model division of Illiac III is a radix 4 table
 look-up. The nature of this class of division techniques is explored
 in detail in a paper by Atkins [9].
 
 The model division determines which multiples of the divisor 
 are to be subtracted from the partial remainder. In this respect it is 
 analogous to the multiplier recoder and may, in fact, be viewed as a 
 quotient recoder. In multiplication the recoder introduces redundancy into 
 the representation of the multiplier; in division the recoder introduces 
 redundancy into the representation of the quotient. The quotient recoder
 is, however, complicated by the following properties of the division
 algorithm:
 
 1. The quotient recoding is a function of both the divisor 
 and the partial remainder. 
 
 2. The partial remainder, unlike the divisor or the 
 multiplier, is not constant throughout the operation. 
 
 3. Since partial remainders are formed with the signed- 
 digit subtracters, they are represented redundantly. 
 
 But despite these complications, the strong analogy between multiplier 
 recoding and the concept of the model division leads to a division 
 scheme which is highly compatible with the multiplication scheme des- 
 cribed in the previous section. 
 
 
Model Division 
 
 As shown in the Atkins paper [9], a radix 4 division may be
 performed using a table look-up on inputs consisting of the four high-order
 bits of the divisor and the six high-order bits of the shifted
 partial remainder. The output of the table is a quotient digit value
 of either $\bar{2}$, $\bar{1}$, 0, 1, or 2. In the most brute force form the table look-up
 may be thought of as a grid or matrix. The vertical lines are outputs
 of decoders applied to d, the truncated (4 bit) version of the divisor;
 the horizontal lines are outputs of decoders applied to rp, the truncated
 (6 bit) version of the shifted partial remainder. At each intersection
 of the lines is an AND gate with one input connected to the vertical line
 and the other connected to the horizontal line. Each point of intersection
 corresponds to a quotient digit value, i, and thus the output of each
 AND gate is connected to an input of the OR gate with output corresponding
 to the quotient digit, q = i.
 
 The size of the table is constrained by restrictions on
 the range of the divisors and partial remainders. The divisor is normalized
 in the range $1/2 \le d < 1$. Due to certain properties of the division scheme
 (see Ref. [9]), any partial remainder, say $p_j$, must be in the range
 $|p_j| \le \frac{2}{3} d$. The shifted partial remainder, $r p_j$, where r is the radix
 and j is the recursive index, must be in the range $|r p_j| \le \frac{8}{3} d$ when
 r = 4. The divisor is always positive; the partial remainder may be either
 positive or negative since the division is nonrestoring. The actual
 implementation is not nearly as formidable as the brute force attack
 might imply. This will be demonstrated using the actual equations for
 the Illiac III model division.
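
 The range restrictions can be illustrated with a small exact-arithmetic sketch. It is not the Illiac III table look-up: the digit is chosen here by a full-precision comparison (nearest integer, limited to magnitude 2) instead of by truncated estimates, and all names are hypothetical. It only shows that a digit set of -2 to 2 keeps every partial remainder within 2/3 of the divisor.

```python
from fractions import Fraction


def select_digit(shifted_remainder, divisor):
    """Pick the digit in {-2, ..., 2} nearest to shifted_remainder / divisor."""
    q = round(shifted_remainder / divisor)
    return max(-2, min(2, q))


def divide_radix4(dividend, divisor, n_digits):
    """Nonrestoring radix 4 division; returns (digits, final partial remainder)."""
    bound = Fraction(2, 3) * divisor
    assert abs(dividend) <= bound, "initial partial remainder out of range"
    p = dividend
    digits = []
    for _ in range(n_digits):
        rp = 4 * p                          # shifted partial remainder
        q = select_digit(rp, divisor)
        p = rp - q * divisor                # next partial remainder
        assert abs(p) <= bound              # range restriction is preserved
        digits.append(q)
    return digits, p


digits, remainder = divide_radix4(Fraction(1, 3), Fraction(3, 4), 10)
print(digits, remainder)
```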
 
 It is prohibitively expensive to apply the redundant form
 of the partial remainder directly to a table look-up. In a redundant
 number system one algebraic value may be disguised in many forms. The
 6 digit estimate of the partial remainder is therefore assimilated into
 a conventional radix complement form prior to the table look-up. The
 assimilated version is of the form $A_0 A_1 A_2 . A_3 A_4 A_5 A_6$, where $A_0$ is the sign.
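
 Why assimilation is needed can be seen from a small sketch. It assumes, for illustration only, that each signed binary digit is held as a sign bit and a magnitude bit, most significant digit first; the function names are hypothetical and no borrow-propagation hardware is modeled.

```python
def assimilate(signs, magnitudes):
    """Algebraic value of a signed-digit string (sign 1 means a negative digit)."""
    positive = 0
    negative = 0
    for s, m in zip(signs, magnitudes):
        positive = 2 * positive + (m if not s else 0)
        negative = 2 * negative + (m if s else 0)
    return positive - negative          # one conventional value per algebraic value


def twos_complement(value, width):
    """Bit pattern of value in a width-bit radix complement form."""
    return value & ((1 << width) - 1)


# Two different digit strings with the same algebraic value (+1):
print(assimilate([0, 1], [1, 1]), assimilate([0, 0], [0, 1]))
# A six digit estimate becomes a seven bit two's complement pattern:
print(format(twos_complement(assimilate([1, 0, 0, 1, 0, 0], [1, 0, 1, 1, 0, 1]), 7), "07b"))
```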
 
 
The estimates of the divisor are decoded into the intervals shown in
 Table 4. Since the divisor is stored in the M-Register and since the
 radix point is between positions 8 and 9, the high-order four bits of
 the divisor are designated $M_9$, $M_{10}$, $M_{11}$ and $M_{12}$. Note that since d is
 at least 1/2, $M_9$ is always 1.

 Interval
 Name        Logic Equations                                                Range of divisor, d, represented

 $D_1$       $\overline{M_{10}}\,\overline{M_{11}}\,\overline{M_{12}}$       $1/2 \le d < 9/16$
 $D_2$       $\overline{M_{10}}\,\overline{M_{11}}\,M_{12}$                  $9/16 \le d < 5/8$
 $D_3$       $\overline{M_{10}}\,M_{11}\,\overline{M_{12}}$                  $5/8 \le d < 11/16$
 $D_4$       $\overline{M_{10}}\,M_{11}\,M_{12}$                             $11/16 \le d < 3/4$
 $D_5$       $M_{10}\,\overline{M_{11}}$                                     $3/4 \le d < 7/8$
 $D_6$       $M_{10}\,M_{11}$                                                $7/8 \le d < 1$
 $D_7$       $D_1 \lor D_2 \lor D_3$                                         $1/2 \le d < 11/16$
 $D_8$       $D_4 \lor D_5 \lor D_6$                                         $11/16 \le d < 1$
 $D_9$       $D_4 \lor D_5$                                                  $11/16 \le d < 7/8$
 $D_{10}$    $(D_5 \lor D_6) = M_{10}$                                       $3/4 \le d < 1$
 $D_{11}$    $(D_1 \lor D_2 \lor D_3 \lor D_4) = \overline{M_{10}}$          $1/2 \le d < 3/4$

 Table 4 - Divisor Interval Selection Logic
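
 The interval logic of Table 4 can be exercised directly. The sketch below is an illustration with hypothetical names; it takes the three bits below the leading 1 of the normalized divisor and returns the truth value of each interval signal.

```python
from fractions import Fraction


def divisor_intervals(m10, m11, m12):
    """Interval signals D1..D11 from divisor bits M10, M11, M12 (M9 is always 1)."""
    m10, m11, m12 = bool(m10), bool(m11), bool(m12)
    d = {
        "D1": not m10 and not m11 and not m12,
        "D2": not m10 and not m11 and m12,
        "D3": not m10 and m11 and not m12,
        "D4": not m10 and m11 and m12,
        "D5": m10 and not m11,
        "D6": m10 and m11,
    }
    d["D7"] = d["D1"] or d["D2"] or d["D3"]
    d["D8"] = d["D4"] or d["D5"] or d["D6"]
    d["D9"] = d["D4"] or d["D5"]
    d["D10"] = d["D5"] or d["D6"]
    d["D11"] = d["D1"] or d["D2"] or d["D3"] or d["D4"]
    return d


# Truncated divisor k/16 for k = 8..15 (bit weights 1/2, 1/4, 1/8, 1/16).
for k in range(8, 16):
    signals = divisor_intervals((k >> 2) & 1, (k >> 1) & 1, k & 1)
    active = [name for name in ("D1", "D2", "D3", "D4", "D5", "D6") if signals[name]]
    print(Fraction(k, 16), active)
```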
 
 
 
The assimilated estimate of the partial remainder and the
 outputs of the divisor interval selection logic are used to generate
 the logic signals ZERO, ONE, and TWO corresponding to quotient digit
 magnitudes of 0, 1, and 2, respectively. The signals ZERO and TWO are
 formed as the OR of two other signals, one corresponding to the quadrant
 for positive partial remainders; the other corresponding to negative
 partial remainders. The signal ONE is implemented in the form
 ONE = $\overline{ZERO \lor TWO}$. The following defines ZERO and TWO:

 ZERO = ZEROP v ZERON
 TWO = TWOP v TWON
 
 ZEROP = A A A A A, v A A A A A, AD v AJV n AJYJ 
 
 1 2 3 10 
 
 ZERON = AqA^A A^ v AqA^A A^A D 1Q 
 
 TWOP = A A 3 A u A 5 A 6Dl v A^A^ v A^A^A^ 
 v A Q A 2 A 3 D 8 v A A 2 A 3 A 1+ D 9 v A^A^A^ 
 v I A 2 I 3 A U A 5 A 6 D 6 v A oAl v A^ 
 
 TWON = A Q A 3 A U D 1 v A^A^D,, v A^A^A^ 
 
 v A A 2 A 3 A U A 5 A 6 D 5 v A^A^Dg v A^ 
 v A A 2 A 3 A U D 5 v A A 2 A 3 v A^ 
 
 Operational Description of Model Division 
 
 We have now defined a radix 4 quotient selection mechanism
 which is analogous to the multiplier recoder defined in Table 3. As with
 multiplication, division is extended to radix 256 by means of four
 successive applications of radix 4 division.

 Let the figure below represent the high-order byte of the
 US-UM Register and let : denote the radix point for the full precision
 division.

 [Figure: high-order byte of the US-UM Register, with the leading positions routed to the radix 4 division]
 
A radix 4 division with radix point denoted '.' is applied to the
 leading 6 positions of the output of US-UM. For a radix 256 division of
 the class described in [9] the magnitude of the shifted partial remainder
 is less than 170 2/3 relative to the radix point ':'. The shifted
 partial remainder relative to '.' is therefore less than 8/3. The radix
 4 table look-up selects a quotient digit magnitude of either 0, 1, or 2.
 These correspond to radix 256 digits of 0, 64, or 128 and to the selection
 of no shift gate, shift gate ML6Y1, or shift gate ML7Y1, respectively. If
 neither gate is selected the Y input to the subtracter is zero. The selected
 multiple of the divisor is added in the first signed-digit subtracter
 (SDS1 in Figure 3) if the sign of the partial remainder, $A_0$, is 1; it is
 subtracted if $A_0$ is 0. The new partial remainder, the next input to the
 model division, appears at the output of SDS1. For the next radix 4 division,
 rather than shifting the partial remainder left two positions, the input to
 the model is shifted right by two positions. Figure 5 summarizes all four
 stages of one pass through the subtracter cascade.
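
 One pass through the cascade can be pictured as four radix 4 steps applied at weights 64, 16, 4 and 1, with the model's radix point moving right instead of the remainder moving left. The sketch below is an exact-arithmetic illustration under assumptions: the digit selection uses a full-precision comparison in place of the table look-up, no redundant representation is modeled, and all names are hypothetical.

```python
from fractions import Fraction

STAGE_WEIGHTS = (64, 16, 4, 1)          # weights seen by SDS1 through SDS4


def select_digit(estimate, divisor):
    """Pick the digit in {-2, ..., 2} nearest to estimate / divisor."""
    q = round(estimate / divisor)
    return max(-2, min(2, q))


def radix256_pass(partial_remainder, divisor):
    """One radix 256 step built from four radix 4 selections."""
    shifted = 256 * partial_remainder   # relative to the ':' radix point
    digits = []
    for weight in STAGE_WEIGHTS:
        # Viewing the same value relative to the '.' point of this stage:
        q = select_digit(shifted / weight, divisor)
        shifted -= q * weight * divisor
        digits.append(q)
    return digits, shifted              # shifted is the next partial remainder


digits, next_p = radix256_pass(Fraction(1, 3), Fraction(3, 4))
print(digits, next_p)
```

 With digits limited to magnitude 2, the largest radix 256 quotient digit is 2 x (64 + 16 + 4 + 1) = 170, which matches the 170 2/3 bound on the shifted partial remainder.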
 
 Figure 6 is a block diagram of the entire model division 
 structure. The Input Gating is an AND-OR complex which under control of 
 signals C through C, gates the appropriate digits of successive 
 partial remainders into the Assimilation box. Before continuing with 
 the description we must note a slight complication again arising from 
 bogus overflow. In Figure 5, as the inputs to the model division are 
 moved to the right, zeros are shown occupying all positions to the 
 left of the highest order input. The range restrictions on the shifted 
 partial remainders are such that the positions shown as zero should 
 indeed be zero if the partial remainders were not in a redundant form. 
 But due to bogus overflow, the highest order digit of the input to the
 model may be 1 with a $\bar{1}$ in the position immediately to the left, or
 vice versa. To compensate for this behavior the magnitude of the digit
 immediately to the left of the model input is monitored. If it is non- 
 zero, then the sign of the high-order digit into the model is complemented 
 as it is gated into the Assimilation box. Note that the 0th bit of 
 the UM-Register is equivalent to the 8th position of the LM-Register. 
 
 
                                                        Shift Gate Selected for
 Division     Model input            Positions gated    magnitude of quotient digit =
 Stage No.    taken from             to model             2        1        0

 1            Output of US-UM        1 to 6               ML7Y1    ML6Y1    none
 2            Output of SDS 1        3 to 8               ML5Y2    ML4Y2    none
 3            Output of SDS 2        5 to 10              ML3Y3    ML2Y3    none
 4            Output of SDS 3        7 to 12              ML1Y4    MDY4     none

 Note: Positions to the left of each model input are shown as zeros.
 The radix point for the radix 4 model division lies to the right of
 the second position of each model input. The radix point for the
 full precision division, radix 256, lies between positions 8 and 9.

 SETTING OF NEG SIGNALS:

 Division Stage No.     Positive Partial Remainder     Negative Partial Remainder
                        ($A_0 = 0$)                    ($A_0 = 1$)

 1                      NEG0 = NEG1 = 1                NEG0 = NEG1 = 0
 2                      NEG1 = NEG2 = 1                NEG1 = NEG2 = 0
 3                      NEG2 = NEG3 = 1                NEG2 = NEG3 = 0
 4                      NEG3 = NEG4 = 1                NEG3 = NEG4 = 0

 Figure 5 - Connection of Model Division to Full Precision
 Structure
 
 [Figure 6: block diagram of the model division structure]
 
The Assimilation box produces a two's complement version
 of the estimate of the shifted partial remainder. This together
 with the Divisor Interval Selection Logic drives the Quotient Select
 Table. The quotient digits are represented in the same signed
 digit format as produced by the subtracters. The following gives the
 signed digit representation of each quotient digit value:

 Quotient Digit     Representation

      +2              1 0
      +1              0 1
      +0              0 0
      $\bar{0}$       $\bar{0}\,\bar{0}$
      $\bar{1}$       $\bar{0}\,\bar{1}$
      $\bar{2}$       $\bar{1}\,\bar{0}$

 Note that a distinction is made between a positive and
 negative zero. The sign of all digits, including zero, is the
 same as the sign of the partial remainder. If the digit +0 is formed
 then zero is subtracted from the partial remainder. If the digit
 $\bar{0}$ is formed then zero is added to the partial remainder. As shown
 in the proof in the Appendix this method of handling a zero quotient
 digit eliminates bogus overflow at position 1 for division.

 The quotient digits are buffered until eight are collected.
 They are then gated to the low-order byte of the UH-UQ Register.
 The quotient digits also set up the shift gates and NEG signals in
 accordance with the description in Figure 6. The operating time of the
 model is summarized in Table 5.
 
 Block                          No. of Collector Delays

 Input Gating                            2
 Assimilation                            3
 Quotient Selection                      2
 Quotient Storage and
 Shift Control                           3

 Total                                  10

 Table 5 - Operating Times of the Model Division
 
 It should be emphasized that the scheme used in the
 model division is but one of many possibilities. Since the amount
 of logic involved is quite small (10 cards), has a well
 defined interface and is physically one package, it is quite
 feasible to replace the model with new, hopefully improved versions.
 The operating time for division relative to the operating time
 for multiplication is primarily a function of the relative operating
 times of the multiplier recoder and model division. The concept
 of a model division and the analogy to the multiplier recoder
 offer several interesting areas of research, some of which are
 being explored by the author in Ph.D. thesis research.
 
 Division Structure 
 
 As mentioned earlier, a primary motivation for use of 
 the model division approach is its high compatibility with 
 multiplication. The division structure is the same as the 
 multiplication structure described in conjunction with Figure 3. 
 
 
Brief Operational Description of Full Precision Division Scheme 
 
 The fractional part of the dividend is loaded into the UQ-Register
 from the V-Bus. The fractional part of the divisor is
 loaded into the UH-Register from the V-Bus. Both fractions are
 7 bytes (56 bits) long. The range of a normalized fraction, f,
 is given by $1/16 \le f < 1$. The model division scheme requires
 that the divisor, d, be in the range $1/2 \le d < 1$. If the given
 divisor is not in this range then both the divisor and dividend
 are shifted left until it is in range. After normalization, the
 divisor is forwarded to the M-Register and the dividend is forwarded
 to the UM-Register. The US-Register is cleared, i.e. all
 sign bits are set to 0. One division loop consists of the
 following sequence of steps:

 1) The contents of US-UM (dividend) are gated into
 the subtracter cascade. The model division
 successively sets up the shift gates and NEG signals
 in accordance with the previous description.

 2) The output of the subtracter cascade is loaded into the
 secondary rank of the accumulator, LS-LM.

 3) The quotient (sign bits in UH, magnitude bits in UQ)
 is shifted left 8 bits and the 8 digits buffered in
 the model division are inserted into the low-order
 byte of UH-UQ. The secondary rank of the accumulator
 (LS-LM) is shifted left 8 bits into the primary
 rank (US-UM).
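
 The preliminary normalization of the divisor can be sketched as follows. This is an illustration with hypothetical names, using exact fractions rather than register contents; it shows that at most three left shifts are needed and that the dividend may grow beyond 1 as a result.

```python
from fractions import Fraction


def prenormalize(dividend, divisor):
    """Shift divisor and dividend left together until 1/2 <= divisor < 1."""
    assert Fraction(1, 16) <= divisor < 1, "divisor must be a normalized fraction"
    shifts = 0
    while divisor < Fraction(1, 2):
        divisor *= 2
        dividend *= 2                   # same shift, so the quotient is unchanged
        shifts += 1
    return dividend, divisor, shifts


print(prenormalize(Fraction(3, 4), Fraction(1, 16)))
```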
 
 Due to the initial normalization of the divisor and
 corresponding shifting of the dividend, the dividend may extend
 across 8 bytes. The division loop must therefore be executed 8
 times. After the last loop the quotient in the UH and UQ Registers
 is transferred to the US and UM Registers, respectively. The
 quotient is then assimilated in the same manner as described in
 the brief operational description of multiplication.

 The range of the quotient for the division of two non-zero
 fractions $f_1$ and $f_2$ is given by $1/16 < f_1/f_2 < 16$. A quotient
 may therefore require a terminal right shift of 4 bits accompanied
 by an increase of the exponent. Division by zero or into zero is
 detected during preliminary steps of the division operation.
 
 Truncation Error 
 
 The range of the truncation error, e, due to truncation
 after 56 signed digits is given by $-2^{-56} < e < 2^{-56}$. If a terminal
 right shift of the assimilated result takes place e is brought
 into the range $-2^{-60} < e < 2^{-60}$; however, the first range is the
 best case that can be guaranteed.
 
REFERENCES 
 
 [1] J. E. Robertson, "A deterministic procedure for the
 design of carry-save adders and borrow-save subtracters,"
 University of Illinois, Department of Computer Science,
 Report No. 235, July 5, 1967.

 [2] F. A. Rohatsch, "A study of transformations applicable
 to the development of limited carry-borrow propagation
 adders," University of Illinois, Department of Computer
 Science, Report No. 226, June 1, 1967.

 [3] R. T. Borovec, "The logical design of a class of
 limited carry-borrow propagation adders," University
 of Illinois, Department of Computer Science, Report No.
 275, August 1, 1968.

 [4] J. E. Robertson, "The correspondence between methods
 of digital division and multiplier recoding procedures,"
 Department of Computer Science, Report No. 252, University
 of Illinois, Urbana, December 1967.

 [5] J. O. Penhollow, "A study of arithmetic recoding with
 applications to multiplication and division," Department
 of Computer Science, Report No. 128, University of
 Illinois, Urbana, September 1962.

 [6] C. S. Wallace, "Suggest design for a very fast multiplier,"
 Department of Computer Science, Report No. 133, University
 of Illinois, Urbana, February 11, 1963.

 [7] J. E. Robertson, Internal memo, February 11, 1968.

 [8] J. E. Robertson, "Methods of selection of quotient digits
 during digital division," Department of Computer Science,
 University of Illinois, Urbana, File 663, 1965.

 [9] D. E. Atkins, "Higher radix division using estimates of
 the divisor and partial remainders," IEEE Trans.
 Computers, vol. C-17, no. 10 (Oct. 1968), pp. 925-934.
 
 NOTE: All references except [7] and [9] are available upon 
 request to the following: 
 
 Department of Computer Science 
 Mailing Center, Room 23^ DCL 
 University of Illinois 
 Urbana, Illinois 61801

 A reprint of reference [9] is available from the author.
 
APPENDIX 
 
 The Appendix includes the proof of the validity of the bogus 
 overflow correction scheme and an introduction to the entire Illiac III 
 system. 
 
 
PROOF OF THE VALIDITY OF THE BOGUS 
 OVERFLOW CORRECTION SCHEME 
 
 We are concerned with the value of the output of positions 0 and 1
 of the subtracter. Inspection of the equations for the subtracter defined in
 Figure 3 reveals that the values of these outputs, $Z_0^*$ and $Z_1^*$, are functions only
 of the inputs to positions 0, 1 and 2. Since the 0th position is not implemented,
 $S_0$ and $X_0$ are implicitly both zero. Furthermore, since the subtrahend
 is always considered to be positive and can never be greater than 128, $Y_0$ and
 $Y_1$ are also both zero. Table A-1 enumerates $Z_0^*$ and $Z_1^*$ as functions of $X_1^*$, $X_2^*$
 and $Y_2$. Recall the notational convention defined below:

 $T_i$     $Z_i$     $Z_i^*$

  0         0         0
  0         1         1
  1         0         $\bar{0}$
  1         1         $\bar{1}$

 A digit 0 under the $X_1^*$ or $X_2^*$ columns may be either a positive or
 negative zero. The table is defined for NEG = 0. If NEG were to be 1, the
 signs of all output digits would be complemented but magnitudes and thus
 bogus overflow conditions would be the same. It is therefore sufficient to
 complete the proof for NEG = 0; the proof for NEG = 1 follows immediately by
 symmetry.

 For all the cases in Table A-1 for which $Z_0^*$ is zero no problem
 arises. Note that $Z_0^*$ is non-zero if and only if $X_1^* = \bar{1}$, in other words, when
 $S_1 X_1 = 1$. For the cases marked with * the bogus overflow scheme is valid.
 For those marked with # the scheme is not valid but we shall show that within
 the constraints of the Illiac III implementation these cases cannot occur.
 The proof is considered for the three classes of operations in which the
 subtracter cascade is used, namely, addition-subtraction, division and
 multiplication.
 
Entries in Table are $Z_0^*$ (weight 256) and $Z_1^*$ (weight 128)

                                         (a)                     (b)
 Row No.     $X_1^*$       $X_2^*$       $Y_2 = 0$               $Y_2 = 1$
             (128)         (64)          (64)                    (64)

 1            0             0            0 0                     0 $\bar{1}$
 2            0             1            0 0                     0 0
 3            0             $\bar{1}$    0 $\bar{1}$             0 $\bar{1}$
 4            1             0            0 1                     0 0
 5            $\bar{1}$     0            $\bar{1}$ 1 *           $\bar{1}$ 0 #
 6            1             1            0 1                     0 1
 7            1             $\bar{1}$    0 0                     0 0
 8            $\bar{1}$     1            $\bar{1}$ 1 *           $\bar{1}$ 1 *
 9            $\bar{1}$     $\bar{1}$    $\bar{1}$ 0 #           $\bar{1}$ 0 #

 Notes: $S_0 = X_0 = Y_0 = Y_1 = 0$

 NEG = 0

 A digit 0 may be positive or negative. Numbers in parentheses
 indicate the weight of the digital position.

 * Indicates bogus overflow which will be corrected by complementing
 the sign of $Z_1^*$ and disposing of $Z_0^*$.

 # Indicates cases in which this correction scheme is not valid.

 TABLE A-1 - Possible Values of $Z_0^*$ and $Z_1^*$
 
Addition - Subtraction 
 
 The radix point for floating point operations is between positions
 8 and 9 of the subtracters. All operands are less than 1, therefore
 $X_1^* = X_2^* = 0$ and $Y_2 = 0$. We are therefore restricted to row 1, column a
 (denoted 1-a) of the table. Bogus overflow is therefore avoided.
 
 Division 
 
 We have shown that bogus overflow arises if and only if $X_1^* = \bar{1}$.
 For the case of division this is also the sign of the partial remainder,
 and since the partial remainder is negative it is added to the selected
 multiple of the divisor. Addition requires that $X_1^*$ be complemented prior
 to entry into the subtracter and thus $X_1^*$ becomes 1. In division the subtracters
 always see only positive inputs and therefore the states in rows
 5, 8 and 9 cannot occur. Bogus overflow is avoided altogether.
 
 Multiplication 
 
 The multiplicand, M, is stored in the M-Register and is in the 
 range 1/16 to 1. The maximum multiple of M which may be formed is 128 
 times M (at the first subtracter) and thus Y can equal 1 only at the 
 first subtracter. The contents of the accumulator (US-UM), the
 signed-digit input to the first subtracter, is always less than 1 in magnitude 
 and therefore X* = X* = 0. Thus all entries in column b except 1-b are 
 eliminated as possibilities. The remaining task is to show that case 9-a 
 cannot occur. 
 
 At this point we must note a property of the multiplier recoding 
 scheme defined in Table 3. This property is that 128 is the maximum multiple 
 of the multiplicand which may be combined with the partial product in any one 
 pass through the subtracter cascade. This may be established by considering 
 a group of nine bits which are to be recoded. If +2 or -2 is selected as 
 the recoded version of the leftmost trio of bits, then all recoded digits 
 to the right are either zero or of opposite sign. Recall from Figure 4 
 that the selection of 2 at the left of the recoding logic generates a 7 
 bit left shift of M into the first subtracter. 
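
 The bound of 128 can be checked exhaustively. The sketch below is
 illustrative only: it assumes the standard overlapped-trio (modified
 Booth) radix-4 recoding, d = -2a + b + c for each trio (a, b, c), which
 may differ in detail from the scheme of Table 3, and all function names
 are hypothetical. It recodes every nine-bit group into four digits of
 weight 64, 16, 4 and 1, and checks that the selected multiple never
 exceeds 128 in magnitude and that, whenever the leftmost digit is +2 or
 -2, the digits to its right sum to zero or to a value of opposite sign.

```python
WEIGHTS = (64, 16, 4, 1)  # radix-4 digit weights within one 8-bit pass


def recode_group(bits9):
    """Recode nine bits (most significant first) into four radix-4 digits.

    Assumption: overlapped-trio (modified Booth) rule d = -2*a + b + c.
    Adjacent trios share one bit, so nine bits yield four digits.
    """
    return [-2 * bits9[i] + bits9[i + 1] + bits9[i + 2] for i in range(0, 8, 2)]


def group_value(digits):
    """Multiple of the multiplicand selected by one group of digits."""
    return sum(d * w for d, w in zip(digits, WEIGHTS))


def all_groups():
    """Yield every nine-bit group as a list of bits, most significant first."""
    for n in range(512):
        yield [(n >> k) & 1 for k in range(8, -1, -1)]


def max_multiple():
    """Largest magnitude of the multiple over all 512 nine-bit groups."""
    return max(abs(group_value(recode_group(g))) for g in all_groups())


def tail_opposes_leading_two():
    """True if, whenever the leftmost digit is +2 or -2, the remaining
    digits sum to zero or to a value of the opposite sign."""
    for g in all_groups():
        digits = recode_group(g)
        lead = digits[0]
        if abs(lead) == 2:
            tail = group_value([0] + digits[1:])
            if tail * lead > 0:
                return False
    return True


if __name__ == "__main__":
    print(max_multiple())
    print(tail_opposes_leading_two())
```

 Under this assumed recoding, a leftmost digit of 2 at weight 64 selects
 2 x 64 = 128 times M, which corresponds to the 7 bit left shift of M
 into the first subtracter.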
 
 Having ruled out cases 5-b and 9-b, we may state that the 10 
 condition in 9-a occurs if and only if X* = X* = 1. This means that the 
 algebraic value of the signed-digit input to the subtracter is more negative 
 than -128. This clearly cannot be the case unless a multiple of 128 x M 
 has been combined with a non-zero partial product of the same sign as the 
 multiple. This may occur only in the first subtracter. If case 1-a 
 occurs, the partial product is less than 128 in magnitude and thus cannot 
 become more negative than -128 by subsequent operations in the subtracter 
 cascade. 
 
 Case 1-b can occur only if a multiple of 128 is selected and thus 
 the subsequent subtracters can only either preserve the value of the partial 
 product by subtraction of zeros or decrease it in magnitude. If the
 magnitude is decreased, it will be decreased to less than 128 and thus case 9-a 
 is immediately ruled out. If subtraction or addition of zero occurs at
 subsequent subtracters the 01 pattern in 1-b will propagate through and cannot 
 become 10. This is demonstrated by the following reasoning. For case 1-b 
 with X* = 0, Z* is never 1. Z* and Z* are the X*, X* inputs to the next 
 subtracter and thus we are brought to either row, but will be corrected back 
 to either the case Z* = 1, Z* = 1 or to the case Z* = 1, Z* = 0. The 01 
 pattern of case 1-b will therefore pass through all of the subtracters and 
 will never be reformed as 10 in 9-a. 
 
 
BRIEF DESCRIPTION OF THE ILLIAC III COMPUTER SYSTEM* 
 
 The Illinois Pattern Recognition Computer, Illiac III, 
 is a digital processor for visual information. It is primarily 
 designed for automatic scanning and analysis of massive amounts of 
 relatively homogeneous visual data. In particular the design is 
 an outgrowth of studies at this laboratory of a computer system 
 capable of scanning, measuring and analyzing in excess of 3 x 10 
 "bubble chamber negatives per year. 
 
 Illiac III, though specifically designed to process visual 
 information, also provides complete facilities for standard general- 
 purpose computation. Both the picture processing and general- 
 purpose computation facilities of Illiac III will be available to 
 users on a time-sharing basis. 
 
 As can be seen in Figure A-1, Illiac III is a multi-processor
 computer system. Six processors (4 Taxicrinic Processors 
 and 2 Input/Output Processors) access in parallel the computational/ 
 storage units consisting of 2 Arithmetic Units, 1 Interrupt Unit, 
 1 Pattern Articulation Unit, and 4 Storage Units. Each computational/ 
 storage unit of the computer system specializes in a particular 
 activity. Thus, for example, all floating-point computation is 
 done in the Arithmetic Units, while picture processing is performed 
 primarily by the Pattern Articulation Unit. Processors, on the 
 other hand, analyze user jobs and route their constituent tasks 
 to the appropriate specialized processing units. The individual 
 processors of the system can operate simultaneously and independently 
 (within the limits imposed by the System Supervisor) with a consequent 
 increase in overall efficiency. 
 
 The Input/Output Processors (IOP) are attached via Channel 
 Interface Units and Device Controllers to various input and output 
 devices. Among facilities important for the ingestion of visual 
 information are 8 CRT flying spot scanners: two for 70 mm film, 
 two for 46 mm film, two for microfilm/microfiche, and two for 
 
 *From Section 1 of the Illiac III Programming Manual. 
 
 
 [Figure A-1: block diagram. Legible labels: Taxicrinic Processors (four,
 each marked FC, SC, DC); Arithmetic Units (AU, AU); PAU; Exchange Net;
 I/O Processors; Central Units; Channel Interface Units; Secondary
 Storage; Scan/Display; Inter-Machine Links; Low Speed Term.]

 Figure A-1 Schematic of Illiac III Computer 
 
 
 microscope slides. These scanners can also operate as film 
 cameras and thus serve as both input and output devices. Monitor 
 stations have also been attached to the Input/Output system. 
 These each consist of a CRT display, a typewriter, and a magnetic 
 tape unit; and are provided to assist human control of the analysis. 
 
 The duty of the Pattern Articulation Unit (PAU) is to 
 perform local preprocessing on the input from the scanners, such 
 as track thinning, gap filling, line element recognition, etc. 
 The logical design of this all-digital processor has been optimized 
 for the idealization of the input image to a line drawing. Nodes 
 representing end points, points of inflection, points of inter- 
 section, etc. are labeled in parallel by appropriate programming 
 under overall control of the Taxicrinic Processor. The abstract 
 graph describing the interconnection of labelled nodes is then 
 extracted as a list structure, which comprises the normal output of 
 the Pattern Articulation Unit. 
 
 This output is then operated on by a Taxicrinic Processor 
 (TP), which assembles such graphs into coherent list structures 
 subject to a recognition grammar and then syntactically categorizes 
 them to complete the visual recognition process. The Taxicrinic 
 Processors are primarily responsible for the execution of user 
 programs, that is, to oversee the operations of the Pattern 
 Articulation Unit, the Arithmetic Unit and to initiate input/ 
 output operations in the IOP's by making requests to the Interrupt 
 Unit. 
 
 The Arithmetic Unit (AU) is used exclusively for performing 
 arithmetic operations for the TP. Although there are a few simple 
 arithmetic operations which can be done in a TP (e.g., integer 
 addition) the more complicated operations are done in the AU. 
 The AU has been optimized for double-word floating point arithmetic. 
 
 The Interrupt Unit (IU) handles all the interrupt requests 
 from the TP and IOP. When an interrupt is requested it notifies 
 the proper processors which then take appropriate action. 
 
 
All of the Illiac III processors and units communicate 
 with each other through the Exchange Net (XN) as shown in Figure 
 A-1. The Exchange Net is responsible for all the necessary 
 queueing and priority checking. 
 
 As noted above, there is indeed a reason for calling 
 one piece of equipment a processor and another a unit, even though 
 the type of operations they perform may both appear to be "processing" 
 operations. In the Illiac III system all major modules are 
 designated as either "processors" or "units" according to their 
 position in the Exchange Net. In Figure A-1, the processors 
 are shown at the top and bottom and the units are shown on the 
 right. The effect of this division is that processors may communicate 
 directly with units and vice versa but may not communicate directly 
 with each other. If a processor needs to communicate with another 
 processor it must get help from a unit (normally the Interrupt 
 Unit) and if a unit (say the PAU) wants to communicate with another 
 unit (say a storage unit) the information must be transferred 
 through a processor (the TP in this case). 
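
 The connection rule can be summarized in a few lines. The following is
 a minimal sketch and not part of the Illiac III design documentation:
 the function route, the sets PROCESSORS and UNITS, and the abbreviation
 SU for a Storage Unit are hypothetical names introduced for
 illustration, and the default relays (the Interrupt Unit between
 processors, a TP between units) simply follow the examples in the text.

```python
PROCESSORS = {"TP", "IOP"}         # modules on the processor side of the Exchange Net
UNITS = {"AU", "IU", "PAU", "SU"}  # modules on the unit side ("SU" is a hypothetical label)


def route(src, dst, relay_unit="IU", relay_processor="TP"):
    """Return the list of modules a message passes through, from src to dst."""
    for name in (src, dst):
        if name not in PROCESSORS and name not in UNITS:
            raise ValueError("unknown module: " + name)
    src_is_processor = src in PROCESSORS
    dst_is_processor = dst in PROCESSORS
    if src_is_processor != dst_is_processor:
        return [src, dst]                 # processor <-> unit: direct
    if src_is_processor:
        return [src, relay_unit, dst]     # processor -> processor: via a unit
    return [src, relay_processor, dst]    # unit -> unit: via a processor


if __name__ == "__main__":
    print(route("TP", "AU"))
    print(route("TP", "IOP"))
    print(route("PAU", "SU"))
```

 In this sketch a request from a TP to an AU is direct, while a transfer
 from the PAU to a storage unit is relayed through a TP, as in the
 example given in the text.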
 
 
Form AEC-427 
 
 (6/68) 
 
 AECM 3201 
 
 U.S. ATOMIC ENERGY COMMISSION 
 
 UNIVERSITY-TYPE CONTRACTOR'S RECOMMENDATION FOR 
 
 DISPOSITION OF SCIENTIFIC AND TECHNICAL DOCUMENT 
 
 1. AEC REPORT NO. 
 
 COO-1018-1183 
 
 ( See Instructions on Reverse Side ) 
 
 2. TITLE 
 
 DESIGN OF THE ARITHMETIC UNITS OF ILLIAC III 
 USE OF REDUNDANCY AND HIGHER RADIX METHODS 
 
 3. TYPE OF DOCUMENT (Check one): 
 
 □ a. Scientific and technical report 
 
 □ b. Conference paper not to be published in a journal: 
 
 Title of conference 
 
 Date of conference 
 
 Exact location of conference 
 
 Sponsoring organization 
 
 □ c. Other (Specify) 
 
 4. RECOMMENDED ANNOUNCEMENT AND DISTRIBUTION (Check one): 
 
 l^j a. AEC's normal announcement and distribution procedures may be followed. 
 
 □ b. Make available only within AEC and to AEC contractors and other U.S. Government agencies and their contractors 
 
 □ c. Make no announcement or distribution. 
 
 5. REASON FOR RECOMMENDED RESTRICTIONS: 
 
 SUBMITTED BY. NAME AND POSITION (Please print or type) 
 
 Daniel E. Atkins 
 
 Organization 
 
 Department of Computer Science, University of Illinois, Urbana, Ill. 
 
 Signature 
 
 
 Date 
 
 May 28, 1969 
 
 FOR AEC USE ONLY 
 
 AEC CONTRACT ADMINISTRATOR'S COMMENTS, IF ANY, ON ABOVE ANNOUNCEMENT AND DISTRIBUTION 
 RECOMMENDATION: 
 
 PATENT CLEARANCE: 
 
 □ a. AEC patent clearance has been granted by responsible AEC patent group. 
 □ b. Report has been sent to responsible AEC patent group for clearance. 
 □ c. Patent clearance not required. 
 