UNIVERSITY OF ILLINOIS
GRADUATE COLLEGE
DIGITAL COMPUTER LABORATORY

REPORT NO. 81

A STUDY OF PARALLEL ONE'S COMPLEMENT ARITHMETIC UNITS WITH SEPARATE CARRY OR BORROW STORAGE

by Gernot Metze

November 11, 1957

(This is being submitted in partial fulfillment of the requirements for the Ph.D. degree in Electrical Engineering, February, 1958.)

This work was supported in part by the Atomic Energy Commission and the Office of Naval Research under AEC Contract AT(11-1)-415.

Digitized by the Internet Archive in 2013: http://archive.org/details/studyofparallelo81metz

TABLE OF CONTENTS

1 Introduction
2 A Review of Material Pertaining to Conventional One's Complement Arithmetic Units
  2.1 Introduction
  2.2 The One's Complement System of Number Representation
    2.21 The One's Complement Representation
    2.22 End-Around Corrections for Additions or Subtractions
    2.23 End-Around Corrections for Shifts
  2.3 Boolean Algebra Notation
  2.4 Register and Gating Arrangements in Conventional Arithmetic Units
    2.41 Addition and Subtraction
    2.42 Left Shift and Right Shift
    2.43 Multiplication and Division
3 Addition and Subtraction in Arithmetic Units with Separate Carry or Borrow Storage
  3.1 Introduction
  3.2 Quasi-Adder with Separate Carry Storage
    3.21 Quasi-Addition
    3.22 Carry Assimilation
    3.23 Zero Recognition
    3.24 An Example of Quasi-Addition
  3.3 Quasi-Subtractor with Separate Borrow Storage
    3.31 Quasi-Subtraction
    3.32 Borrow Assimilation
    3.33 Zero Recognition
  3.4 Quasi-Adder-Subtractor with Separate Carry-Borrow Storage
    3.41 Quasi-Addition-Subtraction
    3.42 Carry-Borrow Assimilation
    3.43 Zero Recognition
  3.5 Summary
4 Multiplication and Division in Arithmetic Units with Separate Carry or Borrow Storage
  4.1 Introduction
  4.2 Extended Arithmetic Units
  4.3 Multiplication
    4.31 Method of Multiplication
    4.32 Multiplication in Conventional Arithmetic Units
    4.33 Multiplication in an Arithmetic Unit with Separate Carry or Borrow Storage
    4.34 An Example of Multiplication
  4.4 Division
    4.41 Method of Division
    4.42 Division in Conventional Arithmetic Units
    4.43 Division in an Arithmetic Unit with Separate Carry or Borrow Storage
    4.44 An Example of Division
5 Analysis of Overflow
  5.1 Overflow in Conventional Arithmetic Units
  5.2 Overflow in Arithmetic Units with Separate Carry or Borrow Storage
  5.3 Overflow Analysis of Arithmetic Operations
    5.31 Preliminary Remarks
    5.32 Overflow during Left Shift of One Digital Position
    5.33 Overflow during Addition or Subtraction
    5.34 Overflow during a Multiplication Step
    5.35 Overflow during a Division Step
    5.36 Overflow during Improper Division
  5.4 Summary
6 Conclusions
Bibliography

1 INTRODUCTION

In recent years, several proposals have been made to increase the speed of operation of binary, parallel, asynchronous, automatic digital computers. These improvements range from the use of newly developed components and changes in the circuit design of the basic logical units to new concepts in the logical design of entire sections of the computer.

In this thesis, we shall be concerned only with speed increases resulting from the logical re-design of the arithmetic unit of a computer which uses the one's complement system of number representation.
The design philosophy used is that used in computers of the Institute for Advanced Study type [1], [4].*

* Numbers in brackets refer to references cited in the bibliography.

Since the arithmetic unit is involved in most of the machine operations, some increase in over-all speed of the computer will certainly be obtained if the basic arithmetic processes are speeded up. Since subtractions are performed as additions of the complement of the subtrahend in machines of the type where addition is the basic operation, and since multiplications and divisions are executed as series of conditional additions, subtractions, and shifts, attention will be focused on the process of addition. Multiplication especially can be further speeded up by minimizing the number of additions or subtractions.

In a parallel computer, addition is performed in two logically distinct steps. The carry digits are formed and propagated in a necessarily serial fashion, and the sum digits are determined in a parallel fashion as soon as the carries are formed.

In many existing computers the time assigned to carry propagation is that required by the worst case, when a carry is propagated through the entire length of the register, plus a safety margin to allow for variations in the carry circuitry as well as in the timing device.

Several improvements have been suggested. The carry can be made to time itself [5], and advantage can be taken of the fact that some of the digit combinations possible at the adder input produce an outgoing carry which is independent of the incoming carry [6]. We note that in all of these schemes a carry propagation is required for each addition.

Further increases in operation speed can be achieved by reducing the need for carry propagations. If separate carry storage is provided, carries arising from a sequence of additions need not be propagated over more than one digital position during each "quasi-addition" [1], [2], [3], [7], [8].
Numerical results are represented with the carries unassimilated whenever possible. Carries are assimilated to obtain the conventional representation only when results are to be transferred from the arithmetic unit. Interstage carries may arise during this process of carry assimilation.

Separate carry storage can be employed in arithmetic units using several systems of number representation. The advantage of not having to propagate the carries for each addition is, to a large extent, lost in the absolute-value-and-sign system. Since it is not known at the outset whether, e.g., a subtraction will yield a positive or a negative difference, the result will be incorrectly represented one-half of the time, and the representation has to be converted in these cases by complementing the non-sign digits and changing the sign. The need for this conversion is, however, not apparent until after a carry assimilation, at least to the sign digit, has been performed.

The use of the two's complement system of number representation in an arithmetic unit with separate carry storage was investigated by the staff of the Digital Computer Laboratory at the University of Illinois during the fall of 1956 and the spring of 1957 [8]. Since zero is always interpreted as a positive quantity in this system, it was not found possible to devise a method of "tidy" division which would directly yield a truncated quotient and a corresponding remainder such that the remainder has the same sign as the dividend and is less in absolute value than the divisor.

The schizophrenic zero of the one's complement notation (a balance of zero is indicated as negative zero in an addition, as positive zero in a subtraction) appeared to permit exactly such a division scheme if both addition and subtraction could be performed directly in the arithmetic unit. Complementing facilities for the accumulator register would then not be necessary.
The desire for a "tidy" division scheme initiated an investigation into the properties of an arithmetic unit with separate carry storage using the one's complement number representation. A quasi-adder with separate carry storage was designed. The notion of separate carry storage was then extended to separate borrow storage. It was also found that carries and borrows arising in a quasi-adder-subtractor could be stored in one register without interference. Although such an arithmetic unit using a quasi-adder-subtractor with separate carry-borrow storage is expensive in terms of hardware, it possesses interesting theoretical properties which, in the author's opinion, warrant further investigation.

An attempt to apply as many time-saving schemes as possible to these arithmetic units revealed that multiplication in the one's complement notation is more expensive, in terms of either time or equipment, than in the two's complement notation. On the other hand, the three designs of arithmetic units with separate carry, borrow, or carry-borrow storage exhibited an unexpected zero-recognition feature which directly permits tidy division without recourse to complementing facilities for the registers of the arithmetic unit. In fact, the easy complementation feature of the one's complement notation, one of the strongest arguments for the use of that notation in conventional computers, is lost in the two-register representation of the separate carry or separate borrow storage scheme and is conjectured to persist only in the separate carry-borrow scheme.

2 A REVIEW OF MATERIAL PERTAINING TO CONVENTIONAL ONE'S COMPLEMENT ARITHMETIC UNITS

2.1 Introduction

In this chapter we shall review material pertaining to that type of arithmetic unit which forms the starting point of this investigation. The mode of operation of such a conventional arithmetic unit is taken to be parallel and asynchronous.
Numbers used in the machine are represented in base 2; negative numbers are expressed in the one's complement (digitwise complement) system, one of the three systems most frequently used in binary digital computers [5].

The precision of a number $X$ is $(n + 1)$ bits, i.e. binary digits. All numbers are restricted to the range $-1 < X < +1$. A number $X$ is represented in the machine as a set of digits $x_0, x_1, x_2, \ldots, x_n$. The radix point is assumed to be fixed between $x_0$ and $x_1$. The digit $x_0$ is the sign digit: it is 0 if the number is positive, 1 if the number is negative.

2.2 The One's Complement System of Number Representation

2.21 The One's Complement Representation

If a number $X$ is positive, i.e. if $x_0 = 0$, then the arithmetic value of $X$ is obtained as the weighted sum of the non-sign digits of the machine representation:

$$X = \sum_{i=1}^{n} 2^{-i} x_i \qquad (X \ge 0).$$

If a number $X$ is negative, i.e. if $x_0 = 1$, then the arithmetic value of $X$ is the negative weighted sum of the digitwise complement of the machine representation:

$$X = -\sum_{i=1}^{n} 2^{-i}\,\bar{x}_i = -\sum_{i=1}^{n} 2^{-i}(1 - x_i) = -1 + 2^{-n} + \sum_{i=1}^{n} 2^{-i} x_i \qquad (X \le 0).$$

The two expressions can be combined into one by means of the sign digit:

$$X = x_0(-1 + 2^{-n}) + \sum_{i=1}^{n} 2^{-i} x_i \qquad (-1 < X < +1).$$

It is evident that two representations of zero are possible: the arithmetic value of zero is obtained if the weighted sum of the non-sign digits is zero and $x_0 = 0$, or if the weighted sum of the non-sign digits is $(1 - 2^{-n})$ and $x_0 = 1$. We thus have a positive zero with $x_0 = x_i = 0$, and a negative zero with $x_0 = x_i = 1$.

2.22 End-Around Corrections for Additions or Subtractions

A feature of the one's complement system is that a correction is required under certain conditions in the least significant digital position of a number. Consider the sum $Z$ of two fractions, $X$ and $Y$. The arithmetic value of $Z$ should be

$$Z = X + Y.$$

The adder, however, performs additions modulo 2 on numbers in machine representation.
Now the machine representations of $X$ and $Y$ are $x$ and $y$ respectively:

$$x = \sum_{i=0}^{n} 2^{-i} x_i = X + x_0(2 - 2^{-n}), \qquad y = Y + y_0(2 - 2^{-n}).$$

The result of the modulo 2 addition of $x$ and $y$ is

$$x + y = X + Y + (x_0 + y_0)(2 - 2^{-n}) \bmod 2 = X + Y - 2^{-n}(x_0 + y_0) \bmod 2.$$

The machine representation $z$ of the sum $Z$ modulo 2 should, however, be

$$z = Z + z_0(2 - 2^{-n}) \bmod 2 = X + Y - 2^{-n} z_0 \bmod 2.$$

Thus, the difference between the desired machine representation of the sum and the actual adder output is

$$z - (x + y) = 2^{-n}(x_0 + y_0 - z_0),$$

where $z_0$ is the true sign digit of the sum modulo 2.

The correction necessary to make the adder output equal the machine representation of the sum is an addition of $(x_0 + y_0 - z_0)$ in the least significant stage of the adder. The otherwise unused carry input, $c_n$, to the least significant stage of the adder can be used for this purpose. The sign digit $z_0$ generated by the sign digit stage of the adder is, however, not the true sign digit of the sum modulo 2 if the sum has exceeded the range $-1 < Z < +1$.

[Table: end-around correction for each combination of sign digits and carry input, including the overflow cases (surviving labels: "Overflow", "Sum < -1"). The body of the table is not recoverable from the source; its legend follows.]

  $x_0$, $y_0$ = sign digits of augend, addend
  $c_0$ = carry input to sign digit stage of adder
  $z_0$ = sign digit of sum generated by adder* (*true sign of sum modulo 2 in parentheses)
  $c_{-1}$ = carry output from sign digit stage of adder
  $c_n$ = correction term, applied to carry input of least significant stage of adder

It can be shown easily that a carry cannot travel more than once around the carry loop created by connecting the carry output of the sign-digit stage to the carry input of the least significant stage: a carry cannot start from any stage other than one having the digits of addend and augend both equal to one. This stage, however, will have a carry output regardless of whether the carry originating from it ultimately appears at its carry input or not; only the sum digit of that stage depends on the carry input.
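The end-around correction described above can be tried out on small numbers. The following is a minimal sketch, assuming that a machine word is held as a Python integer whose bits are the digits $x_0 \ldots x_n$ (with $x_0$ most significant) and that values are expressed in units of $2^{-n}$. The function names `encode`, `decode` and `add_end_around`, and the choice $n = 4$, are illustrative and do not come from the report. The sketch does not treat overflow.

```python
def encode(units: int, n: int) -> int:
    """Machine representation of units * 2**-n in (n+1)-digit one's complement."""
    mask = (1 << (n + 1)) - 1
    if units >= 0:
        return units & mask
    return ~(-units) & mask            # digitwise complement of the magnitude


def decode(pattern: int, n: int) -> int:
    """Arithmetic value, in units of 2**-n, of an (n+1)-digit pattern."""
    mask = (1 << (n + 1)) - 1
    if pattern >> n:                   # sign digit x_0 is 1
        return -(~pattern & mask)
    return pattern


def add_end_around(x: int, y: int, n: int) -> int:
    """Add two patterns and feed the sign-stage carry output back into the
    carry input of the least significant stage (c_n = c_{-1})."""
    mask = (1 << (n + 1)) - 1
    total = x + y
    carry_out = total >> (n + 1)       # carry output of the sign digit stage
    return (total + carry_out) & mask  # the carry travels around the loop at most once


n = 4
print(decode(add_end_around(encode(8, n), encode(-3, n), n), n))
print(bin(add_end_around(encode(5, n), encode(-5, n), n)))
```

The second call adds $+5/16$ and $-5/16$; the adder output is the all-ones pattern, i.e. negative zero, which `decode` maps to the arithmetic value zero.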
Evidently, analogous remarks apply to a machine in which the basic operation is subtraction, with addition performed as a subtraction of the complement of the addend. An "end-around borrow" is necessary to keep machine representations of numbers consistent. Of course, the end-around correction will be either a carry or a borrow if additions and subtractions are performed directly in an adder-subtractor.

2.23 End-Around Corrections for Shifts

An end-around correction similar to the one for additions and subtractions is necessary when a quantity is shifted left, or doubled. It must be true that the arithmetic value of the number after the shift is twice that before the shift, i.e.

$$Y = 2X$$

or

$$y_0(-1 + 2^{-n}) + \sum_{i=1}^{n} 2^{-i} y_i = 2\left[x_0(-1 + 2^{-n}) + \sum_{i=1}^{n} 2^{-i} x_i\right] \bmod 2.$$

The left shift is described by

$$y_i = x_{i+1}, \qquad 0 \le i \ldots$$

[... text missing in source: the remainder of Section 2.23, Section 2.3, Section 2.41 and the beginning of Section 2.42 ...]

[Figure 2.1: i-th Stage of a Conventional Arithmetic Unit using an Adder. Gate legend: 1 gate sum; 2 gate straight down; 3 gate straight up; 4 gate left down; 5 gate right down; CG complement gate.]

A right shift is performed in a manner similar to the left shift by using gate 5 instead of gate 4. To maintain a correct representation as indicated in section 2.23, digital position $s_0$ is connected to $a_n$ via a gate 4, and to $a_0$ as well as $a_1$ via a gate 5. The contents of the least significant stage before a right shift are lost during the shift.

2.43 Multiplication and Division

Since multiplications and divisions are generally executed as a series of conditional additions or subtractions and shifts, the arithmetic units must have at least the equipment specified so far. Furthermore, a set of shifting registers is necessary to hold the multiplier during multiplication and the quotient during division (see Figure 2.1):

  Multiplier-Quotient Register Q
  Temporary Multiplier-Quotient Register T, serving as temporary storage for the contents of Q during shifts.
The multiplicand or the divisor is held in the number register M.

If double-length products and dividends are to be used, registers A and S must be extended to $(2n + 1)$ stages. Additional equipment is necessary to provide end-around corrections in the double-length representation, unless only very restrictive variants of multiplication and division are allowed.

3 ADDITION AND SUBTRACTION IN ARITHMETIC UNITS WITH SEPARATE CARRY OR BORROW STORAGE

3.1 Introduction

Since our attempts to increase the over-all operation speed of a parallel computer are assumed to be restricted to a redesign of the arithmetic unit logic, a reconsideration of the basic operations of addition and subtraction becomes necessary. Multiplications and divisions which are performed as a series of steps involving additions and subtractions will, of course, profit from any improvement in the speed of additions and subtractions.

Let us investigate therefore the processes of addition and subtraction. The discussion can be limited to the type of arithmetic unit in which addition is the basic operation, since dual remarks apply to an arithmetic unit with subtraction as the basic operation.

Conventional parallel addition is executed in two logically distinct steps. The carry digits are formed and propagated in a manner which is necessarily serial. The sum digits can be determined in a parallel fashion as soon as the carries are given. In most computers the most time-consuming step is carry propagation.

In many existing computers which are otherwise asynchronous a timing device is used to determine the time allotted to carry propagation. This interval must be long enough to allow carry propagation for the worst case, when a carry travels through all stages of the adder. The interval must include a safety margin to allow for tolerances in the carry circuitry as well as in the timing device itself.
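The two steps of conventional parallel addition, and the worst-case carry path that the timing interval has to cover, can be illustrated with a small model. This is a sketch only: it ignores the end-around correction, it indexes digits from the least significant end (bit 0), and the function name `parallel_add` and its carry-run bookkeeping are illustrative assumptions, not part of the report.

```python
def parallel_add(a: int, m: int, width: int):
    """Return (sum, longest_carry_run) for a width-digit adder.

    longest_carry_run counts the stages a single carry passes through,
    including the stage that generated it.
    """
    carries = 0            # bit k holds the carry INTO digit k
    carry = 0
    run = longest = 0
    for k in range(width):                     # step 1: serial carry formation
        carries |= carry << k
        a_k, m_k = (a >> k) & 1, (m >> k) & 1
        new_carry = (a_k & m_k) | (carry & (a_k ^ m_k))
        if carry and (a_k ^ m_k):
            run += 1                           # incoming carry is passed on
        else:
            run = 1 if new_carry else 0        # fresh carry generated, or none
        longest = max(longest, run)
        carry = new_carry
    mask = (1 << width) - 1
    total = (a ^ m ^ carries) & mask           # step 2: all sum digits at once
    return total, longest


print(parallel_add(0b0111, 0b0001, 4))         # short carry chain
print(parallel_add(0xFF, 0x01, 8))             # worst case: carry through every stage
```

In the second call a carry generated in the least significant stage is passed on by every other stage, which is the case the fixed timing interval must allow for.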
Strictly asynchronous operation of the adder can be obtained if the carry is made to time itself [5]. Both a one's-carry and a zero's-carry signal are used. Since one or the other of these two signals must propagate, the completion of the carry propagation can be sensed. The time required for carry propagation is exactly that time necessary for either the one's-carry or zero's-carry to travel through all stages of the adder. The improvement attained is only slight: a timing device is not necessary and speed variations in the carry circuitry are automatically absorbed.

A further improvement takes advantage of the statistical properties of carries [6]. Since half the digit combinations possible at the input to an adder stage produce a one's-carry-out or a zero's-carry-out independently of the carry-in, one's-carries and zero's-carries can be initiated simultaneously and independently at all such stages. Carry propagation is complete if each stage exhibits either a one's-carry-out or a zero's-carry-out. The total time required is merely that necessary to propagate one's-carries or zero's-carries through the longest sequence of stages in which the carry-out depends on the carry-in. The average length of the longest such sequence was experimentally shown to be 5.6 stages for a 40-bit register using a two's complement adder [6].

Each of the schemes described so far requires some sort of carry propagation for each addition. Further speed improvements can be obtained by reducing the need for carry propagations in general. If carries can be stored separately, then carries arising from a sequence of additions need not be propagated over more than one digital position during each step [1], [2], [3], [7], [8]. Quantities are stored within the arithmetic unit with the carries unassimilated, whenever possible. Carries are assimilated to obtain the conventional representation only when results are to be transferred.
During the process of carry assimilation carries may propagate through all stages of the assimilator.

In the following sections we shall develop arithmetic units which employ separate storage facilities for carries or borrows.

3.2 Quasi-Adder with Separate Carry Storage

3.21 Quasi-Addition

The Boolean expressions defining the formation of the sum and carry digits in a conventional one's complement arithmetic unit using an adder are (see Figures 2.1 and 3.1):

$$s_i = a_i \oplus m_i \oplus z_i$$
$$z_{i-1} = z_i (a_i \oplus m_i) \lor a_i m_i \qquad \text{with } z_n = z_{-1},$$

where
  $a_i$ = $i$-th digit of augend
  $m_i$ = $i$-th digit of addend
  $s_i$ = sum digit formed in stage $i$
  $z_{i-1}$ = carry digit formed in stage $i$, propagated to stage $(i-1)$
  $i = 0, 1, \ldots, n$.

Instead of allowing the carries to propagate, the carry chain of the conventional adder is broken at the point marked X in Figure 3.1 and carries are stored in the temporary carry register B. Previously formed carries are stored in the carry register C (see Figure 3.2).

[Figure 3.1: i-th Stage of a Conventional Adder Modified to a Quasi-Adder. NOTE: For conventional adder operation connect $b_{i-1}$ to $c_{i-1}$ at X; for quasi-adder operation break at X.]

[Figure 3.2: i-th Stage of an Arithmetic Unit with Separate Carry Storage. Gate legend: 1 gate quasi-sum; 2 gate straight down; 3 gate straight up; 4 gate left down; 5 gate right down; 6 gate assimilated sum; CG complement gate. The interstage carry $d_i$ of the carry assimilator is also shown.]

A quasi-addition is performed in a manner similar to conventional addition. The augend in A and C is combined with the addend in M* by the quasi-adder, yielding a quasi-sum in S and carries in B. The contents of S and B are then transferred to A and C, respectively, to form the new augend.

* The addend is obtained at the output of the number register-complement gate complex M. Thus, subtractions performed as additions of the complement of the subtrahend are implicitly included in the discussion of additions.
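The cycle just described (quasi-add A, C and M into S and B, then transfer S and B back to A and C) can be sketched in a few lines. The sketch uses the quasi-addition expressions of this section, $s_i = (a_i \oplus m_i) \oplus (a_{i+1} m_{i+1} \lor c_i)$ and $b_{i-1} = (a_i \oplus m_i)(a_{i+1} m_{i+1} \lor c_i)$. Several things are assumptions made for illustration: registers are Python integers with the sign end most significant, the end-around connections are modelled as a one-digit rotation, the addend is extended to the extra overflow position by repeating its sign digit, final assimilation is done with an ordinary end-around-carry addition, and all names are hypothetical. The operands are those of the worked example in Section 3.24, in units of 1/16.

```python
def rotl(x: int, width: int) -> int:
    """Rotate left by one digit (end-around connection to the least significant stage)."""
    mask = (1 << width) - 1
    return ((x << 1) | (x >> (width - 1))) & mask


def quasi_add(a: int, c: int, m: int, width: int):
    """One quasi-addition; returns the new contents of (A, C)."""
    p = a ^ m                        # a_i XOR m_i
    t = rotl(a & m, width) | c       # a_{i+1} m_{i+1} OR c_i, as seen by stage i
    s = p ^ t                        # quasi-sum digit s_i
    b = rotl(p & t, width)           # carry formed in stage i, stored in stage i-1
    return s, b


def assimilate(s: int, b: int, width: int) -> int:
    """Combine quasi-sum and stored carries by an end-around-carry addition."""
    mask = (1 << width) - 1
    total = s + b
    return (total + (total >> width)) & mask


def encode(units: int, width: int) -> int:
    """One's complement pattern of a signed integer number of units."""
    mask = (1 << width) - 1
    return units & mask if units >= 0 else ~(-units) & mask


WIDTH = 6                            # positions a_-1, a_0 . a_1 ... a_4
A, C = 0b000100, 0b000101            # augend 9/16, held as 4/16 in A and 5/16 in C
for addend in (8, -10, -13, 7):
    A, C = quasi_add(A, C, encode(addend, WIDTH), WIDTH)
    print(f"A={A:06b}  C={C:06b}")
print(f"assimilated A={assimilate(A, C, WIDTH):06b}")
```

No carry travels more than one position in any of the four steps; only the final call to `assimilate` lets carries ripple. The assimilated result is the pattern for $+1/16$, the value stated for this operand sequence in Section 3.24.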
The Boolean expressions defining quasi-addition are:

$$s_i = (a_i \oplus m_i) \oplus (a_{i+1} m_{i+1} \lor c_i) \qquad \text{with } s_n = s_{-1}$$
$$b_{i-1} = (a_i \oplus m_i) \cdot (a_{i+1} m_{i+1} \lor c_i) \qquad \text{with } b_n = b_{-1},$$

where
  $a_i$, $c_i$, $m_i$, $s_i$, $b_i$ are the $i$-th digits of registers A, C, M, S, B
  $s_i$ = quasi-sum digit formed in stage $i$
  $b_{i-1}$ = carry digit formed in stage $i$, stored in stage $(i-1)$
  $i = 0, 1, \ldots, n$.

We note that a carry arises if corresponding digits of A and M, say $a_{i+1}$ and $m_{i+1}$, are both 1. This carry is either absorbed in $s_i$ if $a_i = m_i$, or is automatically propagated over one digital position and stored in $b_{i-1}$ if $a_i \ne m_i$. Previously stored carries, $c_i$, are similarly absorbed or automatically propagated. We shall show that these two types of carries cannot interfere, as a consequence of choosing to break the carry chain of the conventional adder at X. As a further consequence, carries need not be assimilated during a sequence of additions.

It is evident from the expressions defining quasi-addition that $s_i \cdot b_{i-1} = 0$. We also have that C was cleared initially. During a quasi-addition, $a_i$ and $c_i$ are the $s_i$ and $b_i$ of the previous quasi-addition. Hence we also have $a_i \cdot c_{i-1} = 0$ and $a_{i+1} \cdot c_i = 0$. Therefore, $a_{i+1} m_{i+1}$ and $c_i$ cannot simultaneously be 1.

If the result of a sequence of additions is to be transferred from the arithmetic unit, the carries in B are combined with the quasi-sum in S in a process called carry assimilation to obtain a conventional representation of the sum. Interstage carries may propagate through all stages of the assimilator during this process.

3.22 Carry Assimilation

Carry assimilation is the process in which the contents of S and B are combined into A while C is cleared. Assuming that an adder of conventional design is needed for assimilation, we have

$$a_i = s_i \oplus b_i \oplus d_i$$
$$d_{i-1} = s_i (b_i \oplus d_i) \lor b_i d_i \qquad \text{with } d_n = d_{-1},$$

where
  $a_i$
= sum digit assimilated in stage $i$
  $d_{i-1}$ = interstage carry formed in stage $i$, propagated to stage $(i-1)$.

It can be shown, however, that less than a full adder is sufficient for carry assimilation.

We obtain from the equations defining $s_i$ and $b_{i-1}$ that $s_i b_{i-1} = 0$, i.e. that $s_i$ and $b_{i-1}$ cannot both be ones. To show inductively that $s_i b_{i-1} = 0$ implies $b_i d_i = 0$ we note that either $b_i = 0$ for some $i$ or $b_i = 1$ for all $i$. The latter case cannot occur, since we must have both $(a_i \oplus m_i)$ and $(a_{i+1} m_{i+1} \lor c_i)$ equal to one for all $i$ to get $b_i = 1$ for all $i$. Clearly $(a_i \oplus m_i)$ and $(a_{i+1} m_{i+1})$ cannot both be one for all $i$. Hence we must have $c_i = 1$ for all $i$ to get $b_i = 1$ for all $i$. This, in turn, is possible only if $b_i = 1$ for all $i$ as the result of the previous addition, etc. Since the C register is cleared to 0 at the beginning of a sequence of additions, this case will never occur.

If $b_i = 0$ for some $i$, then $b_i d_i = 0$ for that $i$. Since $b_n = b_{-1}$ and $d_n = d_{-1}$, because of the end-around corrections necessary, we can use that $i$ as starting point to prove by induction that $b_i d_i = 0$ follows from $s_i b_{i-1} = 0$. It remains to be shown that $b_{j-1} d_{j-1} = 0$ follows from $b_j d_j = 0$ and $s_j b_{j-1} = 0$. If $b_{j-1} = 1$, then $s_j$ must be 0. Then $d_{j-1} = 0$ since $b_j d_j = 0$ by hypothesis. If $d_{j-1} = 1$, then both $s_j$ and $(b_j \oplus d_j)$ must be 1, since $b_j d_j = 0$ by hypothesis. But if $s_j = 1$, then $b_{j-1} = 0$. Hence, $b_{j-1} d_{j-1} = 0$ for all $j$ follows from $s_i b_{i-1} = 0$.

The carry assimilation equations can therefore be drastically simplified to

$$a_i = s_i \oplus (b_i \lor d_i)$$
$$d_{i-1} = s_i \cdot (b_i \lor d_i) \qquad \text{with } d_n = d_{-1}.$$

Carry assimilation will be fastest if the carry is made to time itself [5], [6]. For this purpose, two signals are used which are complements of each other only after all changes have occurred: a one's-carry $d_i^1$ and a zero's-carry $d_i^0$. Carry assimilation is initiated by enabling assimilation gates $g$.
Completion of the carry assimilation process is indicated by a signal $h$. The carry assimilation equations are modified to incorporate these signals as follows:

$$a_i = s_i \oplus (b_i \lor d_i^1) = s_i \bar{b}_i d_i^0 \lor \bar{s}_i (b_i \lor d_i^1)$$
$$d_{i-1}^1 = g \cdot s_i (b_i \lor d_i^1) \qquad \text{with } d_n^1 = d_{-1}^1$$
$$d_{i-1}^0 = g (\bar{s}_i \lor \bar{b}_i d_i^0) \qquad \text{with } d_n^0 = d_{-1}^0$$
$$h = (d_0^0 \lor d_0^1)(d_1^0 \lor d_1^1) \cdots (d_n^0 \lor d_n^1).$$

The logical diagram of one stage of the carry assimilator is shown in Figure 3.3 below.

[Figure 3.3: i-th Stage of a Carry Assimilator for an Arithmetic Unit with Separate Carry Storage]

3.23 Zero Recognition

Carry assimilation is initiated by enabling the g-gates in both the one's-carry and the zero's-carry output lines of each stage of the assimilator (see Figure 3.3). Now any stage in which $s_i = b_i = 1$ will give rise to a one's-carry, any stage in which $s_i = 0$ will give rise to a zero's-carry, and the stages in which $s_i = 1$, $b_i = 0$ will wait for an incoming carry before an output is produced. To be precise, only a stage in which $s_i = b_i = 0$ should originate a zero's-carry. However, $s_i = 0$, $b_i = 1$ must also produce a zero's-carry since $b_i d_i = 0$ holds and a one's-carry cannot arrive at that stage.

We have therefore that a one's-carry or a zero's-carry will arise in at least one stage, causing carries to be propagated through the remaining stages, unless $s_i = 1$, $b_i = 0$ in all stages. Since carries are propagated in a loop because of the necessary end-around correction, neither a one's-carry nor a zero's-carry will arise in the case $s_i = 1$, $b_i = 0$ for all $i$.

We claim that this case, where $s_i = 1$, $b_i = 0$ for all $i$, is a unique representation, before carry assimilation, of a balance having arithmetic value zero if that value is represented in the machine as "negative zero."

To prove the uniqueness of the representation we assume by way of contradiction that other unassimilated representations of a zero balance exist. We obtain from the original carry assimilation equations, $a_i = s_i \oplus (b_i \lor d_i)$ and $d_{i-1} = s_i (b_i \lor d_i)$, that $a_i d_{i-1}$
$= 0$ for all $i$.

There are two possibilities of representing zero after assimilation:
  (i) positive zero, i.e. $a_i = 0$ for all $i$
  (ii) negative zero, i.e. $a_i = 1$ for all $i$.

(i) If $a_i = 0$ for all $i$ after assimilation, then $s_i = (b_i \lor d_i)$ for all $i$.

  (a) If $s_i = 0$ for all $i$, then $b_i = d_i = 0$ must hold for all $i$. We note that zero's-carries will arise in each stage since $s_i = 0$. The quasi-adder equations indicate, however, that $s_i = b_i = 0$ as a representation of a sum can result only if the augend is represented by $a_i = c_i = 0$ and the addend by $m_i = 0$, for all $i$. Thus, if negative zero is used as the assimilated machine representation of zero, a balance of zero will not be indicated by $s_i = b_i = 0$.

  (b) If $s_i = 0$ for some but not all $i$, then there must exist two adjacent stages such that $s_i = 0$, $s_{i+1} = 1$. Then $(b_{i+1} \lor d_{i+1}) = 1$. Hence $d_i = 1$. However, $d_i = 0$ must hold to obtain $a_i = 0$. Hence $s_i = 0$ for some but not all $i$ cannot occur as an unassimilated representation of positive zero.

  (c) If $s_i = 1$ for all $i$, then $b_i = 0$ for all $i$ since $s_i b_{i-1} = 0$ holds for all $i$. Hence $d_i = 1$ for all $i$. Thus $s_i = d_i = 1$ and $b_i = 0$ for all $i$ would yield positive zero upon assimilation if a one's-carry $d_i$ were to arise in some stage.

(ii) If $a_i = 1$ for all $i$ after assimilation, then $d_{i-1} = 0$ for all $i$. Hence $s_i \oplus b_i = 1$ for all $i$. Since $s_i b_{i-1} = 0$ holds for all $i$, we can represent an unassimilated negative zero only by one of the two following representations:

  (a) $s_i = 0$, $b_i = 1$, with $d_i = 0$ for all $i$. We note that $s_i = 0$, $b_i = 1$ implies that $(a_i \oplus m_i) = 1 = (a_{i+1} m_{i+1} \lor c_i)$ for all $i$, implying in turn that $c_i = 1$ for all $i$. However, since C is set to zero at the beginning of a series of quasi-additions, $c_i = 1$ for all $i$ cannot obtain.

  (b) $s_i = 1$, $b_i = 0$, with $d_i = 0$ for all $i$. We note that $s_i = 1$, $b_i$
$= 0$ for all $i$ would yield negative zero upon assimilation if a zero's-carry were to arise in some stage.

Thus, if $s_i = 1$ and $b_i = 0$ for all $i$, the assimilator will fail to work since each stage waits for a one's-carry or a zero's-carry output of the preceding stage. This condition corresponds uniquely to an unassimilated balance of zero if "negative zero" is used as the conventional machine representation of zero. This choice is consistent since a balance of zero is indicated as negative zero in a conventional adder.

The occurrence of zero must be sensed to enable the machine to proceed. A signal $e$ defined by

$$e = (d_0^0 \lor d_0^1) \lor (d_1^0 \lor d_1^1) \lor \cdots \lor (d_n^0 \lor d_n^1)$$

is 0 when this condition obtains and 1 otherwise (see Figure 3.3). An assimilation as such is not necessary in this case since the result is known to be zero.

3.24 An Example of Quasi-Addition

A sequence of quasi-additions is presented in Figure 3.4 below. The arithmetic unit has been modified to include digital position $a_{-1}$, for overflow detection purposes. An analysis of overflow is presented in Chapter 5.

    A   00.0100   +  4/16   (augend)
    C    0.0101   +  5/16   (augend)
    M    0.1000   +  8/16   (addend)
                                        Add
    S   00.1001   +  9/16
    B    0.1000   +  8/16
                                        Shift down
    A   00.1001   +  9/16
    C    0.1000   +  8/16
    M    1.0101   - 10/16
                                        Add and shift down
    A   11.0110   -  9/16
    C    1.0000   + 16/16
    M    1.0010   - 13/16
                                        Add and shift down
    A   11.0001   - 14/16
    C    0.1000   +  8/16
    M    0.0111   +  7/16
                                        Add and shift down
    A   11.1100   -  3/16
    C    0.0100   +  4/16
                                        Assimilate
    A   00.0001   +  1/16

    (9/16) + (+8/16) + (-10/16) + (-13/16) + (+7/16) = +1/16

Figure 3.4 A Sequence of Additions in an Arithmetic Unit with Separate Carry Storage

3.3 Quasi-Subtractor with Separate Borrow Storage

3.31 Quasi-Subtraction

The Boolean expressions defining the formation of the difference and borrow digits in a conventional arithmetic unit using a subtractor are:

$$s_i = a_i \oplus m_i \oplus y_i$$
$$y_{i-1} = y_i \overline{(a_i \oplus m_i)} \lor \bar{a}_i m_i \qquad \text{with } y_n = y_{-1},$$

where
  $a_i$ = $i$-th digit of minuend
  $m_i$ = $i$-th digit of subtrahend
  $s_i$
= difference digit formed in stage $i$
  $y_{i-1}$ = borrow digit formed in stage $i$, propagated to stage $(i-1)$.

The register arrangement of Figure 3.2 above can be used. Arguments dual to the ones used in connection with quasi-addition are employed in the analysis of quasi-subtraction.

The Boolean expressions defining quasi-subtraction are:

$$s_i = (a_i \oplus m_i) \oplus (\bar{a}_{i+1} m_{i+1} \lor c_i) \qquad \text{with } s_n = s_{-1}$$
$$b_{i-1} = \overline{(a_i \oplus m_i)} \cdot (\bar{a}_{i+1} m_{i+1} \lor c_i) \qquad \text{with } b_n = b_{-1},$$

where
  $a_i$, $c_i$, $m_i$, $s_i$, $b_i$ are the $i$-th digits of registers A, C, M, S, B
  $s_i$ = quasi-difference digit formed in stage $i$
  $b_{i-1}$ = borrow digit formed in stage $i$, stored in stage $(i-1)$
  $i = 0, 1, \ldots, n$.

By duality, borrows of successive subtractions do not interfere because C was cleared initially and because $\bar{s}_i b_{i-1} = 0$.

3.32 Borrow Assimilation

Less than a full subtractor is sufficient for borrow assimilation, by duality to the case of carry assimilation, since $\bar{s}_i b_{i-1} = 0$ implies $b_i d_i = 0$ for all $i$.

Two borrow signals, assimilation gates $g$, and a borrow assimilation completion signal $h$ are used:

$$a_i = s_i \bar{b}_i d_i^0 \lor \bar{s}_i (b_i \lor d_i^1)$$
$$d_{i-1}^1 = g \cdot \bar{s}_i (b_i \lor d_i^1) \qquad \text{with } d_n^1 = d_{-1}^1$$
$$d_{i-1}^0 = g (s_i \lor \bar{b}_i d_i^0) \qquad \text{with } d_n^0 = d_{-1}^0$$
$$h = (d_0^0 \lor d_0^1)(d_1^0 \lor d_1^1) \cdots (d_n^0 \lor d_n^1).$$

3.33 Zero Recognition

The condition $s_i = b_i = 0$ for all $i$ corresponds uniquely to an unassimilated balance of zero if "positive zero" is used as the conventional machine representation of zero. Since the borrow assimilator fails to work in this case, the signal $e$ defined in the carry assimilator section must be used to enable the machine to proceed.

3.4 Quasi-Adder-Subtractor with Separate Carry-Borrow Storage

3.41 Quasi-Addition-Subtraction

It is evident that the Boolean expression defining the formation of the sum digit in conventional addition is equivalent to that defining the formation of the difference digit in conventional subtraction. The expressions
The expressions for the carry and borrow digits differ in the complementation of one term. If w is a signal determining whether addition or subtraction is to be performed, then the following Boolean expressions, obtained by combining the expressions for adder and subtractor by means of w, define the operation of a conventional adder-subtractor:

$s_i = a_i \oplus m_i \oplus x_i$
$x_{i-1} = x_i (w \oplus a_i \oplus m_i) \lor (w \oplus a_i) m_i$ with $x_n = x_{-1}$

where
w = 0 for addition
  = 1 for subtraction
$a_i$ = i-th digit of augend or minuend
$m_i$ = i-th digit of addend or subtrahend
$s_i$ = sum or difference digit formed in stage i
$x_{i-1}$ = carry or borrow digit formed in stage i, propagated to stage (i-1)
i = 0, 1, ..., n

Note that carries or borrows are completely propagated before the next operation and can therefore not interfere with each other.

Although a conventional adder-subtractor could be modified to use separate carry and borrow storage, it seems easier to develop the expressions describing the operations in a quasi-adder-subtractor directly, especially since advantage should be taken of cancellations between carries and borrows.

A close look at the expressions defining quasi-addition and quasi-subtraction reveals that carries and borrows arise under mutually exclusive conditions and can therefore be stored in the same register without interfering with each other. Some means to remember the sign of these carry-borrows has to be provided, of course. It can be shown that the sign of carry-borrow $b_{i-1}$ is implicitly determined by the value of quasi-sum-difference digit $s_i$.

The argument is presented in tabular form; Table 3.1 below summarizes the operation of the quasi-adder-subtractor. The signal w is 0 for addition, 1 for subtraction. The carry-borrow register C is cleared to zeros at the beginning of a series of additions and subtractions. The first operation may cause carries or borrows to arise. In all cases, the sign of the carry-borrow $b_{i-1}$
is seen to be determined by the value of the quasi-sum-difference digit $s_i$: $b_{i-1}$, originating in stage i, has weight +1 if $s_i = 0$ and weight -1 if $s_i = 1$. This relationship remains invariant, of course, when the contents of S and B are shifted into A and C.

It is interesting to note that borrows can arise while the quasi-adder-subtractor is used for quasi-addition, and carries can arise while it is used for quasi-subtraction. Since, similar to quasi-addition and quasi-subtraction, "automatic" carries and borrows arise under certain conditions and are propagated over one digital position, cancellations with previously stored carries and borrows may or may not occur. Thus, a previously stored borrow must be propagated as borrow during an addition if a cancelling automatic carry is not propagated into that digital position. Similarly, a stored borrow, which should make an automatic carry from that digital position unnecessary, is propagated as borrow so that the automatic carry need not be blocked in that case.

Since carries and borrows can be stored in the same register, the register arrangement of Figure 3.2 can be used.

Table 3.1 Operation of the i-th Stage of a Quasi-Adder-Subtractor

                                      w = 0 (Add)        w = 1 (Subtract)
a_i  m_i  a_{i+1}  m_{i+1}  c_i       b_{i-1}  s_i       b_{i-1}  s_i
 0    0     0        0       0           0      0           0      0
 0    0     0        0      +1           0      1           0      1
 0    0     0        1       0           0      0          -1      1
 0    0     0        1      +1           0      1           0      0
 0    0     1        0       0           0      0           0      0
 0    0     1        0      -1          -1      1          -1      1
 0    0     1        1       0           0      1           0      0
 0    0     1        1      -1           0      0          -1      1
 0    1     0        0       0           0      1         x  0     1
 0    1     0        0      +1          +1      0         x +1     0
 0    1     0        1       0           0      1         x  0     0
 0    1     0        1      +1          +1      0         x  0     1
 0    1     1        0       0           0      1         x  0     1
 0    1     1        0      -1           0      0         x  0     0
 0    1     1        1       0          +1      0         x  0     1
 0    1     1        1      -1           0      1         x  0     0
 1    0     0        0       0           0      1           0      1
 1    0     0        0      +1          +1      0          +1      0
 1    0     0        1       0           0      1           0      0
 1    0     0        1      +1          +1      0           0      1
 1    0     1        0       0           0      1           0      1
 1    0     1        0      -1           0      0           0      0
 1    0     1        1       0          +1      0           0      1
 1    0     1        1      -1           0      1           0      0
 1    1     0        0       0         y  0     0           0      0
 1    1     0        0      +1         y  0     1           0      1
 1    1     0        1       0         y  0     0          -1      1
 1    1     0        1      +1         y  0     1           0      0
 1    1     1        0       0         y  0     0           0      0
 1    1     1        0      -1         y -1     1          -1      1
 1    1     1        1       0         y  0     1           0      0
 1    1     1        1      -1         y  0     0          -1      1

x) During subtraction, $a_i = 0$, $m_i = 1$ produces an automatic borrow into position (i-1).
y) During addition, $a_i = m_i = 1$ produces an automatic carry into position (i-1).
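The entries of Table 3.1 can be regenerated by treating each stage as a small signed sum. The sketch below is an interpretation rather than the report's own derivation: it assumes that the stage keeps the residual $a_i \oplus m_i$ after sending its own automatic carry or borrow onward, receives the automatic carry or borrow of stage (i+1), and adds the stored carry-borrow $c_i$, whose sign is fixed by $a_{i+1}$. The names `stage`, `auto_out`, and `rows` are hypothetical.

```python
from itertools import product


def auto_out(w, a, m):
    """Automatic carry (w = 0) or borrow (w = 1) sent by a stage to position (i-1)."""
    return (w ^ a) & m


def stage(w, a, m, a_next, m_next, c):
    """Return (b, s): signed carry-borrow b_{i-1} and quasi-sum-difference digit s_i."""
    residual = a ^ m                       # what remains once the automatic digit has left
    auto_in = auto_out(w, a_next, m_next)  # automatic digit arriving from stage (i+1)
    total = residual + (-auto_in if w else auto_in) + c
    s = total % 2                          # Python's % maps -1 to 1
    b = (total - s) // 2                   # +1 carry, -1 borrow, or 0
    return b, s


def rows():
    """All input combinations that can occur: c_i is a carry only if a_{i+1} = 0."""
    for a, m, a_next, m_next in product((0, 1), repeat=4):
        for c in ((0, 1) if a_next == 0 else (0, -1)):
            yield a, m, a_next, m_next, c


for row in rows():
    a, m, a_next, m_next, c = row
    for w in (0, 1):
        b, s = stage(w, *row)
        sign = -1 if w else 1
        # Weight is conserved: inputs equal outputs once the automatic digit is counted.
        lhs = a + sign * m + sign * auto_out(w, a_next, m_next) + c
        rhs = s + 2 * b + sign * 2 * auto_out(w, a, m)
        assert lhs == rhs
        # A stored carry always accompanies s = 0, a stored borrow s = 1.
        assert b == 0 or s == (0 if b == 1 else 1)
```

Under these assumptions every one of the 32 rows satisfies the sign rule stated in the text, for addition and for subtraction alike.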
The simplified Boolean expressions defining the operation of a quasi-adder-subtractor are:

$s_i = a_i \oplus m_i \oplus (w \oplus a_{i+1}) m_{i+1} \oplus c_i$ with $s_n = s_{-1}$
$b_{i-1} = (w \oplus a_{i+1}) [ (w \oplus a_i \oplus m_i) m_{i+1} \bar{c}_i \lor \overline{(w \oplus a_i \oplus m_i)} \bar{m}_{i+1} c_i ] \lor \overline{(w \oplus a_{i+1})} (w \oplus a_i \oplus m_i) c_i$ with $b_n = b_{-1}$

where
w = 0 for addition
  = 1 for subtraction
$a_i, c_i, m_i, s_i, b_i$ are the i-th digits of registers A, C, M, S, B
$s_i$ = quasi-sum-difference digit formed in stage i
$b_{i-1}$ = carry-borrow digit formed in stage i, stored in stage (i-1)
$b_{i-1}$ is a carry if $s_i = 0$
$b_{i-1}$ is a borrow if $s_i = 1$
i = 0, 1, ..., n

3.42 Carry-Borrow Assimilation

For purposes of transferring the sum-difference, the carry-borrows in B are combined with the quasi-sum-difference in S in a process called carry-borrow assimilation, the rules of which are summarized in Table 3.2 below. The column headed by $s_{i+1}$ is necessary to determine the sign of $b_i$.

Table 3.2 Operation of the i-th Stage of a Carry-Borrow Assimilator

s_i   b_i   s_{i+1}   d_i     d_{i-1}   a_i
 0     0      0        0         0       0
 0     0      0       -1        -1       1
 0     0      1        0         0       0
 0     0      1       +1         0       1
 0    +1      0        0         0       1
 0    +1      0       -1         0       0
 0    -1      1        0        -1       1
 0    -1      1       +1         0       0
 1     0      0        0         0       1
 1     0      0       -1         0       0
 1     0      1        0         0       1
 1     0      1       +1        +1       0
 1    +1      0        0        +1       0
 1    +1      0       -1         0       1
 1    -1      1        0         0       0
 1    -1      1       +1         0       1

We shall now show that $s_{i+1}$ also determines the sign of the interstage carry-borrow $d_i$.

If $s_{i+1} = 0$, an interstage carry-borrow $d_i$ of value +1, i.e. a carry, can arise from stage (i+1) only if $b_{i+1} = d_{i+1} = +1$, but $b_{i+1} = +1$ implies $s_{i+2} = 0$. Thus, to get $d_{i+1} = +1$, we must have $b_{i+2} = d_{i+2} = +1$. By induction on i, remembering that carry-borrows are propagated in a loop because of the necessary end-around correction, we find that all stages are alike, i.e. $s_i = 0$, $b_i = 1$ for all i. There is no way for an interstage carry-borrow to arise. Thus $d_i$ cannot be +1 if $s_{i+1} = 0$.

If $s_{i+1} = 1$, an interstage borrow $d_i$ can arise from stage (i+1) only if $b_{i+1} = d_{i+1} = -1$. But $b_{i+1} = -1$ implies $s_{i+2} = 1$. Thus, to get $d_{i+1} = -1$ we must have $b_{i+2} = d_{i+2} = -1$.
By induction on i we find, analogous to the above argument for $s_{i+1} = 0$, that all stages are alike and that there is no way for an interstage carry-borrow to arise. Thus, $d_i$ cannot be -1 if $s_{i+1} = 1$.

We conclude therefore that $s_{i+1} = 0$ implies that $d_i$ is an interstage borrow, while $s_{i+1} = 1$ implies that $d_i$ is an interstage carry. We found previously that $s_{i+1} = 0$ implies that $b_i$ is a stored carry, while on the other hand $s_{i+1} = 1$ implies that $b_i$ is a stored borrow. Hence, a stored carry and a propagated carry, or a stored borrow and a propagated borrow, cannot coincide during carry-borrow assimilation.

The Boolean expressions defining the formation of the i-th sum-difference and interstage carry-borrow digit in a carry-borrow assimilator as obtained from Table 3.2 are:

$a_i = s_i \oplus b_i \oplus d_i$
$d_{i-1} = \bar{d}_i b_i (s_i \oplus s_{i+1}) \lor d_i \bar{b}_i \overline{(s_i \oplus s_{i+1})}$ with $d_n = d_{-1}$

where
$a_i$ = sum-difference digit assimilated in stage i
$d_{i-1}$ = interstage carry-borrow formed in stage i, propagated to stage (i-1)
i = 0, 1, ..., n

Two carry-borrow signals, assimilation gates g, and an assimilation completion signal h are used to make carry-borrow assimilation as fast as possible:

$a_i = s_i \oplus (b_i d_i^0 \lor \bar{b}_i d_i^1)$
$d_{i-1}^1 = g\, d_i^0 b_i (s_i \oplus s_{i+1}) \lor g\, d_i^1 \bar{b}_i \overline{(s_i \oplus s_{i+1})}$ with $d_n^1 = d_{-1}^1$
$d_{i-1}^0 = g (s_i \oplus s_{i+1} \oplus b_i) \lor g (b_i d_i^1 \lor \bar{b}_i d_i^0)$ with $d_n^0 = d_{-1}^0$

3.43 Zero Recognition

It can be seen from the defining equations that one's-carry-borrow outputs depend on the inputs. Zero's-carry-borrows however will arise in one or several stages, and thus cause carry-borrows to be propagated, unless $s_i \oplus s_{i+1} \oplus b_i = 0$ holds for all i.

We conjecture that the conditions for which carry-borrow assimilation fails to work are exactly those corresponding to an unassimilated representation of a zero balance.
This conjecture is based partly on the results obtained previously for carry assimilation in the quasi-adder and borrow assimilation in the quasi-subtractor, and partly on the fact that no examples could be produced such that an unassimilated version of zero failed to satisfy the condition $s_i \oplus s_{i+1} \oplus b_i = 0$, if C was ever cleared to zeros.

3.5 Summary

The speed of conventional parallel arithmetic units using e.g. an adder is ultimately limited by the time required to let carries propagate through the stages of the adder for each addition. Since multiplications and divisions are sequenced as conditional additions, subtractions, and shifts, attention is focused on making additions and subtractions faster.

A sequence of additions can be speeded up if separate carry storage is provided. Carries arising from an addition need not be propagated over more than one digital position. Numbers are represented with carries unassimilated whenever possible. Carries have to be assimilated only when results are to be transferred. Of course, interstage carries arising during assimilation may propagate through the stages of the assimilator.

In this chapter we have designed an arithmetic unit using a quasi-adder with separate carry storage, and have extended the design to a quasi-subtractor with separate borrow storage as well as a quasi-adder-subtractor with separate but coincident carry-borrow storage. Assimilators have been designed for each arithmetic unit and have been found to exhibit the feature that the assimilator fails to assimilate balances of zero under certain conditions. It is thus possible to recognize a balance of zero.

In the following chapter these arithmetic units will be modified and extended to permit multiplications and divisions to be performed.
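The idea summarized in this chapter, namely keeping carries in a separate register and assimilating them only once at the end, can be illustrated with a generic carry-save accumulator in one's complement. This is a sketch under stated assumptions and not the quasi-adder of Section 3.2: it uses an ordinary full-adder cell per stage, a hypothetical word length `N = 6` (sign, one overflow position, four fractional digits), and operands expressed in units of 1/16. The end-around correction appears as a one-position left rotation of the carry word.

```python
N = 6                      # assumed word length for this illustration
MASK = (1 << N) - 1        # all ones, i.e. "negative zero"


def encode(v):
    """One's complement encoding of a small integer (units of 1/16)."""
    return v & MASK if v >= 0 else ~(-v) & MASK


def decode(word):
    """Inverse of encode; both representations of zero decode to 0."""
    return -(~word & MASK) if word >> (N - 1) else word


def rotl1(word):
    """Rotate left by one position: a carry out of the top stage re-enters at the bottom."""
    return ((word << 1) & MASK) | (word >> (N - 1))


def carry_save_add(s, c, m):
    """Add m to the pair (s, c) without propagating carries past one position."""
    new_s = s ^ c ^ m
    majority = (s & c) | (s & m) | (c & m)
    return new_s, rotl1(majority)


def assimilate(s, c):
    """Combine sum and stored carries with a single end-around carry."""
    t = s + c
    return (t & MASK) + (t >> N)


def accumulate(values):
    s, c = encode(values[0]), 0
    for v in values[1:]:
        s, c = carry_save_add(s, c, encode(v))
    return decode(assimilate(s, c))


# The operand sequence of Figure 3.4, in sixteenths.
print(accumulate([9, 8, -10, -13, 7]))  # prints 1, i.e. +1/16
```

Each `carry_save_add` costs one stage delay regardless of word length; only `assimilate` involves full carry propagation, which mirrors the trade-off described above.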
4 MULTIPLICATION AND DIVISION IN ARITHMETIC UNITS WITH SEPARATE CARRY OR BORROW STORAGE

4.1 Introduction

Although multiplication and division are operations which yield inverse results, the methods employed to effect multiplication and division in a particular computer are not necessarily inverse. Most conventional computers sequence both operations from additions, subtractions, and shifts.

The formation of the quotient digits in restoring division depends on the outcome of a trial subtraction (if divisor and dividend have like signs) of divisor from a partial remainder. Division in general is a necessarily sequential operation which requires that the most significant digit of the quotient be formed first. This implies that left shifts are used in division.

Multiplication in its simplest form is performed in an inverse fashion by adding (if multiplier and multiplicand have like signs) the multiplicand conditionally, depending on the value of a multiplier digit, to a partial product and shifting. However, the first multiplier digit sensed can be either the least significant digit, implying that right shifts are used in multiplication, or the most significant digit, implying that left shifts are used in multiplication.

In a refinement of this multiplication method the multiplier digits are recoded into reversed ternary notation to minimize the number of additions or subtractions per multiplication. An analogous method of division can be devised yielding a quotient in reversed ternary notation. Nevertheless, the number of additions and subtractions per division is higher than in the corresponding multiplication. Again, the most significant quotient digit must be formed first, while the digit at either end of the multiplier can be sensed first.
In fact, multiplication can be performed, at high speed and great cost, in a device which yields the product in essentially one step, or by a hybrid method combining features of this device with features of other schemes, whereas there exist no corresponding high-speed schemes for division.

In any case, the arithmetic units capable of performing additions and subtractions have to be extended to (2n + 1) stages to accommodate (2n + sign)-digit products resulting from the multiplication of two (n + sign)-digit operands, and (2n + sign)-digit dividends resulting in an (n + sign)-digit quotient upon division by an (n + sign)-digit divisor. Shift facilities have to be provided.

In general, multiplier and quotient are held in register Q, a register with shifting facilities. In some very restrictive cases (when the accumulator remains positive) the accumulator extension can be shared between digits of the final product and the not-yet-sensed multiplier digits, or between digits of the dividend and digits of the quotient.

Unless division is considered unimportant in comparison to multiplication, multiplication and division methods which yield inverse results and which share the same facilities to as large an extent as possible are commonly preferred. Considerations such as the desirability of both right and left shift facilities and the nature of the equipment required for the extended sections enter the choice of methods here, aside from the desire to make the processes as fast as possible.

The schemes analyzed here are chosen because they represent general methods of multiplication and division and yield inverse results, because the division yields a truncated quotient and corresponding remainder, and because the multiplication method employs time-saving features.
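The recoding of multiplier digits into the values +1, 0, -1 mentioned in this introduction can be illustrated with the standard non-adjacent form. This is a minimal sketch assuming a non-negative integer multiplier; it is not necessarily identical in every detail to the recoding rules the report adopts, and the function names are hypothetical.

```python
def recode(y):
    """Signed digits in {-1, 0, +1}, least significant first, for an integer y >= 0."""
    digits = []
    while y != 0:
        if y & 1:
            d = 2 - (y & 3)   # +1 when y mod 4 == 1, -1 when y mod 4 == 3
            y -= d            # leaves y divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        y >>= 1
    return digits


def value(digits):
    """Arithmetic value of a signed-digit string, least significant digit first."""
    return sum(d << i for i, d in enumerate(digits))


digits = recode(0b1011)                # multiplier 11, i.e. 0.1011 = 11/16
print(digits)                          # [-1, 0, -1, 0, 1]
assert value(digits) == 11
# No two adjacent digits are both nonzero, so every add or subtract step
# is followed by a step that only shifts.
assert all(a == 0 or b == 0 for a, b in zip(digits, digits[1:]))
```

For the multiplier 0.1011 this yields the digits +1, 0, -1, 0, -1 from most to least significant: three additions or subtractions, the same count as the three ones of the binary form. The saving shows on longer runs of ones, for example 0111 recodes to +1, 0, 0, -1.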
4.2 Extended Arithmetic Units

It is evident that an (n + 1)-digit number in one's complement notation is extended to (2n + 1) digits in either direction by setting the extended digits equal to the sign digit. We note in comparison that a number in two's complement notation is extended in the same way at the most significant end, but at the least significant end by appending zeros.

Multiplications and divisions require that the accumulator be extended, with shift facilities, to (2n + 1) positions. Properly, adder or subtractor and the MCC (Number Register-Complement Gate complex) have to be extended to (2n + 1) stages as well. In particular, the end-around corrections now affect stage (2n). Hence, the initial contents of the accumulator must be extended correctly.

If we assume, however, that m, the number at the output of the MCC, is always positive, then the extended stages of adder or subtractor will always receive inputs of $m_i = 0$ and hence can be simplified to a carry or borrow propagation circuit. This circuit is necessary to take care of end-around corrections which occur when the quantity in the accumulator changes sign.

Evidently, if the number at the MCC output is not always positive, then the inputs $m_i$ to each stage of the adder or subtractor extension are connected to $m_0$, the sign digit of the number at the MCC output. The MCC of course need not be extended de facto in either case.

In comparison, the two's complement system does not require a propagation circuit associated with the extended positions of the accumulator, regardless of the sign of m, and the accumulator extension can be shared between multiplier digits and digits of the final product, if the accumulator is extended at the least significant end.

The analysis ignores here stages necessary for overflow detection.
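The extension rule at the least significant end can be verified numerically. The sketch below assumes fractional numbers written as a list of digits, sign digit first, and uses exact rational arithmetic; the helper names are illustrative only.

```python
from fractions import Fraction


def fraction_part(bits):
    """Sum of bits[i] * 2^-i over the non-sign digits."""
    return sum(Fraction(b, 2 ** i) for i, b in enumerate(bits) if i > 0)


def ones_value(bits):
    """Value of a one's complement fraction; the sign digit has weight -(1 - 2^-n)."""
    n = len(bits) - 1
    return fraction_part(bits) - bits[0] * (1 - Fraction(1, 2 ** n))


def twos_value(bits):
    """Value of a two's complement fraction; the sign digit has weight -1."""
    return fraction_part(bits) - bits[0]


def extend_with_sign(bits, extra):
    return bits + [bits[0]] * extra


def extend_with_zeros(bits, extra):
    return bits + [0] * extra


x_ones = [1, 0, 1, 1, 0]   # 1.0110 = -9/16 in one's complement
x_twos = [1, 0, 1, 1, 1]   # 1.0111 = -9/16 in two's complement

# One's complement: copy the sign digit into the extension.
assert ones_value(extend_with_sign(x_ones, 4)) == ones_value(x_ones) == Fraction(-9, 16)
assert ones_value(extend_with_zeros(x_ones, 4)) != ones_value(x_ones)

# Two's complement: append zeros at the least significant end.
assert twos_value(extend_with_zeros(x_twos, 4)) == twos_value(x_twos) == Fraction(-9, 16)
assert twos_value(extend_with_sign(x_twos, 4)) != twos_value(x_twos)
```

The extended one's complement pattern 1.01101111 therefore still represents -9/16, which is why the extended stages must receive the sign digit rather than zeros when the operand may be negative.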
k2 4.3 Multiplication ^.31 Method of Multiplication A general type of multiplication is an unrestricted hold multipli- cation yielding a product p defined "by p = xy + 2'^ Pq where p = final product X = multiplicand y = multiplier P- = initial contents of accumulator n = numher of non-sign digits of multiplier or multiplicand. We note that partial products may change sign in this type of multiplication if the initial (hold) contents of the accumulator is not restricted or if the hinary multiplier is recoded into reversed ternary notation such that the total numher of operations per multiplication is minimized by allowing the multiplicand to be subtracted as well as added. The operations performed in each step of the multiplication can be expressed in form of a recursion relationship as where k = recursion index, k = 0, 1, ..., n-1 p„ = initial contents of accumulator p = k partial product X = multiplicand f , = recoded multiplier digit with values 0, +1, or -1, n— K h3 An (n + l) step is necessary to form the final product, p . This step. when k = n, is descrihed by P .T = (P + X • f„) . ^n+1 ^ *^n ' In each step it is therefore necessary to generate a recoded multi- plier digit f , with values 0, +1, or -1, and to right shift p, , add x to p and right shift the sum, or subtract x from p and right shift the difference, respectively. The right shift is omitted when k = n. The recoded multiplier digit, f , is generated during step k by sensing two multiplier digits, y , , and y , , and a mode digit, r , , . ^ J/ to ; ■'n-k-l ''n-k' ^ ' n-k+1 A new mode digit, r , is also formed during step k. The rules for the generation of f , and r , are summarized in Table ^.1 below. ^ n-k n-k Table ^.1 Rules for the Generation of Recoded Multiplier Digits and Mode Digits during Step k of a Multiplication ^n-k+1 ^n-k-1 ^n-k or + 1 mode '1 1 1 1 1. 
1 1 or - 1 1 mode 1 1 1 1 1 n-k Vk +1 -1 1 +1 1 -1 1 1 r , = mode digit formed in step k, used in step (k + l) y , = multiplier digit ''n-k ^ f , = recoded multiplier digit n-k ^ ° k = 0, 1, . . ., n kk It is necessary to set the mode initially to agree with the sign of the multiplier, i.e. r .. = y„. It is also necessary to duplicate y_ to obtain y , in the last step. Negative multipliers are found to ohey the same rules as positive multipliers. In either case, (n + l) steps are performed. The rules show incidentally that a step, in which an addition or subtraction and a right shift is performed, is followed by a step in which only a right shift is performed. Thus, shifting over two digital positions could be employed and the multiplier could be recoded into a reversed quinary notation. With some additional equipment the speed of multiplication could then be further increased. It can be shown that the arithmetic value of the multiplier in reversed ternary notation, obtained by means of the above rules, is equal to the arithmetic value of the binary multiplier. The proof is based on the fact that the arithmetic value of a string of, say, q ones in the binary multiplier represented by "2l-_ -i ^ • 1 is equal to 2 - 2 , the arithmetic value of the recoded multiplier. A more detailed analysis yields Since the recursion relation for partial remainders implies Pi = 2"^ p„ + X • y ^ T 2'^ f . . , < k < n, ^k ^0 <^ 1=1 n-k+i' - ' it follows that the final product, p -,, is p T = p + X • f _ ^n+1 n = 2-%o--ZLo2'^^i 2 Pq + xy. 4.32 Multiplication in Conventional Arithmetic Units The type of multiplication described in the preceding section can he performed in the following arithmetic unit designs, provided the registers are extended properly to (2n + l) stages. In all cases an (n + l)-stage shifting register Q is used to store the multiplier. (a) Arithmetic unit with adder, subtraction being performed by means of the Complement Gate . 
Here m is not always positive, and the extended stages of the adder are full adder stages.
(b) Dual of scheme (a), using a subtractor.
(c) Arithmetic unit with adder-subtractor. The Complement Gate is used to make m positive. The extended section of the adder-subtractor is simplified to a carry-borrow propagation circuit. The setting of the Complement Gate is offset by e.g. reversing the sensing of the multiplier digits.
(d) Arithmetic unit with adder, subtraction being performed by complementing the accumulator, adding m, and recomplementing the accumulator. The Complement Gate is used to make m positive. The extended section of the adder is simplified to a carry propagation circuit. However, complementing facilities for the accumulator are necessary.
(e) Dual of scheme (d), using a subtractor.
(f) Use of an accumulator extension which has an independent sign. Then adder or subtractor need not be extended at all, but a complex correction of the digit about to be shifted into the extension is necessary, involving a possible second use of adder or subtractor depending on the signs of accumulator and extension. The final product may have different signs associated with the most and least significant halves.

4.33 Multiplication in an Arithmetic Unit with Separate Carry or Borrow Storage

An application of the conventional multiplication schemes to arithmetic units with separate carry or borrow storage is limited by the fact that the easy complementation feature of the one's complement system, one of the strongest arguments for use of that system in conventional computers, is lost when numbers are represented in unassimilated form. Thus designs (d) and (e) cannot be realized here.

Design (f) is not desirable because it is generally necessary to perform at least a partial assimilation to determine the sign of the quantity in the accumulator. The sign then determines whether a correction is necessary.
Hence, designs analogous to (a), (b), or (c) can be used here. Registers A, C, S, and B have to be extended to (2n + l) stages. Register Q is used to store the multiplier » In design (a), the quasi-adder and carry assimilator are extended to ^n + 1) stages. In design (b)^ the quasi- sub tractor and borrow assimilator are ex- tended to (2n + 1) stages. In design (c)^ the quasi -adder- subtractor is extended by simplified stages. The carry-borrow assimilator is extended fully. It may be possible to simplify these designs by representing the extended section of numbers in assimilated form. However, carries or borrows resulting from end-around corrections must then be permitted to propagate through the extended section. It is incidentally necessary to perform a one-stage assimilation of positions s^ and b upon right shift, since the combined weight of these digits IS 2 In the quasi-adder this necessity arises when s^ = "b^ = 1. But ^ 2n 2n then h2^_^ = and s^ = o''" . Thus, a^^ = s^^_^, c^^ = h^^^^V s^^h^^. In the quasi- subtracter we have by analogy 2n 2n-l' 2n 2n-l 2n 2n In the quasi -adder -subtracter an interesting analysis yields a^ = s_ ,, c^ = "b^ T © (s^ @ s_) b- . 2n 2n-l' 2n 2n-l ^ 2n 2n 4.34 An Example of Multiplication We present in Figure i4-.l below an example of a hold multiplication with a recoded multiplier, performed in an arithmetic unit with separate carry storage of design (a). k.k Division il-.^l Method of Division One of the more desirable schemes of division is one yielding a truncated quotient and a corresponding remainder such that the remainder is less in absolute value than the divisor and has the same sign as the dividend. This division method yields a quotient y defined by xy + 2 r = r^ where X = divisor y = quotient r^-^ = dividend r = final remainder n = number of non-sign digits of divisor or quotient. The arithmetic units will be extended to s , later for overflow detection purposes. If8 X = 1 . Y = . 
1 ^0 = 1 o 1 110= -9/16 Multiplicand in M 11= +11/16 Multiplier in Q 10= -5/16 Hold Contents in A,C :ension Q 0.1011 P = X • Y + 2" P = (-9/16) (+11/16) + 1/16 (-5/16) = -IOJ+/256 1 1 , 1 1 Extension A 1 1 1 1 C . M , 1 1 Add S . 1 1 1 1 1 1 B . 1 Shift rig ht twice A . 1 1 1 1 C . 1 M . 1 1 Add S . 1 1 1 1 1 B . 1 Shift right twice A . 1 1 1 1 C . 1 M 1 , 1 1 1 1 1 1 Add S 1 1 . 1 1 1 B . 1 1 Assimilate A 1 1 „ 1 1 1 1 1 F +1 • 0-1 0-1 '3 = -1: add complement of X and shift right 0: shift right F • 10 +1 -1 ^2 *1 ^ -1: add complement of X and shift right 0: shift right Q , +1 f = +1: add X Figure 4„1 Hold Multiplication in an Arithmetic Unit with Separate Carry Storage h9 This scheme of "tidy" division is difficult to realize if non-restoring division is used since it is generally necessary to perform a final correction of the quotient during which the remainder is destroyed. We note that restoring division is not significantly slower in an asynchronous arithmetic unit because the partial remainder is still available in the accumulator if it is found that the tentative partial remainder (TPR), i.e. the difference (in absolute value) of partial remainder and divisor, is unacceptable. Thus, re-adding (in absolute value) the divisor to the tentative partial remainder to "restore" it can be effected by a simple gating operation. No modification of the quotient is necessary in the tidy division scheme if restoring division is used and if a tentative partial remainder of zero is always accepted. In restoring division, the choice of whether the divisor is added to or subtracted from the partial remainder to form a TPR is made initially such that the difference in absolute value of partial remainder and divisor is formed. Wow the TPR is accepted or rejected according to whether its sign and that of the dividend agree or disagree, and the quantity is then shifted left to form the new partial remainder. 
It follows that the partial remainders have the same sign as the dividend.

The operations performed in step k of tidy division are in detail:

(1) Form a TPR: $\mathrm{TPR}_k = r_k \pm x$. Find the sign of $\mathrm{TPR}_k$.
    If sign of $\mathrm{TPR}_k$ and sign of dividend $r_0$
        agree: accept $\mathrm{TPR}_k$
        disagree: reject $\mathrm{TPR}_k$.
    If $\mathrm{TPR}_k$ is zero, accept $\mathrm{TPR}_k$.
(2) Shift partial quotient $Y_k$ left and insert a quotient digit in $q_n$:
    If sign of $\mathrm{TPR}_k$ and sign of divisor x
        agree: set $q_n = 1$
        disagree: set $q_n = 0$.
(3) Shift partial remainder left.

In the (n + 1)st step the same rules are obeyed except that the left shift of the partial remainder, i.e. rule (3), is omitted. It follows from the rules of division stated above that the final remainder is less than the divisor in absolute value and has the same sign as the dividend.

The formation of the successive partial remainders $r_k$ can be expressed in form of a recursion relationship as

$r_{k+1} = 2 (r_k \pm x \cdot T_k)$

where
k = recursion index, k = 0, 1, ..., n-1
$r_k$ = k-th partial remainder
$r_0$ = dividend
x = divisor
$T_k$ = acceptance factor
      = 1 if $|r_k| \ge |x|$
      = 0 if $|r_k| < |x|$.

For the last step, when k = n,

$r = r_{n+1} = r_n \pm x \cdot T_n$

where r = final remainder.

The choice of whether $x \cdot T_k$ is added to or subtracted from $r_k$ is made initially such that

$r_{k+1} = 2 (r_k + x \cdot T_k)$ if signs of $r_0$ and x disagree
$r_{k+1} = 2 (r_k - x \cdot T_k)$ if signs of $r_0$ and x agree.

The insertion of quotient digits and the formation of successive partial quotients can be expressed in form of a recursion relationship as

$Y_{k+1} = 2 Y_k + 2^{-n} y_k$

where
k = recursion index, k = 0, 1, ..., n-1, n
$Y_k$ = k-th partial quotient, with $Y_0 = 0$
$y_k$ = quotient digit inserted into position $q_n$ during step k
      = 1 if signs of $\mathrm{TPR}_k$ and x agree
      = 0 if signs of $\mathrm{TPR}_k$ and x disagree.

Thus

$Y_{n+1} = \sum_{i=0}^{n} 2^{-i} y_i$.

A more detailed analysis shows that the arithmetic value of the quotient y is

$y = y_0 (-1 + 2^{-n}) + \sum_{i=1}^{n} 2^{-i} y_i$.

It can also be shown that the equation $xy + 2^{-n} r = r_0$ holds.
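The tidy division rules can be traced with exact fractions. The following sketch works on absolute values and attaches signs afterwards, whereas the machine operates directly on signed one's complement quantities, so it illustrates the acceptance rule and the defining equation rather than the register-level procedure. It assumes $|r_0| < 2|x|$ so that every quotient digit is 0 or 1; the name `tidy_divide` is hypothetical.

```python
from fractions import Fraction


def tidy_divide(r0, x, n):
    """Restoring division in n + 1 steps; returns (quotient, final remainder).

    The quotient is truncated and the remainder takes the sign of the dividend.
    """
    sign_r = -1 if r0 < 0 else 1
    sign_q = sign_r * (-1 if x < 0 else 1)
    r, d = abs(r0), abs(x)
    digits = []
    for k in range(n + 1):
        tpr = r - d                 # tentative partial remainder (in absolute value)
        if tpr >= 0:                # a TPR of zero is accepted
            r = tpr
            digits.append(1)
        else:                       # rejected: the partial remainder is kept as it was
            digits.append(0)
        if k < n:                   # no left shift in the last step
            r *= 2
    y = sum(Fraction(q, 2 ** i) for i, q in enumerate(digits))
    return sign_q * y, sign_r * r


# Dividend -104/256 and divisor -9/16 with n = 4.
r0, x, n = Fraction(-104, 256), Fraction(-9, 16), 4
y, r = tidy_divide(r0, x, n)
print(y, r)                                   # 11/16 -5/16
assert x * y + r / 2 ** n == r0               # xy + 2^-n r = r0
assert abs(r) < abs(x) and (r < 0) == (r0 < 0)
```

The quotient digits come out as 0.1011 and the remainder as -5/16, so multiplying back with the remainder as hold contents reproduces the dividend exactly.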
The final remainder r has the same sign as the dividend $r_0$ and satisfies $|r| < |x|$.

4.42 Division in Conventional Arithmetic Units

Tidy division requires that a TPR of zero be accepted unconditionally, since such a TPR represents the most successful attempt at subtraction of the absolute value of the divisor from the absolute value of the partial remainder. However, a TPR of zero will be accepted in an arithmetic unit using an adder only if the dividend, and hence each partial remainder, is negative, and similarly in an arithmetic unit using a subtractor only if the dividend is positive.

The following ruses can be used to force an acceptance of a TPR of zero, without recourse to a zero recognition circuit:

(A) In an additive arithmetic unit only negative dividends are used. A positive dividend is complemented before division.
(B) In a subtractive arithmetic unit only positive dividends are used. A negative dividend is complemented before division.
(C) In an additive-subtractive arithmetic unit the unit is set to add if the dividend is negative and set to subtract if the dividend is positive.

The complementations necessary to achieve tidy division in these arithmetic unit designs are listed in Table 4.2 below. We note that the divisor may have to be complemented to yield a positive number at the MCC output to permit simplifications of the extended section of the arithmetic unit. The effects of these conditional complementations of dividend and divisor can of course be offset by a complementation of the quotient digits as they are inserted.

A comparison of the designs discussed here with those discussed in connection with multiplication shows that (A) and (d), (B) and (e), (C) and (c) are compatible designs.
53 Tatle ^,2 Setting of Complementing Circuits to Achieve Tidy Division in Conventional Arithmetic Units (A) Arithmetic Unit with Adder (B) Arithmetic Unit with Subtractor Original Dividend + + Original Dividend + Original Divisor Original Divisor + + Machine Dividend Machine Dividend + + + + Machine Divisor + + + + Machine Divisor + + + + Quotient Digit Complemented Uncomplemented Uncomplemented Complemented Quotient Digit Uncomplemented Complemented Complemented Uncomplemented (C) Arithmetic Unit with Adder- Suh tractor Original Dividend + 4- Original Divisor + Setting of Adder- Subtractor + + Machine Divisor + + + + Quotient Digit Unc omp 1 eme nt e d Uncomplemented Uncomplemented Uncomplemented 4.43 Division in an Arithmetic Unit with Separate Carry or Borrow Storage Of the three designs discussed in connection with division in con- ventional arithmetic units only (C) is directly applicable to arithmetic units using unassimilated number representations, since (A) and (b) are based on complementations of unassimilated numbers. However, arithmetic units with separate carry or borrow storage were found to have a zero recognition feature. Therefore TPR's of zero can be recognized and accepted without need for facilities to complement the dividend. Hence, designs (a), (b), and (c) described in section ^.32 can be used for tidy division. In all cases, a partial assimilation to the sign digit is necessary to determine the sign of 5h the TPR. If assimilation fails the TPR is zero and is accepted unconditionally. 4.4^ An Example of Division We present in Figure ^,2 helov an example of a tidy division, per- formed in an arithmetic unit with separate carry storage of design (a). 
The stages to the left of the binary point are treated in accordance with the overflow considerations presented in Chapter 5.

It is interesting to note that a multiplication following a division, using the quotient as multiplier, the divisor as multiplicand, and the remainder as hold contents, will yield a product identical to the original dividend, whereas a division following a multiplication, using the product as dividend and the multiplicand as divisor, will yield a quotient identical to the multiplier and a remainder identical to the hold contents only if the hold contents was less than the multiplier in absolute value and had the same sign as the final product.

    R = 1.10010111 = -104/256    Dividend in A, C
    X = 1.01101111 = -9/16       Divisor in M
    Signs of dividend and divisor agree, hence add complement of X.

    [Step-by-step contents of A, C, M, S, B, and Q are not legible in the source.]

    Step outcomes, in order:
      1. +TPR, reject, shift left
      2. -TPR, accept, shift left
      3. +TPR, reject, shift left
      4. -TPR, accept, shift left
      5. -TPR, accept, shift S, B down, Q left

    Final contents:  A = 11.10101111   C = 0.00000000   Q = 0.1011

    Check:  $X \cdot Y + 2^{-4} R_4 = (-9/16)(11/16) + (1/16)(-5/16) = -104/256 = R_0$

Figure 4.2 Tidy Division in an Arithmetic Unit with Separate Carry Storage

5 ANALYSIS OF OVERFLOW

5.1 Overflow in Conventional Arithmetic Units

In conventional one's complement arithmetic units, negative numbers are generally expressed as complements with respect to $(2^{w} - 2^{-n})$, where $2^{-n}$ denotes the weight of the least significant digit in the representation and where $2^{w}$ is usually taken as the lowest power of two which permits a representation of all numbers within a given range, although clearly powers of two could be employed which are higher than necessary. In any case there will be $w$ digits to the left of the binary point. Since the arithmetic value of a number is related to its one's complement representation by the generalized formula

$$x = -(2^{w-1} - 2^{-n})\, x_{-w+1} + \sum_{i=-w+2}^{n} 2^{-i} x_i ,$$

choosing $w$ larger than necessary for a particular range will cause a particular digit, say $x_{-k}$, to be a "model" of all digits to the left of it, i.e.,

$$x_{-w+1} = x_{-w+2} = \cdots = x_{-k-1} = x_{-k} .$$

Thus, for numbers in the range $-1 < x < +1$, $x_0$ is a model of all digits to the left of $x_0$; for numbers in the range $-2 < x$ [...]

[...] results presented in Table 5.1 below. It is evident that the minimum contribution of the fractional parts of the representation is 0; the maximum contribution is $(1\,1/2 - 2^{-n})$ if $c_0 = 0$, and $(1 - 2^{-n})$ if $c_0 = 1$. If $a_{-1} = 1$, $2^{-n}$ is subtracted from the maximum.

Table 5.1 Range of Numbers in an Arithmetic Unit with Separate Carry Storage

    a_{-1}  a_0  c_0    Sign and Range of x       Type
      0      0    0     +   0 < x < 1 1/2          B
      0      0    1     +   1 ≤ x < 2              A
      0      1    0     +   1 < x < 2              A
                        -  -2 < x < -1 1/2
      0      1    1     -  -2 < x < -1             A
      1      0    0     -  -2 < x < -1/2           B
      1      0    1     -  -1 < x < 0              C
      1      1    0     -  -1 < x < 0              C
                        +   0 < x < 1/2
      1      1    1     +   0 < x < 1              C

Table 5.2 Range of Numbers in an Arithmetic Unit with Separate Borrow Storage

    a_{-1}  a_0  c_0    Sign and Range of x       Type
      0      0    0     -  -1/2 < x < 0            C
                        +   0 ≤ x < 1
      0      0    1     -  -1 < x < 0              C
      0      1    0     +   1/2 < x < 2            B
      0      1    1     +   0 < x < 1              C
      1      0    0     +   1 1/2 < x < 2          A
                        -  -2 < x < -1
      1      0    1     +   1 < x < 2              A
      1      1    0     -  -1 1/2 < x < 0          B
      1      1    1     -  -2 < x < -1             A

Table 5.3 Range of Numbers in an Arithmetic Unit with Separate Carry-Borrow Storage

    [Table body not legible in the source.]

[...]

Let us first find the cases for which $a'_{-2}$ and $a'_{-3}$ differ:

$$a'_{-2} \oplus a'_{-3} = s_{-1} \oplus s_0 b_0 \oplus s_{-1} \oplus s_{-1} s_0 b_0 = \bar{s}_{-1} s_0 b_0 .$$

However, $\bar{s}_{-1} s_0 b_0 = 1$ indicates an operand of Type A (unassimilated overflow) and need not be considered further. Hence, $a'_{-2}$ can be taken as model digit. Now $a'_{-1}$ and $a'_{-2}$ differ in the following cases:

$$a'_{-1} \oplus a'_{-2} = s_0 \oplus b_0 \oplus s_{-1} \oplus s_0 b_0 = s_{-1} \bar{s}_0 \bar{b}_0 \vee \bar{s}_{-1}(s_0 \vee b_0) .$$

The last term, i.e. $\bar{s}_{-1}(s_0 \vee b_0)$, again indicates operands of Type A and need not be considered further.

The case $s_{-1} \bar{s}_0 \bar{b}_0 = 1$ corresponds to an operand $x$ in the range $-2 < x < -1/2$. After the left shift, the quantity $2x$ is in the range $-4 < 2x < -1$, and overflow has occurred. This case must be detected.

Therefore, $a'_{-1}$ is properly the model digit for the $a'_i$'s if the special case $s_{-1} \bar{s}_0 \bar{b}_0 = 1$ is not allowed to occur. The shifted results can be classified as a Type A or a Type BC number by Table 5.1 on the basis of the digits $s'_{-1}$, $s'_0$, and $b'_0$.

Now it should be true that the arithmetic value of the number after the shift is twice that before the shift, i.e.

$$[\ldots] = 2\Big[-(2 - 2^{-n})\, s_{-1} + (s_0 + b_0) + \sum_{j=1}^{n} 2^{-j}(s_j + b_j)\Big] \pmod 4 .$$

Now we have that [...]

[...]

A partial assimilation of the $b_i$'s to the left of $b_0$ yields

$$s'_{-1} = s_{-1} \oplus b_{-1} = a_{-1} \oplus (a_0 \bar{m}_0 k \vee \bar{a}_0 m_0 \bar{k})$$

$$b'_{-1} = 0 = b'_{-i}, \quad i > 1$$

$$s'_{-2} = s_{-2} \oplus (b_{-2} \vee s_{-1} b_{-1}) = a_{-1}(\bar{a}_0 \vee m_0 \vee \bar{k}) \vee \bar{a}_0 m_0 \bar{k} = s'_{-i}, \quad i > 2 .$$

Let us find the cases in which $s'_{-1}$ and $s'_{-2}$ differ:

$$s'_{-1} \oplus s'_{-2} = \bar{a}_{-1} a_0 \bar{m}_0 k \oplus a_{-1} \bar{a}_0 m_0 \bar{k} .$$

The expression $\bar{a}_{-1} a_0 \bar{m}_0 k$ corresponds to a Type A operand and need not be considered further.

The expression $a_{-1} \bar{a}_0 m_0 \bar{k}$ indicates an augend in the range $-2 < x < -1/2$, since $k = 0 = a_1 m_1 \vee c_0$ implies that $c_0 = 0$ and we have that $a_{-1} = 1$, $a_0 = 0$. Since $m_0 = 1$, the addend is in the range $-1 < m < 0$.
$k = a_1 m_1 \vee c_0 = 0$ also implies that $a_1 m_1 = 0$. Therefore, $a_1 = 0$, or $m_1 = 0$, or both.

(i) $a_1 = 0$. This restricts the augend to the range $-2 < x < -1$. Hence the sum is in the range $-3 < (x + m) < -1$ and overflow has occurred.

(ii) $m_1 = 0$. This restricts the addend to the range $-1 < m < -1/2$. Hence the sum is in the range $-3 < (x + m) < -1$ and overflow has occurred.

Detection of overflow during addition or subtraction therefore consists of sensing the special case $a_{-1} \bar{a}_0 m_0 \bar{k} = 1$ before the process and checking the result for Type A numbers. The end-around corrections are given by

    [Equations not legible in the source.]

5.34 Overflow during a Multiplication Step (Addition or Subtraction followed by Right Shift)

Each step of a multiplication can be described by the formula

$$p_{k+1} = \tfrac{1}{2}\,(p_k + f_k x)$$

where
    $x$ = multiplicand in M
    $p_k$ = partial product in A and C
    $f_k$ = recoded multiplier digit having one of the values +1, 0, -1.

Now $p_k$ and $p_{k+1}$ are held in unassimilated form in A and C. The quantity $(p_k + f_k x)$ in S and B is permitted to exceed range, but $p_{k+1}$ is correctly represented if the digit $a_{-1}$ is correctly inserted during the right shift. This will be true if $a_{-1} = s'_{-2}$ in the notation of the previous section. However, $s'_{-1} = s'_{-2}$ in all cases except $a_{-1} \bar{a}_0 m_0 \bar{k} = 1$, and $s'_{-1}$ can be inserted into $a_{-1}$ as well as $a_0$ in those cases. The special case $a_{-1} \bar{a}_0 m_0 \bar{k} = 1$, which need not be eliminated here since overflow is allowed to occur temporarily, can be sensed as in addition, but then addition can be permitted to proceed. Then, after the right shift, $a_{-1}$ is set to $a_{-1} = 1$ instead of $a_{-1} = s'_{-1}$. Alternately, $s'_{-2}$ could be generated for all cases and inserted into $a_{-1}$ during the right shift.
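The multiplication recursion above can be checked numerically. The following is a minimal sketch using exact rational arithmetic rather than register contents; the function name `multiply_steps` and its parameters are illustrative and do not appear in the report. The operands are those of the check equation under Figure 4.2 (multiplicand -9/16, multiplier 11/16, hold contents -5/16). The sketch tracks arithmetic values only and makes no attempt to model the unassimilated digits or the insertion of the model digit.

```python
from fractions import Fraction

def multiply_steps(x, digits, hold=Fraction(0)):
    """Iterate p_{k+1} = (p_k + f_k * x) / 2 over recoded multiplier digits.

    digits lists the f_k least significant first; each must be +1, 0, or -1.
    hold is the initial partial product p_0 (the "hold contents").
    Returns the final product and the list of pre-shift sums p_k + f_k * x.
    """
    p = Fraction(hold)
    sums = []
    for f in digits:
        if f not in (-1, 0, 1):
            raise ValueError("recoded digits must be +1, 0, or -1")
        t = p + f * x      # quantity formed before the shift; may leave -1 < t < 1
        sums.append(t)
        p = t / 2          # the right shift halves the sum
    return p, sums

x = Fraction(-9, 16)       # multiplicand
hold = Fraction(-5, 16)    # hold contents
plain = [1, 1, 0, 1]       # multiplier 0.1011 = 11/16, least significant digit first
recoded = [-1, 0, 1, 1]    # same multiplier written as 1/2 + 1/4 - 1/16

product, sums = multiply_steps(x, plain, hold)
product_recoded, _ = multiply_steps(x, recoded, hold)
print(product, product_recoded, sums)
```

Both digit sequences give -13/32 = -104/256, the original dividend of Figure 4.2. With the plain digits the second pre-shift sum equals -1, which lies outside -1 < x < 1 and is brought back into range by the right shift.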
It can be shown that even if $s'_{-2}$ is generated, the end-around carry remains

    [Equation not legible in the source.]

5.35 Overflow during a Division Step (Left Shift followed by Addition or Subtraction)

Each step of a restoring division can be described by the formula

$$r_{k+1} = 2\,(r_k \pm x \cdot T_k)$$

where
    $x$ = divisor in M
    $r_k$ = partial remainder in A and C
    $T_k$ = acceptance factor.

The choice between additions and subtractions in this recursion relationship is made initially such that [...]. The acceptance factor is defined by the division rules as

    $T_k = 1$ if $|r_k| \ge |x|$
    $T_k = 0$ if $|r_k| < |x|$.

This causes all partial remainders to have the same sign. The dividend $r_0$ is restricted to be less in absolute value than the divisor $x$, i.e. $|r_0| < |x| < 1$, to assure that the quotient $y$ is in range. Thus,

$$|r_1| = 2\,(|r_0| - |x| \cdot T_0) = 2\,|r_0| < 2\,|x| , \quad \text{since } |r_0| < |x| .$$

Let us assume that $|r_k|$ [...] violating $|r_k| < 2|x| \le 2$.

(ii) $c_0 = 0$, $a_1 = m_1 = 1$ together with $a_{-1} = 0$, $a_0 = 1$ imply that the range of $r_k$ is $1\,1/2 < r_k$ [...]

[...]

5.4 Summary

A number is represented in an arithmetic unit with separate carry or borrow storage by two sets of binary digits, $a_{-1}, a_0, a_1, \ldots, a_n$ and $c_{-1}, c_0, c_1, \ldots, c_n$, such that $c_{-1} = 0$ and $a_{-1}$ are models of all digits of higher significance.

It is evident that the unassimilated representation presents an uncertainty either about the range, or about the sign of the number in some of the cases. An assimilation could of course remove this uncertainty.

Due to the sign uncertainty, arithmetic methods which are independent of signs are preferable. Otherwise frequent assimilations are necessary to determine the signs.

Due to the range uncertainty, numbers are separated into three types on the basis of the unassimilated digits:

    Type A: Numbers which definitely have exceeded range.
    Type B: Numbers which may or may not have exceeded range.
    Type C: Numbers which definitely have not exceeded range.
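The three types can be illustrated for the separate carry storage case by enumerating the values compatible with the three leading digits. The sketch below is not taken from the report; it rests on several assumptions read from Chapter 5: the value is $-(2 - 2^{-n})a_{-1} + a_0 + c_0$ plus the fractional contribution, taken modulo 4; the fractional contribution lies between 0 and $1\,1/2 - 2^{-n}$ (for $c_0 = 0$) or $1 - 2^{-n}$ (for $c_0 = 1$), less $2^{-n}$ when $a_{-1} = 1$; and "in range" means $-1 < x < 1$. The names `classify`, `a_m1`, and `n` are hypothetical.

```python
from itertools import product

def classify(a_m1, a_0, c_0, n=8):
    """Return 'A', 'B', or 'C' for the leading digits a_{-1}, a_0, c_0.

    All quantities are integers in units of 2**-n, so the arithmetic is exact.
    """
    unit = 1 << n                                    # the integer standing for 1
    base = -(2 * unit - 1) * a_m1 + (a_0 + c_0) * unit
    # Assumed upper limit of the fractional contribution (lower limit is 0).
    frac_max = (3 * unit // 2 - 1) if c_0 == 0 else (unit - 1)
    if a_m1:
        frac_max -= 1
    inside = outside = False
    for k in range(base, base + frac_max + 1):
        wrapped = (k + 2 * unit) % (4 * unit) - 2 * unit   # reduce modulo 4 into [-2, 2)
        if -unit < wrapped < unit:
            inside = True                            # some compatible value is in range
        else:
            outside = True                           # some compatible value is out of range
    if inside and outside:
        return "B"
    return "C" if inside else "A"

# Digit combinations in the order 000, 001, ..., 111.
types = [classify(*bits) for bits in product((0, 1), repeat=3)]
print(types)
```

Under these assumptions the eight digit combinations come out as B, A, A, A, B, C, C, C, so only two of them leave the question of overflow open until an assimilation is performed.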
Whether or not overflow has occurred for a Type B number can only be decided by an assimilation.

It has been shown that arithmetic operations with Type B or Type C operands yield results which are either Type B or Type C with model digits of the same significance as those of the operands, or are numbers which are evidently out of range since they are Type A numbers or since the model digits do not have the same significance as before. Sequences of arithmetic operations therefore do not require assimilations solely for overflow detection purposes. The final result of course has to be checked for overflow after assimilation in a conventional manner.

The above result applies also to the repetitive steps of multiplication and division in which overflow is permitted to occur temporarily. The case of overflow in the quotient caused by improper division is treated separately.

6 CONCLUSIONS

Conventional parallel arithmetic units have been built using several systems of number representation. Many schemes have been devised to make these units as fast as possible. It is true in all cases that the ultimate speed of the arithmetic operations in these units is limited by the time required to let carries or borrows propagate through the registers for each operation.

A logical step towards higher operation speeds is the use of separate carry or borrow storage such that numbers in the arithmetic unit are expressed as the sum or difference of the contents of two registers whenever possible. The numbers are assimilated to the conventional one-register representation only when transfers to other parts of the computer are necessary or when the sign of the number needs to be known.

This investigation was undertaken to study the effects of the use of the one's complement notation on the design of an arithmetic unit with separate carry or borrow storage.
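The idea of holding a number as the sum of two registers can be sketched with a generic carry-save step. This is not the quasi-adder of Chapter 3, whose gating is not reproduced here; it is a textbook arrangement with an end-around carry, operating on plain integers, and the names `quasi_add`, `assimilate`, and `width` are illustrative. The sketch also does not model the zero representation which, according to the report, fails to assimilate in the units studied.

```python
def quasi_add(a, c, m, width):
    """One carry-save step on width-bit words with an end-around carry.

    Returns (s, b) with s + b congruent to a + c + m modulo 2**width - 1.
    Each carry moves exactly one position; none propagates further.
    """
    mask = (1 << width) - 1
    s = a ^ c ^ m                                  # sum digit of every position
    carries = (a & c) | (a & m) | (c & m)          # carry generated in every position
    # Move carries one place to the left; the top carry re-enters at the bottom.
    b = ((carries << 1) | (carries >> (width - 1))) & mask
    return s, b

def assimilate(a, c, width):
    """Repeat the step with a zero addend until the carry register is empty.

    Every pass that generates a carry lowers the total number of 1 digits
    held in the two registers, so the loop terminates.
    """
    while c:
        a, c = quasi_add(a, c, 0, width)
    return a

width = 8
a, c = 0, 0
addends = [0x5A, 0x33, 0xC7]                       # 90 + 51 + 199 = 340
for m in addends:
    a, c = quasi_add(a, c, m, width)               # no carry propagation per addend
result = assimilate(a, c, width)
print(result)
```

Three addends are accumulated without any carry propagation, and a single assimilation at the end yields 85, which is 340 reduced modulo 255, the modulus of an 8-digit word with end-around carry.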
Although it was possible to design such a unit, it was found that the unit would be complex and hence rather expensive to build, if advantage were to be taken of some speed-improving schemes. The reasons for this complexity of design can all be traced to the necessity for end-around corrections inherent in the one's complement notation.

Although the prima facie results of this investigation indicate that the arithmetic units described here have little practical value, several interesting theoretical properties were discovered.

It was found possible to design not only an arithmetic unit using a quasi-adder with separate carry storage, but also an arithmetic unit using a quasi-subtractor with separate borrow storage and an arithmetic unit using a quasi-adder-subtractor with separate but coincident carry-borrow storage. (The last mentioned two types of arithmetic units are of course also usable, without facilities for end-around corrections, in computers using the two's complement notation.)

The arithmetic units with separate carry or borrow storage have the feature that a balance of zero has a unique unassimilated representation which fails to assimilate and therefore must be sensed. It is conjectured that the same holds for arithmetic units with coincident carry-borrow storage. Zero-sensing, however, is useful in a division scheme yielding a truncated quotient and corresponding remainder such that the remainder is of the same sign as the dividend but less in absolute value than the divisor.

It is also interesting to note that the easy complementation feature of the one's complement notation, one of the strongest arguments for use of that notation in conventional arithmetic units, is lost in the separate carry or borrow storage scheme. It is conjectured, however, that complementation is readily possible in the separate but coincident carry-borrow storage scheme.
In fact, the latter scheme represents a much more efficient use of the two registers used to store a number in unassimilated form, from an information-theoretical viewpoint. Speculations about the use of unassimilated representations throughout the computer indicate, e.g., convenient roundoff methods.

One may consider the conventional arithmetic unit, in which carries are automatically propagated through all stages for each operation and no storage is provided for carries, and the arithmetic unit with separate carry storage, in which carries are propagated automatically over one digital position and storage for carries is provided in each digital position, as designs which form the two extremes of a series of designs in which carries are propagated automatically over, say, k positions and storage for carries is provided every k-th position. (The same reasoning applies, of course, to borrows and carry-borrows.)

It is evident that an interrelation exists between the complexity of the arithmetic unit and the amount of time saved. This relation is, in the opinion of the author, very non-linear and depends to a large extent on the statistical properties of carries, borrows, and carry-borrows. Further investigation along these lines is warranted.