Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/theoryimplementa230atki 
 

 
 Report No. 230 
 
 
 COO-1018-1115 
 
 THE THEORY AND IMPLEMENTATION OF SRT DIVISION 
 
 by 
 Daniel E. Atkins III 
 
 June 1, 1967 
 
 Department of Computer Science 
 University of Illinois 
Urbana, Illinois 61801

*This work was submitted in partial fulfillment of the requirements for the degree
of Master of Science in Electrical Engineering, June 1967, and was supported in
part by the AEC under Contract No. USAEC AT(11-1)1018.
 
ACKNOWLEDGEMENT 
 
 I wish to thank Professor S. R. Ray for his most helpful 
 advice and assistance in the preparation of this report. I also 
 thank Professor J. E. Robertson for the enlightening discussions 
 concerning the material in Chapter 2. 
 
 I further acknowledge and thank Mr. Richard Borovec for 
 his discussions concerning the cost determinations (Section 2.6), 
 Mrs. L. A. Prendergast and Mr. Ronald C. Morrison for the 
 drawings, and Mrs. Anita Worthington for the typing of the final 
 draft. 
 
 
TABLE OF CONTENTS

1. INTRODUCTION

2. THE THEORY OF SRT DIVISION
   2.0 Introduction
   2.1 The Recursive Relationship
   2.2 The Representation of Quotient Digits
   2.3 Range Restrictions
   2.4 Redundancy in the Quotient Representation
   2.5 The P-D Plot
   2.6 The Cost of Quotient Digit Selection
       2.6.1 General
       2.6.2 Cost Determination for an Arithmetic Model
       2.6.3 Cost Determination for a Table Look-Up Model
   2.7 Quotient Conversion

3. IMPLEMENTATION OF SRT DIVISION
   3.0 Introduction
   3.1 General Considerations for Implementation
       3.1.1 Relative Occurrence of Division
       3.1.2 Acceleration of Division
       3.1.3 Compatibility of Division with the Multiplication Scheme
   3.2 A High-Speed Multiplication Scheme
       3.2.1 Notation
       3.2.2 Description and Operation
   3.3 Design of Division Scheme
       3.3.1 General
       3.3.2 An Arithmetic Model
       3.3.3 A Table Look-Up Model
   3.4 Estimate of Speed of Execution

4. SUMMARY AND CONCLUSION
   4.1 Summary
   4.2 Conclusion

LIST OF REFERENCES
 
1. INTRODUCTION 
 
Perhaps the major complication associated with digital division
is best illustrated by your performing the following long-division
problem and noting carefully the steps you follow.
 
 396 A 
 
 1057 6 2 1 
 
 A 1 A 2 A 3 
 
 A = decimal point marker 
 
Your operations in selecting the first quotient digit are
summarized in the flow chart of Figure 1. The salient point is that
 division is a trial and error process requiring an initial "guess" of 
 a quotient digit followed by a subtraction, or at least a comparison, to 
 determine whether the guess is correct. If it is not, the initial 
 choice is modified and the process repeated. It is the trial and error 
 nature of division, whether performed by man or machine, which complicates 
 its execution. In building a computer arithmetic unit, division is the 
 most difficult basic operation to implement efficiently. 
 
But despite the complexity, the literature is replete with
themes and variations for implementing digital division. Flores [1]*,
for example, states four methods for increasing speed of division and
then proceeds to describe no fewer than twenty-four schemes which
incorporate some or all of these speed-up techniques. MacSorley [2]
describes four division techniques demanding various divisor multiples
to accelerate execution.

*Numbers in brackets refer to the corresponding entry under References.
 
FIGURE 1. FLOWCHART OF MANUAL EXECUTION OF DIVISION
(Legend: j = index; d = divisor; $p_j$ = partial remainder; $p_0$ = dividend; $q_j$ = quotient digit.)
 
There is far less in the literature, however, describing
theory and analytic tools to be used in designing a division scheme.
Most of the articles describe schemes which are products more of art
than of science. This report is an attempt to contribute to the
science of computer arithmetic implementation.

This report describes a class of division techniques especially
suited for implementation in an electronic digital computer. For
historic reasons, this class will be referred to as SRT division. The
name is derived from the fact that the binary case of this type of
division was discovered independently, at about the same time, by
Dura Sweeney of IBM, J. E. Robertson of the University of Illinois,
and T. D. Tocher of Imperial College, London [3]. The paper, however,
incorporates more recent work, due exclusively to Professor Robertson,
which extends the binary SRT division to a radix higher than two.
Much of Chapter 2 is based upon his report [5] and upon numerous
personal communications.

After a description of the theory and properties of SRT
division, the report turns to the problem of actually implementing
the scheme and presents an example of one possible realization.
 
2. THE THEORY OF SRT DIVISION 
 
2.0 Introduction

This chapter introduces a recursive relationship for describing
division and from it develops the nature of SRT division.
The discussion is augmented with two graphical representations: one
to determine the range restrictions associated with SRT, and the other
to aid in computing the "cost" of quotient digit selection.

Most of the following analysis will be developed for a
general radix, r. At first this generality may appear superfluous, for
after all, isn't a digital computer a binary machine, and doesn't binary
imply radix two? It is true that the basic storage elements of a
digital computer are two-state devices and that numbers are represented
internally by strings of "1's" and "0's". Computer arithmetic, however,
is often facilitated by considering groups of bits rather than each bit
individually. Such grouping may be interpreted as use of digits of
higher radix than two. For example, a pair of bits becomes one radix
four digit; a trio of bits, a radix eight (octal) digit.

In the literature of arithmetic unit design, one finds references
to such techniques as inspection of bits "two at a time," or
perhaps "generation of several quotient bits simultaneously". In
this report such techniques would be described in terms of higher radix
arithmetic.
 
2.1 The Recursive Relationship

Digital division as implemented in an electronic computer
consists of preliminary operations, i.e., normalization; a recursive
process; and a terminal operation, i.e., changing the form of the
remainder. Although preliminary and terminal operations vary from
machine to machine, they generally consume much less of the execution
time than the recursive operations. For restoring, non-restoring, and
the SRT division scheme to be described in this report, this recursive
relationship is defined by

    $p_{j+1} = r p_j - q_{j+1} d$                                  (2.1.1)

where the symbols are defined as follows:

    j = the recursive index = 0, 1, ..., m-1
    $p_j$ = the partial remainder used in the j-th cycle
    $p_0$ = the dividend
    $p_m$ = the remainder
    $q_j$ = the j-th quotient digit, in which the quotient is of the form
            $q_0 \wedge q_1 q_2 \cdots q_m$, with $\wedge$ marking the radix point
    m = the number of digits, radix r, in the quotient
    d = the divisor
    r = the radix
 
This relationship and the symbols as defined will be used 
 throughout this report. The relationship is used specifically in the 
 development of range restrictions on the partial remainders in Section 
 2.3. 
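The recursion (2.1.1) can be traced numerically. The following is a minimal sketch, assuming conventional restoring digit selection (digits 0 to r-1), a dividend satisfying 0 <= p0 < d, and exact rational arithmetic; the name `restoring_divide` is illustrative and does not come from the report.

```python
from fractions import Fraction

def restoring_divide(p0, d, r, m):
    """Apply p_{j+1} = r*p_j - q_{j+1}*d for m cycles (assumes 0 <= p0 < d)."""
    p, d = Fraction(p0), Fraction(d)
    digits = []
    for _ in range(m):
        shifted = r * p          # multiply by the radix: shift left one digit
        q = int(shifted // d)    # restoring choice: largest q with q*d <= shifted
        p = shifted - q * d      # next partial remainder, 0 <= p < d
        digits.append(q)
    return digits, p

# Hypothetical example: 1/3 divided by 1/2 in radix 2, four quotient digits.
digits, remainder = restoring_divide(Fraction(1, 3), Fraction(1, 2), r=2, m=4)

# The recursion preserves p0 = d * sum(q_j * r**-j) + p_m * r**-m.
value = sum(Fraction(q, 2 ** j) for j, q in enumerate(digits, start=1))
assert Fraction(1, 2) * value + remainder / 2 ** 4 == Fraction(1, 3)
```

The identity checked in the last line holds for any digit selection rule, which is why the same recursion also serves non-restoring and SRT division; only the rule for choosing each digit changes.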
 
Although not germane to the theory of SRT division, it is
interesting to note in passing that this relation points to possibilities
for accelerating the execution of division. Verbally, the equation says
that each partial remainder must be multiplied by the radix ($r p_j$), i.e.
shifted left one digital position, and that the selected quotient digit
must then be multiplied by the divisor ($q_{j+1} d$) and subtracted from this
shifted partial remainder. The division process will thus be accelerated
if the shift and/or the subtraction time is decreased. In practice, all
values of $q_{j+1} d$ are stored in registers or are readily available via
shift gates from the register containing the divisor. The rapid formation
of $q_{j+1} d$ thus reduces to minimizing the necessity for forming
awkward multiples requiring an addition, and to accelerating the selection
of $q_{j+1} d$ at the divisor input to the adder/subtractor.

Secondly, note that the recursive index, j, is implicitly an
inverse function of the radix. When actually implemented on a machine,
digits of a higher radix than two are represented by two or more binary
bits. A string of $\ell$ binary digits (bits) is equivalent to $\ell/2$ radix
four digits. In general, to $\ell$ bits of radix two there correspond

    $m = \frac{\ell}{\log_2 r}$

digits of radix r, where for practical cases $r = 2^n$,
n = integer > 0. Thus to produce a quotient of given precision, the
number of iterations required and, concomitantly, the execution time
are decreased as the radix is increased.
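As a rough illustration of the relation m = ℓ / log2 r, the sketch below counts the radix-r digits (and hence recursion cycles) needed for a given number of quotient bits. The function name `radix_digits` and the 48-bit word length are assumptions chosen for the example; rounding up when the bit count is not a multiple of log2 r is also an assumption, since the report considers only the exact case.

```python
def radix_digits(bits, r):
    """Number of radix-r digits equivalent to `bits` binary digits, for r = 2**n."""
    n = r.bit_length() - 1            # n = log2(r) when r is a power of two
    if n < 1 or (1 << n) != r:
        raise ValueError("this sketch assumes r = 2**n with n >= 1")
    return -(-bits // n)              # integer ceiling of bits / n

# Cycles needed for a hypothetical 48-bit quotient at several radices.
for radix in (2, 4, 16, 256):
    print(radix, radix_digits(48, radix))
```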
 
2.2 The Representation of Quotient Digits

As noted in the last section, the use of a higher radix reduces
the number of cycles required to perform a division of given precision.
The implementation of such a scheme may, however, be costly, and costlier
still if quotient digits are represented as they are in manual methods or
machine restoring division. In these cases quotient digits have the
values 0, 1, 2, ..., r-1. With the radix, r, equal to four, the possible
digit values are 0, 1, 2, and 3. A radix four restoring division therefore
requires that multiples of 1, 2, and 3 times the divisor be available
for subtraction from the partial remainder. The 1 times multiple is of course
readily available, and the 2 times multiple is formed merely by shifting left one
binary position; the 3 times multiple, however, requires extra time
and/or hardware. It may be formed by a tripler circuit or by addition
of 1 times and 2 times the divisor, which is then stored in an auxiliary
register. For radix eight, multiples of 3, 5, and 7 times the divisor
must be computed and stored.
 
With SRT division the problem of forming divisor multiples is
mitigated by using both plus and minus quotient digit values. The
quotient digits are of the form -n, -(n-1), ..., -1, 0, 1, ..., n, where
n is an integer such that $\frac{1}{2}(r-1) \le n \le r-1$. Within this range the
actual choice of n for a given r is largely a function of design details.
The choice is considered further in Section 2.6.

The necessity for the range restriction is as follows. At
least r unique digits are required to represent a number, radix r. In
the representation introduced above, there are 2n+1 unique digits,
thus the requirement $2n+1 \ge r$. On the other hand, for radix r, the
maximum value of a quotient digit, n, should not be greater than the
value of the maximum digit representable, thus $n \le r-1$. Combining these
two inequalities yields the restriction stated above.
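The restriction on n is easy to tabulate. The sketch below, with illustrative function names that are not from the report, lists the admissible values of n for a radix and builds the corresponding signed digit set.

```python
def allowed_n(r):
    """Values of n satisfying (r-1)/2 <= n <= r-1, i.e. 2n+1 >= r and n <= r-1."""
    return [n for n in range(1, r) if 2 * n + 1 >= r]

def quotient_digit_set(r, n):
    """Signed digits -n..n, after checking the range restriction on n."""
    if n not in allowed_n(r):
        raise ValueError(f"n={n} violates (r-1)/2 <= n <= r-1 for r={r}")
    return list(range(-n, n + 1))

print(allowed_n(4))              # admissible n for radix four
print(quotient_digit_set(4, 2))  # the digit set used in the radix four examples
```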
 
With plus and minus quotient digits, a higher radix division
may be implemented with fewer awkward multiples of the divisor. Now
the quotient digits for a radix 4 division are -2, -1, 0, +1, +2. All
the necessary multiples of the divisor may be formed by shifting and
complementation and require no auxiliary registers.

The second, but probably more significant, consequence of this
representation of quotient digits is that it introduces redundancy into
the representation of the quotient. If $2n > r-1$, then there are more
symbols available to represent a number than actually necessary.
Some numerical values may therefore be represented in more than one
form. For example, with r = 4, n = 2, and with an overbar representing negation,
the number 6 could be represented as 12 or as $2\bar{2}$. As explained in the
next sections, this redundancy permits less precision in comparing the
divisor and partial remainder in selecting a quotient digit. This
statement seems intuitively correct since without redundancy, each
quotient digit may be represented only one way and thus must be selected
precisely. With redundancy, the quotient digit, and thus the
comparison of divisor and partial remainder, need not be precise.
This non-unique representation does, however, complicate the division
in that the redundant form must eventually be converted to a conventional
representation.
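The redundancy can be made concrete by enumerating every signed-digit string of a given length that evaluates to a chosen number. This is a brute-force sketch for illustration only; the function name `signed_digit_forms` is hypothetical.

```python
from itertools import product

def signed_digit_forms(value, r, n, width):
    """All `width`-digit strings over -n..n (most significant first) equal to `value`."""
    forms = []
    for combo in product(range(-n, n + 1), repeat=width):
        total = 0
        for digit in combo:
            total = total * r + digit   # Horner evaluation in radix r
        if total == value:
            forms.append(combo)
    return forms

# The number 6 with r = 4, n = 2: the forms 12 and 2(-2) mentioned in the text.
print(signed_digit_forms(6, r=4, n=2, width=2))
```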
 
2.3 Range Restrictions

With the quotient representation now defined, consider the
derivation of range restrictions on the partial remainders. Recall
from the manual execution of a division that in determining whether a
quotient digit is correct or not, one is essentially applying the
restriction that $0 \le p_{j+1} < d$, where $p_{j+1}$ is the result of the
subtraction of $q_{j+1}$ times the divisor from the j-th shifted partial remainder. If
$p_{j+1}$ is not within this range then $q_{j+1}$ is changed until it is. For
non-restoring division, negative partial remainders and negative quotient
digits are allowable, thus the range restriction is $|p_{j+1}| \le |d|$. It
seems reasonable, therefore, to hypothesize other division techniques
for which $|p_{j+1}| \le k|d|$, and which utilize the quotient digit
representation introduced in the last section. The upper limit on k will be 1.
The lower limit, although not yet obvious, is 1/2, thus $1/2 \le k \le 1$.

To show that this is in fact the case, first reconsider the
recursive relationship described in Section 2.1 and restated below.

    $p_{j+1} = r p_j - q_{j+1} d$                                  (2.3.1)

After $p_{j+1}$ is formed on the j-th cycle, it is multiplied by
the radix r (shifted left); j is increased by one and it becomes $r p_j$ of
the present cycle. Since $|p_{j+1}| \le k|d|$, it follows that $p_j$ must obey the
same restriction, i.e.

    $r|p_j| \le rk|d|$                                             (2.3.2)

Substituting (2.3.1) into the restriction $|p_{j+1}| \le k|d|$ yields, for a positive divisor,

    $-kd \le r p_j - q_{j+1} d \le kd$                             (2.3.3)
 
At this point the divisor is assumed to be normalized, i.e.,
restricted to the range $1/2 \le d < 1$. Furthermore, (2.3.1) is normalized
with respect to the divisor and rewritten letting $z_j = p_j/d$ and
$z_{j+1} = p_{j+1}/d$:

    $z_{j+1} = r z_j - q_{j+1}$                                    (2.3.4)

Equation (2.3.4) may be interpreted graphically as a plot of
$z_{j+1}$ versus $r z_j$ with the quotient digit, $q_{j+1}$, as a parameter. Such a
representation shall be called a z - z plot. Recall that the quotient
digits assume values -n, -(n-1), ..., -1, 0, +1, ..., n. Figure 2 is
such a graph. To facilitate discussion, each plot corresponding to a
different quotient digit is called a q-line.
 
The goal of this section is to demonstrate that a correct
division procedure exists which incorporates the above range restrictions
and quotient representation. This existence is substantiated
if for each value of $r z_j$ in the allowed range there corresponds a
quotient digit and a $z_{j+1}$, also in their allowed ranges. In terms of
Figure 2, this means that for any point on the $r z_j$ axis such that
$-rk \le r z_j \le rk$, one must be able to move on a line segment normal to
the $r z_j$ axis and intersect a q-line at a point corresponding to a
$z_{j+1}$ within the range $-k \le z_{j+1} \le k$. This allowed range is enclosed
between the lines $z_{j+1} = k$ and $z_{j+1} = -k$ in Figure 2.
 
FIGURE 2. z - z PLOT OF THE DIVISION PROCEDURE
 
To satisfy the foregoing requirements, the maximum value of
$r z_j$, i.e. rk, must occur at the intersection of $z_{j+1} = k$ and the q-line
$z_{j+1} = r z_j - n$. Similarly, the minimum value must occur at the
intersection of $z_{j+1} = -k$ and the q-line $z_{j+1} = r z_j + n$. These bounds on
$r z_j$ are indicated by the dashed vertical lines of Figure 2.

Figure 2 now points to the value of k in terms of r and n.
At the upper right vertex of the bounding rectangle, $z_{j+1} = k = r z_j - n$.
But since $r z_j = rk$, it follows that $k = rk - n$, or

    $k = \frac{n}{r-1}$                                            (2.3.5)

The division is now characterized by tangible parameters, namely the
radix and the maximum value of quotient digits. Combining (2.3.5)
with the restriction on n, $\frac{r-1}{2} \le n \le r-1$, verifies the statement
at the beginning of this section, $1/2 \le k \le 1$.
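The existence argument can be spot-checked numerically. The sketch below, with illustrative names and an arbitrarily chosen sampling grid, computes k = n/(r-1) and confirms that every sampled value of rz_j in [-rk, rk] admits at least one digit that keeps z_{j+1} within [-k, k]. Sampling is a sanity check, not a proof.

```python
from fractions import Fraction

def range_factor(r, n):
    """k = n/(r-1), equation (2.3.5)."""
    return Fraction(n, r - 1)

def valid_digits_z(rz, r, n):
    """Digits q in -n..n for which z_{j+1} = rz - q stays within [-k, k]."""
    k = range_factor(r, n)
    return [q for q in range(-n, n + 1) if -k <= rz - q <= k]

r, n = 4, 2
k = range_factor(r, n)
steps = 64  # grid resolution, an arbitrary choice for this check
samples = [-r * k + 2 * r * k * s / steps for s in range(steps + 1)]
assert all(valid_digits_z(rz, r, n) for rz in samples)

# With n too small (n = 1 for r = 4) some points have no valid digit at all.
print(valid_digits_z(Fraction(1, 2), 4, 1))
```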
 
2.4 Redundancy in the Quotient Representation

Section 2.2 indicated that the quotient digit representation
of SRT division introduces redundancy into the quotient. This fact is
also manifested in Figure 2 in the regions on the $r z_j$ axis for which
either one of two q-lines may be legitimately selected. For example,
at point A one may move vertically upward to the $q_{j+1} = 0$ line or
downward to the $q_{j+1} = +1$ line. In either case the quotient digit is
correct. Figure 3, a specific case of Figure 2, testifies to the fact
that this freedom of choice is not merely the result of an inaccurately
drawn graph. Here r = 4, n = 2. The vertical dashed lines define the
overlap regions.
 
FIGURE 3. z - z PLOT WITH r = 4, n = 2
 
The production of a redundant quotient requires extra hardware,
and perhaps time, to convert it to a conventional binary representation
acceptable to programmers and other sections of a machine.
This conversion is discussed at greater length in Section 2.7. The
conclusion of the section is that the positive consequences of a
freedom in quotient digit selection overshadow the cost of conversion.
With no redundancy, the divisor and the shifted partial remainder must
be compared (usually by subtraction) to the full precision defined for
the machine. With redundancy, the designer is at liberty to inspect
fewer bits of the divisor and shifted partial remainder than define
full precision. Handling fewer bits may save time and hardware;
these ramifications are explored further in the chapter concerning
implementation. In Figure 3, for example, a correct quotient digit is
selected knowing $r z_j = r p_j / d$ to a precision only great enough to contain
it within an overlap region. Exactly what precision is required for a
given value of r and n is the subject of the next section.
 
In terms of z - z plots such as Figures 2 and 3, the redundancy
is proportional to the width of the overlap regions. The width
of this region in terms of n and r is found as follows. Consider two
adjacent lines of Figure 2, i.e., $z_{j+1} = r z_j - i$ and $z'_{j+1} = r z_j - (i-1)$.
The overlap, $\Delta r z_j$, is the difference between $r z_j$ for $z_{j+1} = -\frac{n}{r-1}$ and
$r z_j$ for $z'_{j+1} = \frac{n}{r-1}$. Solving for the magnitude of this difference yields

    $\Delta r z_j = \frac{2n}{r-1} - 1$

The ratio $\frac{n}{r-1}$ is therefore a measure of redundancy.
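The overlap width can be checked against the q-lines directly. In this sketch (function names are illustrative), the overlap between the q = i and q = i-1 lines runs from i - k to i - 1 + k on the rz_j axis, with k = n/(r-1).

```python
from fractions import Fraction

def overlap_interval(i, r, n):
    """Interval of rz_j where both q = i and q = i-1 keep z_{j+1} within [-k, k]."""
    k = Fraction(n, r - 1)
    low = i - k            # q = i line reaches z_{j+1} = -k here
    high = (i - 1) + k     # q = i-1 line reaches z_{j+1} = +k here
    return low, high

def overlap_width(r, n):
    """Width 2n/(r-1) - 1 of each overlap region."""
    return Fraction(2 * n, r - 1) - 1

low, high = overlap_interval(2, 4, 2)
assert high - low == overlap_width(4, 2)
print(low, high, overlap_width(4, 2))
```

For the minimum admissible n (2n + 1 = r) the width is zero, which corresponds to a non-redundant representation.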
 
As redundancy (width of overlap region) is increased , the 
 required precision of inspection of divisor and partial remainder, and 
 thus hopefully the execution time, is decreased . It, therefore, appears 
 that for a given r, n should be as large as possible, i.e., n should 
 equal r-1. Such a choice may not be practical, however, since n = h, 
 requires the ability to form h multiples of the divisor. The choice 
 of n is therefore bound up in the usual trade off between time and 
 hardware . 
 
2.5 The P-D Plot

Now consider another graphical representation of the division
procedure. This construction, suggested by C. V. Freiman of the IBM
Corporation [5], is useful in further describing SRT division and in
computing the required precision of inspection of the divisor and
shifted partial remainder. The basis for the plot is the recursive
relationship

    $p_{j+1} = r p_j - q_{j+1} d$                                  (2.1.1)

as described in Section 2.1 together with the range restriction

    $|p_{j+1}| \le \frac{n}{r-1} |d|$

developed in Section 2.3. The figure is thus essentially a plot of
partial remainder versus divisor values and therefore in this report
shall be referred to as a P-D plot.
 
Solving the recursive relationship for $r p_j$ yields

    $r p_j = p_{j+1} + q_{j+1} d$                                  (2.5.1)

For a fixed quotient digit, the upper limit of $r p_j$ as a function of
the divisor, d, occurs when $p_{j+1}$ is maximum, i.e. when

    $p_{j+1} = \frac{n}{r-1} d$

thus

    $r p_{j\,max} = \left( \frac{n}{r-1} + q_{j+1} \right) d$      (2.5.2)

Likewise, the lower limit occurs with $p_{j+1} = -\frac{n}{r-1} d$, thus

    $r p_{j\,min} = \left( -\frac{n}{r-1} + q_{j+1} \right) d$     (2.5.3)

These linear equations may be plotted as functions of d with $q_{j+1}$ as
a parameter ranging from -n to +n in steps of 1. The area between
$r p_{j\,max}$ and $r p_{j\,min}$ for a given $q_{j+1} = i$ will be denoted the q(i) area.

The division procedure is now determined. A given value of
divisor, d, and the j-th shifted partial remainder will specify a point
in a q(i) area. The digit i will be the value of the next quotient
digit $q_{j+1}$, which in turn is used in forming the next partial remainder.
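The procedure just described can be sketched directly from (2.5.2) and (2.5.3). This is a minimal illustration in exact arithmetic, assuming a positive divisor and an initial partial remainder within the range restriction; the function names and the `pick` parameter (which resolves the free choice wherever q(i) areas overlap) are assumptions of the sketch, not part of the report.

```python
from fractions import Fraction

def valid_digits(rp, d, r, n):
    """Digits i whose q(i) area contains the point (d, rp); assumes d > 0."""
    k = Fraction(n, r - 1)
    return [i for i in range(-n, n + 1) if (i - k) * d <= rp <= (i + k) * d]

def srt_divide(p0, d, r, n, m, pick=max):
    """m cycles of SRT division; assumes |p0| <= n/(r-1) * d."""
    p, d = Fraction(p0), Fraction(d)
    digits = []
    for _ in range(m):
        rp = r * p                                # shifted partial remainder
        q = pick(valid_digits(rp, d, r, n))       # any digit in range is correct
        p = rp - q * d                            # recursion (2.1.1)
        digits.append(q)
    return digits, p

# Hypothetical operands: 1/3 divided by 3/4 with r = 4, n = 2.
for choose in (max, min):
    digits, rem = srt_divide(Fraction(1, 3), Fraction(3, 4), 4, 2, 3, pick=choose)
    value = sum(Fraction(q, 4 ** j) for j, q in enumerate(digits, start=1))
    assert Fraction(3, 4) * value + rem / 4 ** 3 == Fraction(1, 3)
    print(digits, rem)
```

Both selection policies give a correct quotient and remainder pair, although the digit strings differ in the last position; this is the redundancy at work.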
 
In this representation the redundancy is manifested as overlapping of the
q(i) regions, i.e. some pairs of d and $r p_j$ will specify a point for
which either $q_{j+1} = i$ or $q_{j+1} = i - 1$ is a valid choice.

Figure 4 is an example of a P-D plot for a division with
r = 4, n = 2. The equations for the lines plotted, 2', 2, etc., are
given in Table 1. The region for which $q_{j+1} = 2$ is a valid choice, i.e.
the q(2) area, is between lines 2' and 2; the q(1) area is between
lines 1' and 1, and so forth. Note the overlap between q(i) areas,
for example, the region between line 1' and 2 in which either the choice
$q_{j+1} = 1$ or $q_{j+1} = 2$ is correct. Note further that the figure is
symmetric about both axes.

On the right half of Figure 4 (the same may be done on the
left), "steps" have been drawn within the overlap of the q(i) regions.
The width of a "tread" (constant $r p_j$, d varying) defines a divisor
interval, the value of $r p_j$ for each tread defines a comparison
constant, and the distance between comparison constants defines a partial
remainder interval. Phrased in this terminology, division consists of
locating a given divisor value within the appropriate divisor interval,
locating the shifted partial remainder within the appropriate interval
(using comparison constants), and selecting a value of $q_{j+1}$ enclosed
by the intersection of the boundaries of these intervals. Since a
divisor and partial remainder must be located only to within an
interval, they need not be inspected to full precision in selecting a
correct quotient digit. Here is where the redundancy pays dividends.
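The staircase construction can be reproduced numerically. The sketch below builds the widest possible treads inside the overlap of the q(i-1) and q(i) areas for a positive divisor range; the function name, the default range 1/2 to 1, and the greedy construction (each tread starts on the upper bound and runs until it meets the lower bound) are assumptions made for illustration.

```python
from fractions import Fraction

def staircase(i, r, n, d_lo=Fraction(1, 2), d_hi=Fraction(1)):
    """Treads (d_start, d_end, comparison_constant) spanning one overlap region."""
    k = Fraction(n, r - 1)
    upper = (i - 1) + k    # slope of the upper bound of the q(i-1) area
    lower = i - k          # slope of the lower bound of the q(i) area
    if lower <= 0 or upper <= lower:
        raise ValueError("sketch assumes a positive-slope overlap of nonzero width")
    treads, d = [], d_lo
    while d < d_hi:
        constant = upper * d                    # highest rp still inside q(i-1) at d
        d_next = min(constant / lower, d_hi)    # tread ends on the lower bound
        treads.append((d, d_next, constant))
        d = d_next
    return treads

# Overlap between lines 1' and 2 of the r = 4, n = 2 plot.
for tread in staircase(2, 4, 2):
    print(tread)
```

These treads maximize the width of each divisor interval, so their end points are generally not short binary fractions; a practical design would shrink them to conveniently representable values.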
 
FIGURE 4. P-D PLOT WITH r = 4, n = 2
 
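    $r p_j = \left( \pm\frac{n}{r-1} + q_{j+1} \right) d$,   r = 4, n = 2

    $q_{j+1}$    $p_{j+1}$     Equation, $r p_j$ =    Designation in Figure 4
    -------    ---------   ------------------   -----------------------
       2         2/3 d           8/3 d                 2'
       2        -2/3 d           4/3 d                 2
       1         2/3 d           5/3 d                 1'
       1        -2/3 d           1/3 d                 1
       0         2/3 d           2/3 d                 0'
       0        -2/3 d          -2/3 d                 0
      -1         2/3 d          -1/3 d                 $\bar{1}$'
      -1        -2/3 d          -5/3 d                 $\bar{1}$
      -2         2/3 d          -4/3 d                 $\bar{2}$'
      -2        -2/3 d          -8/3 d                 $\bar{2}$

Table 1. Equations Defining the Regions of Figure 4.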
 
Techniques for selecting divisor intervals and comparison constants
are detailed in the next two sections. At this point, however,
we shall make several general observations. First, as we shall soon
discover, the comparison constants are compared with the high order $N_p$
bits of the shifted partial remainder and, similarly, the end points
of the divisor intervals are compared with the $N_d$ high order bits of
the divisor. The comparison constants and end points of the divisor
intervals should therefore be numbers which are representable with
$N_p$ and $N_d$ bits, respectively. The choices illustrated in Figure 4,
which maximized the width of the divisor intervals, do not meet this
requirement.
 
In Figure 5, however, more practical choices are shown. The
dashed lines represent the theoretical choices used in Figure 4. Now,
although the number of steps has been increased, the boundaries fall
at points easily representable in binary notation. Note that inspection
of 4 bits plus sign of the partial remainder and divisor is
sufficient to locate the correct choice of quotient digit.

The second observation is that the choice of divisor intervals
and comparison constants is bound up with the required precision
of inspection of the partial remainder and divisor; if, for example,
the divisor interval widths are increased, the required precision
of divisor inspection (number of bits) may be decreased. Furthermore,
the maximum precision of inspection of the divisor is determined
by the divisor interval of smallest width. By inspection of Figure 5,
the reader might guess where this step is, but we shall now locate
it analytically. The result of this derivation will be useful in the
next sections.

The length of a divisor interval is limited by the boundaries
of the overlap region. The maximum precision of inspection is required
where the divisor interval is minimum. To determine where this
minimum divisor interval occurs, consider the detail of the overlap
of the q(i) and q(i-1) regions shown in Figure 6.
 
FIGURE 5. DIVISOR INTERVALS AND COMPARISON CONSTANTS WITH r = 4, n = 2
(Divisor axis marked at 1/2, 9/16, 5/8, 11/16, 3/4, 13/16, 7/8, 15/16, i.e. binary .1000 through .1111.)

FIGURE 6. DETAIL OF A P-D PLOT OVERLAP REGION
(Upper bound of the overlap: $r p_j = [n/(r-1) + i - 1] d$; lower bound: $r p_j = [-n/(r-1) + i] d$.)

For a given value of $r p_j$, the maximum width of a divisor
interval is

    $\Delta d = \frac{r p_j}{-\frac{n}{r-1} + i} - \frac{r p_j}{\frac{n}{r-1} + i - 1}
              = \frac{r p_j \,(r-1)(2n - r + 1)}{[(r-1)i - n]\,[(r-1)(i-1) + n]}$      (2.5.4)

The interval $\Delta d$ is minimum when i is maximum and $r p_j$ is
minimum. The maximum value of i is n; the minimum value of $r p_j$ for
$q_{j+1} = n$ will occur when the upper bound of the overlap region
intersects d = 1/2, i.e., when d = 1/2. The precision of required
inspection of the divisor is thus determined by the divisor interval closest to
d = 1/2 and between q = n and q = n-1.

Robertson [6] has introduced the selection ratio, which is
defined as the ratio of the slope of the lower bound of an overlap
region to the slope of the upper bound. This ratio is a relative
measure of the width of the divisor interval for which a single
comparison constant is valid. From Figure 6, it appears that $\rho_i$ (the
selection ratio between q = i and q = i-1) is

    $\rho_i = \frac{i(r-1) - n}{(i-1)(r-1) + n}$

The difficulty of selection is proportional to $\rho_i$ and, as
indicated earlier, selection is most difficult for i = n.

The selection ratio may be used to compute another parameter,
the minimum number of divisor intervals necessary to span a given
overlap region. If $a \le d \le b$, then this minimum is the smallest
integer, $\beta_i$, such that $\rho_i^{\beta_i} \le a/b$, i.e.,
$\beta_i = \lceil \log(a/b) / \log \rho_i \rceil$. For example, with
$1/2 \le d \le 1$, i = n = 2, r = 4: $\rho_i = 4/5$ and $\beta_i = 4$.
Note that this agrees with the graphical results
of Figure 4. The number of steps between line 1' and line 2 is four.

2.6 The Cost of Quotient Digit Selection

2.6.1 General

To this point we have established that an important feature
of SRT division is the ability to select quotient digits from truncated
versions of the divisor and shifted partial remainder. We now turn to
the more specific question of what precision is required in these
approximations, i.e., of how many bits of the divisor and shifted partial
remainder must be inspected to guarantee correct quotient digit
selection. In a sense, this required precision is the cost of quotient
digit selection.

The cost will be shown to be a function of the choice of
radix and, to a certain extent, of the method of selecting the quotient
digits. Robertson [3] has suggested that the mechanism for selection
of quotient digits may be viewed as a limited precision model of the
full precision division. This concept is exemplified in the following
example.

A radix 256 division would require eight quotient bits per
shift of partial remainder. To generate these eight bits, as shown
in Section 2.6.3, 12 bits of the partial remainder and 13 bits of the
divisor are presented to a division mechanism which need be only
elaborate enough to produce eight bits of quotient from a 12 bit
dividend and a 13 bit divisor. The results of this limited precision
division (eight bits) are returned to the full precision mechanism as
part of the full precision quotient and are used in forming the next
full precision partial remainder. Note that the number defining full
precision may be changed in discrete steps by changing the number of
"calls" to the model division. Furthermore, the model division scheme
may be quite different from that of the full precision division.
 
For purposes of computing costs of quotient selection, we
shall consider two classes of model division procedures. The first
will be those involving the use of an auxiliary arithmetic unit and
employing addition and/or subtraction in forming the quotient digits.
Examples of schemes in this class include a radix four SRT division
performed in the exponent arithmetic unit or the procedure suggested
by Wallace [9], which is logically equivalent to forming the approximate
reciprocal of the divisor and multiplying by the partial remainder.
This class will be referred to as arithmetic models.

The second class consists of those methods which are the
logical equivalent of a table look-up. This technique may be viewed
as the direct implementation of a P-D plot, i.e., decoding the divisor
interval and the partial remainder interval and producing the quotient
digit indicated by their intersection. This class will be referred
to as table look-up models.
 
 Before considering these two type models in further detail, 
 let us state more precisely the conditions which must be obtained in 
 
 25 
 
the choice of model division and precision of inspection. Let

$m$ = the number of bits to the right of the radix point of divisor and dividend.

$\hat{rp}_j$ = the truncated version of the shifted partial remainder.

$\epsilon$ = the number of bits to the right of the radix point in $\hat{rp}_j$.

$\Delta p = \pm(2^{-\epsilon} - 2^{-m}) \approx \pm 2^{-\epsilon}$, the uncertainty in $\hat{rp}_j$.

$d$ = the truncated version of the divisor.

$\delta$ = the number of bits to the right of the radix point in $d$.

$\Delta d = \pm(2^{-\delta} - 2^{-m}) \approx \pm 2^{-\delta}$, the uncertainty in $d$.
 
The following cost criterion summarizes the requirements on
the quotient selection mechanism, $\Delta d$ and $\Delta p$.

Cost criterion: Given the approximations $\hat{rp}_j \pm \Delta p$ and
$d \pm \Delta d$, the integer result of $\hat{rp}_j/d = i$ performed in the model must
be such that on the appropriate P-D plot, the rectangle defined by
$(d \pm \Delta d,\ \hat{rp}_j \pm \Delta p)$ is entirely within the q(i) region.
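The cost criterion can be checked numerically for a single rectangle. The following is a minimal sketch, not the author's procedure: it assumes that the q(i) region of the P-D plot is the wedge $(i - \rho)d \le rp \le (i + \rho)d$ with $\rho = n/(r-1)$ and $d > 0$, which agrees with the boundary lines $(n - 2/3)d$ and $(n - 1/3)d$ used in this section. The function name and argument names are hypothetical.

```python
from fractions import Fraction as F

def rectangle_in_region(d_hat, rp_hat, delta_d, delta_p, i, rho):
    """Return True if (d_hat +/- delta_d, rp_hat +/- delta_p) lies inside q(i).

    Assumption: q(i) is the wedge (i - rho)*d <= rp <= (i + rho)*d, d > 0.
    The wedge is convex, so testing the four corners is sufficient.
    """
    for d in (d_hat - delta_d, d_hat + delta_d):
        for rp in (rp_hat - delta_p, rp_hat + delta_p):
            if not ((i - rho) * d <= rp <= (i + rho) * d):
                return False
    return True

# Radix 4 (n = 2, rho = 2/3), truncated divisor 1/2, truncated remainder 3/4.
rho = F(2, 3)
fits = rectangle_in_region(F(1, 2), F(3, 4), F(1, 32), F(1, 32), 2, rho)
too_coarse = rectangle_in_region(F(1, 2), F(3, 4), F(1, 8), F(1, 16), 2, rho)
```

With uncertainties of $2^{-5}$ in both directions the rectangle sits inside q(2); with a divisor uncertainty of $2^{-3}$ it no longer fits in any single region, which is the situation the criterion rules out.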
 
 2.6.2 Cost Determination for an Arithmetic Model 
 
We first consider the determination of the cost for a
division using an arithmetic model. In this case $\hat{rp}_j$ and $d$ are
presented to a limited precision arithmetic unit and the division
carried out to produce a rounded integer quotient. If the bit position
to the right of the radix point in the model is "1", the integer
portion is increased by one and truncated; otherwise the result is
merely truncated. This rounding is necessary if the cost criterion is
to hold for an arithmetic model.
 
Equation 2.5.4 indicated that maximum precision is required
in the overlap of the q(n) and q(n-1) regions in the vicinity of
d = 1/2. The precision determined here will be sufficient for any
other region of the P-D plot. Figure 7 is a detail of this region.
 
Two additional factors must now be considered: a redundantly
represented partial remainder and a negative divisor. As illustrated
in the next chapter, a division scheme which meshes well with
multiplication must cope with redundantly represented partial remainders.
One consequence of the representation is that the truncation error
($\Delta p$) attributable to considering only a few higher order bits of the
partial remainder may be either positive or negative. When a negative
(2's complement) divisor is permitted, the truncation error $\Delta d$ may also be
negative.
 
In the divisor interval $1/2 \pm \Delta d$, the dividing line between
the selection of q = n and q = n-1 is $\hat{rp}_j = \tfrac{1}{2}(n - \tfrac{1}{2})$, since
$\hat{rp}_j/d = 2 \times \tfrac{1}{2}(n - \tfrac{1}{2}) = n - \tfrac{1}{2}$, which must be rounded to n. For the cost
criterion to hold, the rectangle $(1/2 \pm \Delta d,\ \tfrac{1}{2}(n - \tfrac{1}{2}) \pm \Delta p)$ must
not extend below the bottom of the overlap region defined by
$rp_j = (n - 2/3)d$. Such a rectangle is indicated by the dashed lines in
Figure 7. Since this rectangle is not unique, there is some available
trade off between $\Delta p$ and $\Delta d$. To achieve more quantitative
results, we now limit the analysis to a special but useful case: that
in which the radix is of the form $r = 2^{2k}$, where k is a positive
(non-zero) integer.

[FIGURE 7. COST CALCULATION FROM P-D PLOT. The plot shows the lines $rp_j = (n - 1/3)d$, $rp_j = (n - 2/3)d$ and $rp_j = \tfrac{1}{2}(n - \tfrac{1}{2})$ against d.]
 
A division with $r = 2^{2k}$ may be implemented with a cascade of
k adder/subtractors with multiples of 1 times and 2 times the divisor
available to the first stage of the cascade, 4 times and 8 times to the
second, and so forth through $2^{2k-2}$ times and $2^{2k-1}$ times available
to the k-th stage. In this case, n, the largest multiple of the
divisor which may be formed, is the sum of the largest multiple which
may be formed at each stage in the cascade; i.e. $n = 2 + 8 + \ldots + 2^{2k-1}$.
Furthermore, the sum of this geometric series is such that $\frac{n}{r-1} = 2/3$. Thus we
shall consider the case $r = 2^{2k}$, $n = \tfrac{2}{3}(r-1)$.
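As a quick check of the geometric series, the sketch below sums the largest multiple available at each stage and compares it with $\tfrac{2}{3}(r-1)$. The function name is hypothetical; only the stage multiples $2^{2i-1}$ come from the text.

```python
def largest_multiple(k):
    """Sum of the largest divisor multiple at each of k cascade stages."""
    return sum(2 ** (2 * i - 1) for i in range(1, k + 1))

# n for k = 1, 2, 3, 4 (radix 4, 16, 64, 256)
values = [largest_multiple(k) for k in range(1, 5)]
```

For k = 1 through 4 this gives n = 2, 10, 42 and 170, the values used in the cost tables of this section.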
 
For practical implementation, the rectangular region defined
by $\Delta d$ and $\Delta p$ will be symmetric about d = 1/2 and $rp_j = \tfrac{1}{2}(n - \tfrac{1}{2})$.
Referring to Figure 7, note that $\Delta d$ must be smaller than the smaller
of $\Delta d_{1\,max}$ and $\Delta d_{2\,max}$. The following demonstrates that $\Delta d_{1\,max} < \Delta d_{2\,max}$.

$$\Delta d_{2\,max} = \frac{1}{2}\left(\frac{n - 1/2}{n - 2/3} - 1\right), \qquad \Delta d_{1\,max} = \frac{1}{2}\left(-\frac{n - 1/2}{n - 1/3} + 1\right) \tag{2.6.1}$$

$$\Delta d_{1\,max} - \Delta d_{2\,max} = 1 - \frac{n^2 - n + 1/4}{n^2 - n + 2/9} \tag{2.6.2}$$

Since

$$\frac{n^2 - n + 1/4}{n^2 - n + 2/9} > 1$$

$$\Delta d_{1\,max} - \Delta d_{2\,max} < 0, \qquad \Delta d_{1\,max} < \Delta d_{2\,max} \tag{2.6.3}$$

Thus choosing $\Delta d \le \Delta d_{1\,max}$ will insure that the rectangle will fit
horizontally.
 
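Similarly

$$\Delta p_1 = (n - 1/3)d_1 - \tfrac{1}{2}(n - \tfrac{1}{2}), \qquad \Delta p_2 = -(n - 2/3)d_2 + \tfrac{1}{2}(n - \tfrac{1}{2}) \tag{2.6.4}$$

$$\Delta p_1 - \Delta p_2 = (n - 1/3)d_1 + (n - 2/3)d_2 - (n - \tfrac{1}{2}) \tag{2.6.5}$$

Let

$$d_1 = 1/2 - \Delta d, \qquad d_2 = 1/2 + \Delta d \tag{2.6.6}$$

Substituting (2.6.6) into (2.6.5) yields

$$\Delta p_1 - \Delta p_2 = -\frac{\Delta d}{3}$$

thus

$$\Delta p_1 \le \Delta p_2 \tag{2.6.7}$$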
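The substitution can be written out term by term. This is a worked check that assumes $d_1 = 1/2 - \Delta d$ and $d_2 = 1/2 + \Delta d$ with $\Delta d \ge 0$, and uses only the definitions of $\Delta p_1$ and $\Delta p_2$ given in this section:

```latex
\begin{align*}
\Delta p_1 &= (n - \tfrac{1}{3})(\tfrac{1}{2} - \Delta d) - \tfrac{1}{2}(n - \tfrac{1}{2})
            = \tfrac{1}{12} - (n - \tfrac{1}{3})\,\Delta d \\
\Delta p_2 &= -(n - \tfrac{2}{3})(\tfrac{1}{2} + \Delta d) + \tfrac{1}{2}(n - \tfrac{1}{2})
            = \tfrac{1}{12} - (n - \tfrac{2}{3})\,\Delta d \\
\Delta p_1 - \Delta p_2 &= -\tfrac{1}{3}\,\Delta d \le 0
\end{align*}
```

The constant 1/12 that appears in both lines is the same 1/12 that dominates the bound on $\epsilon$ once $\delta$ is large.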
 
As implied earlier, if we are certain that $\hat{rp}_j = \tfrac{1}{2}(n - \tfrac{1}{2})$
will produce the quotient selection q = n, then $\Delta p \le \Delta p_2$ will be
sufficient. If we cannot guarantee this, then $\Delta p \le \Delta p_1$ must hold.

We shall adopt the latter, more cautious approach. If we
selected the former, then the (n - 1/3) term in equation 2.6.13 would
be replaced by (n - 2/3). The results in Table 2, however, will be
the same.
 
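Recalling that $\Delta d \approx 2^{-\delta}$, we want

$$2^{-\delta} \le \Delta d_{1\,max} \tag{2.6.8}$$

which from 2.6.1 becomes

$$2^{-\delta} \le \frac{1}{2}\left(1 - \frac{n - 1/2}{n - 1/3}\right) \tag{2.6.9}$$

where

$$n = \tfrac{2}{3}(2^{2k} - 1)$$

Let

J(x) = x if x is an integer,
     = next larger integer if x is not an integer.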
 
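The minimum value of $\delta$ is therefore

$$\delta_{min} = J\left(-\log_2\left(\frac{1}{2}\left(1 - \frac{n - 1/2}{n - 1/3}\right)\right)\right) \tag{2.6.10}$$

Possible values of $\delta$ are thus

$$\delta = \delta_{min},\ \delta_{min} + 1,\ \ldots,\ m \tag{2.6.11}$$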
 
Similarly, since $\Delta p \approx 2^{-\epsilon}$, combining 2.6.7 and 2.6.4 yields

$$2^{-\epsilon} \le 1/12 - 2^{-\delta}(n - 1/3) \tag{2.6.12}$$

and thus

$$\epsilon = J\left(-\log_2\left(1/12 - 2^{-\delta}(n - 1/3)\right)\right) \tag{2.6.13}$$

where $\delta$ is defined by 2.6.11. Now let

$N_d$ = number of bits of $d$ = $\delta$

$N_p$ = number of bits of $\hat{rp}_j$ = $\epsilon + 2k$

Note also that the sign of $d$ and $\hat{rp}_j$ must be known to the model. Table
2 summarizes the results of equations 2.6.11 and 2.6.13 for k = 1, 2,
3, 4. Note that $\epsilon$ approaches a lower limit of 4 when the 1/12 term
in 2.6.13 becomes dominant.
 
 
 k     r     n    δ              ε    N_d   N_p

 1     4     2    δ_min = 5      5     5     7
                  6              5     6     7
                  7              4     7     6
                  8              4     8     6
                  m              4     m     6

 2    16    10    δ_min = 7      7     7    11
                  8              5     8     9
                  9              4     9     8
                  10             4    10     8
                  m              4     m     8

 3    64    42    δ_min = 9      9     9    15
                  10             5    10    11
                  11             4    11    10
                  12             4    12    10
                  m              4     m    10

 4   256   170    δ_min = 11    11    11    19
                  12             5    12    13
                  13             4    13    12
                  14             4    14    12
                  m              4     m    12

Table 2. Costs for Arithmetic Models
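The entries of Table 2 can be regenerated from the two bounds of this section. The sketch below is an illustration under stated assumptions, not the author's program: it takes $2^{-\delta} \le \tfrac{1}{2}\left(1 - \frac{n - 1/2}{n - 1/3}\right)$ and $2^{-\epsilon} \le 1/12 - 2^{-\delta}(n - 1/3)$ as the governing inequalities, reads J as "round up to the next integer", and uses exact fractions to avoid rounding at the boundary cases. All function names are hypothetical.

```python
from fractions import Fraction as F

def smallest_exponent(x):
    """Smallest integer e >= 0 with 2**-e <= x, for a Fraction 0 < x <= 1."""
    e = 0
    while F(1, 2 ** e) > x:
        e += 1
    return e

def arithmetic_model_costs(k, extra_rows=3):
    """Rows (delta, epsilon, N_d, N_p) for radix r = 2**(2k), n = 2/3 (r - 1)."""
    r = 4 ** k
    n = F(2, 3) * (r - 1)
    dd1_max = F(1, 2) * (1 - (n - F(1, 2)) / (n - F(1, 3)))
    delta_min = smallest_exponent(dd1_max)
    rows = []
    for delta in range(delta_min, delta_min + extra_rows + 1):
        bound = F(1, 12) - F(1, 2 ** delta) * (n - F(1, 3))
        eps = smallest_exponent(bound)
        rows.append((delta, eps, delta, eps + 2 * k))
    return rows

radix_4_rows = arithmetic_model_costs(1)
radix_256_rows = arithmetic_model_costs(4)
```

Calling `arithmetic_model_costs(k)` for k = 1 to 4 reproduces the first four rows of each block of the table; larger values of $\delta$ keep $\epsilon = 4$.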
 
 
Thus it appears there are three feasible cases for which the
cost of inspection is as follows:

Case 1

$N_p = 4k + 3$
$N_d = 2k + 3$

Case 2

$N_p = 2k + 5$
$N_d = 2k + 4$

Case 3

$N_p = 2k + 4$
$N_d = 2k + 5$

Case three would probably be the most practical case to
implement since $N_p$ is minimum. $N_p$ bits of the redundantly represented
partial remainder must be converted into conventional form before each
model division. Since this assimilation is essentially a serial
process, the assimilation time is directly proportional to $N_p$.
 
 2.6.3 Cost Determination for a Table Look-Up Model 
 
This class of model is a logical implementation of the P-D
diagram. In its most brute force form, this model may be viewed as
a grid or matrix with vertical lines which are the outputs of decoders
applied to $d$ and with horizontal lines which are the outputs of
decoders applied to $\hat{rp}_j$. At each intersection of the lines is
an AND gate with one input connected to the vertical line, the other
to the horizontal line. Each point of intersection corresponds to a
quotient digit value, i, and thus the output of each AND gate is
connected to the input of the appropriate OR gate, the true output of
which is $q_{j+1} = i$.
 
The overlap regions are divided by steps as discussed in
Section 2.5 such that the cost criterion (Section 2.6.1) will hold in
all intervals. To determine the required $N_p$ and $N_d$ in this case, we
again consider the worst case region of the P-D plot where d = 1/2
and between q(n) and q(n-1) as shown in Figure 7.
 
Again, if we choose the dividing line between $q_{j+1} = n$ and
$q_{j+1} = n-1$ to be at $\tfrac{1}{2}(n - \tfrac{1}{2})$, then the calculations of Section
2.6.2 also hold for the table look-up case with $r = 2^{2k}$. Recall,
however, that we generally wish to minimize $N_p$ since this will reduce
the assimilation time in forming $\hat{rp}_j$ in each cycle. We can accomplish
this by selecting the comparison constants, the dividing lines between
choices of quotient digit values, as close to the top of an overlap
region as possible.
 
In the arithmetic models, the comparison constants are
implicit in the model, and thus, for example, we had no choice but
to use $\tfrac{1}{2}(n - \tfrac{1}{2})$ in the cost calculations. In the present case,
however, we may select any value which is within the overlap region
and an integer multiple of $2^{-\epsilon}$.
 
The value of $\tfrac{1}{2}(n - \tfrac{1}{2})$ is always an exact binary number,
specifically a number with a fractional part of 3/4. The distance
from $\tfrac{1}{2}(n - \tfrac{1}{2})$ to the upper limit of the overlap region along
d = 1/2 is $\tfrac{1}{2}(n - \tfrac{1}{3}) - \tfrac{1}{2}(n - \tfrac{1}{2}) = 1/12$. This means that the
largest comparison constant we may choose in this region without
increasing $\epsilon$ to be greater than four is $\tfrac{1}{2}(n - \tfrac{1}{2}) + 1/16$. If we
design the logic such that $\hat{rp}_j = \tfrac{1}{2}(n - \tfrac{1}{2}) + 1/16$ and d = 1/2
selects $q_{j+1} = n$, then the $\Delta d$ and $\Delta p$ cost calculations are as follows:
 
In this case

$$2^{-\delta} \le \Delta d_{max}$$

$$2^{-\epsilon} \le 7/48 - 2^{-\delta}(n - 2/3)$$
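The constant 7/48 can be traced as follows. This is a worked check rather than the author's derivation: it assumes the bound is the vertical distance from the comparison constant $\tfrac{1}{2}(n - \tfrac{1}{2}) + \tfrac{1}{16}$ down to the lower boundary $rp_j = (n - 2/3)d$ of the q(n) region, evaluated at $d = 1/2 + \Delta d$ with $\Delta d = 2^{-\delta}$.

```latex
\begin{align*}
\Delta p &= \tfrac{1}{2}(n - \tfrac{1}{2}) + \tfrac{1}{16} - (n - \tfrac{2}{3})(\tfrac{1}{2} + \Delta d) \\
         &= \tfrac{1}{2}\left(\tfrac{2}{3} - \tfrac{1}{2}\right) + \tfrac{1}{16} - (n - \tfrac{2}{3})\,\Delta d \\
         &= \tfrac{1}{12} + \tfrac{1}{16} - (n - \tfrac{2}{3})\,\Delta d
          = \tfrac{7}{48} - 2^{-\delta}(n - \tfrac{2}{3})
\end{align*}
```

Moving the comparison constant up by 1/16 therefore buys exactly 1/16 of additional room below it, compared with the 1/12 available in the arithmetic model.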
 
 In the same manner as that outlined in the last section we obtain 
 
 Table 3 and the three cases. 
 
Case 1

$N_p = 2k + 4$
$N_d = 2k + 3$

Case 2

$N_p = 2k + 4$
$N_d = 2k + 4$

Case 3

$N_p = 2k + 3$
$N_d = 2k + 5$
 
The first entry $N_d = 4$, $N_p = 6$ is not included in the above
linear equations but this is the most practical case for k = 1, radix
four. By comparison with the results of Section 2.6.2, note that for
a given k, a case may be found for which a table look-up model
requires fewer bits of comparison than the corresponding arithmetic
model.

   n    δ              ε    N_d   N_p

   2    δ_min = 4      4     4     6
        5              4     5     6
        6              4     6     6
        7              3     7     5
        m              3     m     5

  10    δ_min = 7      4     7     8
        8              4     8     8
        9              3     9     7
        m              3     m     7

  42    δ_min = 9      4     9    10
        10             4    10    10
        11             3    11     9
        m              3     m     9

 170    δ_min = 11     4    11    12
        12             4    12    12
        13             3    13    11
        m              3     m    11

Table 3. Costs for Table Look-Up Models
 
2.7 Quotient Conversion
 
 The quotient developed by SRT division will in general in- 
 clude negative digits and eventually must be converted to a conventional 
 binary form. This conversion time and hardware is the greater part of 
 the price paid for the accrued advantages of redundancy. 
 
 First consider a specific case: conversion of a result pro- 
 duced by a non-restoring division. Here quotient representation is 
 the same as that discussed in Section 2.2 except that zero is not an 
 allowable digit. The algorithm for such a conversion is illustrated 
 in Figure 8. This conversion may be performed sequentially as the 
 quotient digits are generated, and thus requires no additional terminal 
operations. The digit $q_{j+1}$ is unchanged if it is positive; otherwise
it is replaced by $r + q_{j+1}$, and the adjacent higher order digit $q_j$ is
decreased by 1. Note that since zero is not a permissible digit,
there is no requirement for a borrow propagation in decreasing $q_j$ by
1. The hardware required is of the order of a two digit subtractor.
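The digit-by-digit rule above can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, digits are taken most significant first, and the leading digit is assumed positive so that the sign handling at the start of Figure 8 is left out.

```python
def convert_nonrestoring(digits, r):
    """Convert non-restoring quotient digits (nonzero, |q| < r) to digits 0..r-1.

    Assumption: the leading digit is positive; sign handling is omitted.
    """
    if not digits or digits[0] <= 0:
        raise ValueError("sketch assumes a positive leading digit")
    out = []
    for q in digits:
        if q == 0 or abs(q) >= r:
            raise ValueError("digits must be nonzero and smaller than the radix")
        if q > 0:
            out.append(q)
        else:
            out.append(r + q)   # q_{j+1} <- r + q_{j+1}
            out[-2] -= 1        # q_j <- q_j - 1; q_j was at least 1, so no borrow
    return out

binary_digits = convert_nonrestoring([1, -1, 1, -1], 2)
radix4_digits = convert_nonrestoring([2, -3, 1, -1], 4)
```

Because every stored digit is at least 1 when it is written and is decremented at most once, the decrement never reaches below zero, which is the reason no borrow chain is needed.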
 
It is not generally possible, however, to perform SRT division
without allowing q = 0. Non-restoring division may be viewed as SRT
division with n = r-1. For this case, the q(0) region of a P-D plot
is completely overlapped by the q(1) and q(-1) regions. The quotient
digit value q = 0 may, therefore, be eliminated and the conversion
consequently simplified to that of Figure 8. For cases of SRT division
with n < r-1, the q(0) region is not subsumed by other regions
and thus q = 0 must be allowed if the division is to be completely
defined.
 
[FIGURE 8. QUOTIENT CONVERSION FOR NON-RESTORING DIVISION. Flowchart with the tests $q_1 < 0$ (setting the sign of the quotient negative or positive) and $q_{j+1} < 0$, the replacement $q_{j+1} \leftarrow r + q_{j+1}$, and the index step $j \leftarrow j + 1$.]
 
With the possibility of q = 0, the conversion is complicated:
the algorithm of Figure 8 is no longer adequate, for now the difference
$q_j - 1$ may require a borrow from $q_{j-1}$. Furthermore, this borrow must
propagate to the left until it encounters a non-zero digit. This
potential for borrow propagation requires that the equivalent of a
full precision subtractor be available to the quotient register if
conversion is to occur as the quotient digits are generated.
 
 Alternately, the full precision quotient may be generated 
 and stored in the redundant form and then converted during an extra 
 terminal step. A high-speed arithmetic unit frequently employs a 
 redundant representation of the partial product during multiplication, 
 e.g. carry-save adders, which also require a terminal conversion. One 
 possibility, then, is to share the hardware for conversion of both 
products and quotients. The sample implementation presented in the
 next chapter incorporates this approach. 
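A terminal conversion of a quotient that contains zero digits can be sketched as a single full precision subtraction. The sketch below is an assumption-laden illustration, not the hardware method of the next chapter: it splits the signed digits into a positive part and a negative part and subtracts them as integers, and the function name is hypothetical.

```python
def convert_signed_digits(digits, r):
    """Convert radix-r signed digits (most significant first) to sign and magnitude.

    Returns (is_negative, magnitude_digits) with magnitude digits in 0..r-1.
    """
    positive_part = 0
    negative_part = 0
    for q in digits:
        if abs(q) >= r:
            raise ValueError("digit magnitude must be smaller than the radix")
        positive_part = positive_part * r + max(q, 0)
        negative_part = negative_part * r + max(-q, 0)
    value = positive_part - negative_part   # the one full precision subtraction
    magnitude = abs(value)
    out = []
    for _ in digits:
        out.append(magnitude % r)
        magnitude //= r
    return value < 0, out[::-1]

# 1 0 0 -1 in radix 4: the borrow travels through both zero digits.
sign, converted = convert_signed_digits([1, 0, 0, -1], 4)
```

The example digit string 1, 0, 0, -1 shows the propagation the text describes: the borrow created by the last digit passes through both zeros before it is absorbed by the leading 1.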
 
 
3. IMPLEMENTATION OF SRT DIVISION 
 
3.0 Introduction
 
Armed with the theory and techniques unfolded in the last
chapter, now consider an example implementation of SRT division. This
example is not presented as a detailed construction proposal, but is
rather intended to contribute the following:
 
 1. A description of several fairly general considerations 
 for implementing digital division and of how SRT division 
 meshes within these considerations. 
 
 2. An elaboration, in a rather concrete way, of the concept 
 of limited precision modeling. 
 
 3. A notion as to the hardware demands and operation time 
 of functional blocks required in implementing SRT 
 division. 
 
 Throughout this chapter, it is assumed that the designer has 
 already made the decisions as to the speed of the electronic components 
 he will use, and that now he is attempting to organize these components 
 into a faster, more efficient system. 
 
3.1 General Considerations for Implementation
 
 Chapter 2 introduced a class of division techniques which 
 appear especially suited for implementation in a digital machine. 
 Having accepted this premise and having decided to tackle SRT division, 
the designer is still faced with many decisions and dirty design details.
 
 
These details are strongly related to the structure of the allied parts 
 of the arithmetic unit and to such real life questions as available 
 logic, speed demands, available packaging space, and to a large extent 
 to the price the designer is willing to pay for a high-speed divide. 
A thorough exploration into these factors is well beyond the scope of
this paper; however, there are several more general guidelines which
may apply.
3.1.1 Relative Occurrence of Division
 
The first guideline emerges from the observation that division
is usually the least frequently executed of the basic arithmetic
operations: add, subtract, multiply, and divide. The designers of the
IBM STRETCH computer [6] estimated that on average, out of 16
operations of a general purpose computer, the relative occurrence by
operation type is as follows:
 
 1 division 
 
 3 multiplications 
 
 6 additions 
 
 6 control transfers 
 
 These figures indicate that the designer should pay more to 
 accelerate multiplication than division: that in a conflict between 
 accelerating multiplication and division, the former should be the 
 victor. 
 3.1.2 Acceleration of Division 
 
 With decreasing hardware costs, increasing packaging density, 
 and demands for still faster arithmetic units, the first guideline may 
 
 
not be as significant as it was in the days of STRETCH. Today the 
 designer will probably aim both for very high-speed multiply and divide. 
 The design question is not merely how to implement division, but rather, 
 how to implement high-speed division, or yet more specifically, high- 
 speed SRT division. 
 
The next guidelines, therefore, relate to organizational
factors affecting the speed of execution of division. Of course, in
 selecting the SRT method, the designer has already seized upon the 
 possibility of accelerating execution by decreasing the precision and 
 thus reducing the time required in selecting a quotient digit. There 
 are, however, other possibilities beyond this fundamental decision. 
 
 As mentioned in Section 2.1, the recursive relationship 
 points directly to four possibilities for accelerating division. A 
 fifth, obvious, but important factor is added here. These possibilities 
 are as follows: 
 
1. Decrease the time for forming $rp_j$, i.e. the left
shift time.

2. Decrease the selection time for multiples of the
divisor at the divisor input to the adder/subtractor.

3. Decrease the add/subtract time.

4. Increase the radix and thus decrease the number of
cycles required to generate a quotient of specified
precision.

5. Decrease the time for selecting a quotient digit, i.e.
for comparing the divisor and shifted partial remainder.
 
 
The first of these is essentially the problem of minimizing 
 the number of logic stage delays required to transfer and shift the 
 contents of the secondary rank of the accumulator back to the primary 
 rank. 
 
 Similarly, the second item relates primarily to minimizing 
 control delay in operating a shift gate once a quotient digit is 
 selected. 
 
In approaching the third factor of this list, decreasing
the add/subtract time, the designer is likely to turn to a carry/borrow
save type unit which eliminates propagation until a terminal
step [7]. This is a standard technique in implementing multiplication,
but must be approached cautiously for the case of division.
 
 The necessity for caution arises from the fact that such 
 schemes actually introduce redundancy into the representation of a 
 sum or difference and thus, for division, produce a redundant partial 
remainder. As mentioned in Section 2.5.2, redundancy in the partial
remainder complicates the quotient selection and, for a practical
scheme, requires that at least part of the partial remainder be
converted to conventional form after each pass through the subtractor(s).
 
Increasing the radix, although it does decrease the number of
cycles required, also carries with it some disadvantages. For a fixed
n (the upper limit of a quotient digit) an increase of r decreases the
redundancy $n/(r-1)$ and thus requires either greater precision in selecting
quotient digits, or an increase of n. As noted earlier, an increase
in the value of n demands the availability of more multiples of the
divisor and thus more hardware.
 
 
The fifth factor is explored further in Section 3.3 with
reference to the selection of the model division.
 
 Note that the question of minimizing control step-up time 
 is largely beyond the scope of this paper. It is, however, a very 
 real and related problem to be faced in accelerating an arithmetic 
process. There is little efficiency in building a system which
 operates faster than control signals can service it. 
3.1.3 Compatibility of Division with the Multiplication Scheme
 
According to the STRETCH statistics mentioned in Section
3.1.1, multiplications occur half as often as additions. Multiplication,
however, is usually executed as a series of considerably more
than two additions and thus requires the use of acceleration techniques
if the speed of multiplication and addition are to be compatible. These
techniques essentially reduce to the first four of those mentioned in
Section 3.1.2 with the word "divisor" replaced by "multiplicand", "left
shift" replaced by "right shift", and "quotient" by "product". Thus,
at least to a first approximation, acceleration of multiplication and
division are compatible.
 
 A high-speed arithmetic unit usually includes a substantial 
 investment in hardware to accelerate the execution of multiplication. 
 Hopefully, much of this investment may also be used for division. 
 
 With this in mind and accepting the premise that accelera- 
 tion of division should place second to accelerated multiplication, 
 we adopt the following strategy: design a high-speed multiplication 
 
 
scheme, then embed division within it. Although not the ideal, it is,
 in fact, a practical strategy which has been used in arithmetic unit 
 design. In a sense, this guideline summarizes the guidelines mentioned 
 in both of the previous sections. 
 
3.2 A High-Speed Multiplication Scheme
 
 Having adopted the design strategy "multiply then divide", we 
 must now propose a high-speed multiplication scheme with which we hope 
 to mesh division. The description of the scheme will necessarily be at 
the block diagram level and will by no means be fully justified. Also,
details such as overflow and handling of the exponent will not be
discussed. The scheme, however, has been studied and, in fact, simulated
by the author. It is similar to that proposed for implementation in
the Illinois Pattern Recognition Computer (Illiac III). The number
 format to be handled by this device is assumed to be an 8 byte (8 bits 
 per byte) normalized floating point number with 1 byte of exponent and 
 7 bytes of mantissa. 
 
 Figure 9 is a simplified block diagram of the proposed unit. 
 3.2.1 Notation 
 
 The conventions used in Figure 9 are as follows: 
 
 1. Flipflop registers are denoted by rectangles with the 
 horizontal subdivisions indicating bytes. For example, 
 the M register (M REG) is 7 bytes (56 bits) long. 
 
2. Groups of combinatorial logic are shown in circles or
rectangles with rounded corners. Any gating is represented
in terms of AND (·), OR (∨), and EXCLUSIVE OR (⊕).
 
[FIGURE 9. Simplified block diagram of the proposed unit.]

 
3. The widest lines indicate a bus for data in SD format
(2 bits per digit, see Section 3.2.2), the next
widest for numbers in conventional notation (1 bit per
digit).

4. Gating signal names are of the form $F_1 F_2 X T_1 T_2$ where:

a. $F_1$ and $F_2$ ($F_2$ is optional) are the names of the
registers from which data is transferred.

b. X = D if the transfer is direct, i.e. not shifted.
X = Rn if the data is shifted n places to the
right during the transfer.
X = Ln if the data is shifted n places to the
left during the transfer.

c. $T_1$ and $T_2$ ($T_2$ is optional) are the names of the
registers to which data is transferred from $F_1$
and $F_2$ respectively.

d. The concatenation of register names starting
with the same letter such as UM and US is further
abbreviated as UMS.

5. Examples of gating signal names:

a. VDM - Gate the data on the V-Bus directly into
the M-Register.

b. ML7Y1 - Gate the contents of the M-Register
shifted left seven positions into the Y input
of signed-digit subtractor S1.

c. UHQDLHQ is equivalent to the two names UHDLH
and UQDLQ.
 
6. The label TO MD or FROM MD indicates connections to the
Model Division to be described in Section 3.3.3.

3.2.2 Description and Operation
 
As mentioned earlier, multiplication is substantially
accelerated by the use of an adder or adders which eliminates carry
propagation until a terminal step. The "adder" proposed for this model,
S1-S4, is actually a signed-digit subtractor (SDS): it incorporates
facilities for postponing borrow propagation. Actually, the device
performs both addition and subtraction under control of the "NEG"
signal. We shall digress a moment for a brief description of this
device.
 
Each stage of the signed-digit subtractor (SDS), as shown in
Figure 10, is a 3-input, 2-output device together with an interstage
connection and a "NEG" control line. $Y_i$ is a bit of the subtrahend
(minuend - subtrahend = remainder) in conventional binary form. $S_i$
and $X_i$ together comprise the minuend in a redundant notation which will
be called SD format. Each digit of the minuend is of the form $S_i X_i$
where $X_i$ is interpreted as a magnitude, 1 or 0, and $S_i$ as a sign,
0 = +, 1 = -. The SD format digits are therefore represented as follows:

 S_i    X_i    DIGITAL VALUE

  0      0         +0
  0      1         +1
  1      0         -0
  1      1         -1
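The SD format can be made concrete with a small decoder. The sketch below is illustrative only: the function names are hypothetical, bit lists are taken most significant first, and the digit encoding assumed is the one tabulated above (S = 1 marks a negative digit, X is the magnitude).

```python
def sd_value(s_bits, x_bits):
    """Integer value of an SD-format number given sign and magnitude bit lists."""
    total = 0
    for s, x in zip(s_bits, x_bits):
        digit = -x if s else x      # S = 1 means a negative digit
        total = total * 2 + digit
    return total

def sd_negate(s_bits):
    """Negate an SD-format number by complementing every sign bit."""
    return [1 - s for s in s_bits]

# Digits +1, -1, -0, +1 have the value 8 - 4 + 0 + 1 = 5.
s = [0, 1, 1, 0]
x = [1, 1, 0, 1]
value = sd_value(s, x)
negated = sd_value(sd_negate(s), x)
```

Note that the two encodings of zero (+0 and -0) simply swap under negation, so complementing the sign bits needs no special case.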
 
 
[Diagram: stage i of the subtractor, with inputs $Y_i$, $S_i$, $X_i$ and the control NEG, outputs $T_i$ and $Z_i$, and interstage connections $C_i$ and $C_{i-1}$.]

$S_i$ = sign of minuend digit
$X_i$ = magnitude of minuend digit
$Y_i$ = subtrahend in conventional binary form
$T_i$ = sign of difference digit
$Z_i$ = magnitude of difference digit
NEG = control to complement $T_i$
      NEG = 0: $T_i$ not complemented
      NEG = 1: $T_i$ complemented
$C_i$ = interstage interconnection, but not a propagating borrow/carry

$T_i = C_i \oplus \mathrm{NEG}$
$Z_i = C_i \oplus (X_i \oplus Y_i)$
$C_{i-1} = S_i X_i \lor \bar{X}_i Y_i$
$C_i = S_{i+1} X_{i+1} \lor \bar{X}_{i+1} Y_{i+1}$

Figure 10. Stage of a Signed-Digit Subtractor
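The behaviour of a row of such stages can be simulated to see why the interstage signal does not propagate. The sketch below rests on assumptions that should be checked against the original figure: the stage equations are taken as $C_{i-1} = S_i X_i \lor \bar{X}_i Y_i$, $Z_i = C_i \oplus X_i \oplus Y_i$ and $T_i = C_i \oplus \mathrm{NEG}$, index 0 is the most significant stage, and the signal leaving the most significant stage is returned separately. Function names are hypothetical.

```python
def sd_value(s_bits, x_bits):
    """Integer value of an SD-format number (sign bits, magnitude bits)."""
    total = 0
    for s, x in zip(s_bits, x_bits):
        total = total * 2 + (-x if s else x)
    return total

def sds_subtract(s, x, y, neg=0):
    """One pass through a row of SDS stages; returns (T, Z, c_out_of_top_stage)."""
    n = len(x)
    # Each stage's interstage output depends only on its own inputs.
    c_out = [(s[i] & x[i]) | ((1 - x[i]) & y[i]) for i in range(n)]
    t, z = [], []
    for i in range(n):
        c_in = c_out[i + 1] if i + 1 < n else 0   # from the next lower stage
        z.append(c_in ^ x[i] ^ y[i])
        t.append(c_in ^ neg)
    return t, z, c_out[0]

# Minuend digits +1, -1, 0 (value 2) minus binary 011 (value 3).
t, z, top = sds_subtract([0, 1, 0], [1, 1, 0], [0, 1, 1])
difference = sd_value(t, z) - top * 2 ** 3
```

Every interstage signal is computed from the inputs of a single stage, so the whole row settles in one stage delay regardless of word length; the price is that the result is again in redundant form.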
 
 
The output of the subtractor is in this same format, i.e. $Z_i$
is the magnitude of the digit, $T_i$ is the sign. $C_i$ and $C_{i-1}$ are
interstage connections and, as may be seen from the logic equations,
are not propagating borrows. Another advantage of SD format is that
a number may be negated merely by complementing the sign (S) bits.
 
 Note that the postponing of borrow propagation is achieved 
 only at the expense of introducing redundancy into the representation 
 of the result. Actually two registers, for example US and UM, are 
 required to store a number in this redundant form. 
 
We must also pay the price of conversion, or assimilation, to
conventional form. This assimilation actually requires a borrow
propagation and one additional subtraction. The propagation is accelerated
by use of look-ahead techniques, but is still rather time-consuming
and expensive. The propagation occurs in the propagation logic, the
output of which is then applied to the Y input of S4 to produce the
assimilated result.
 
The propagation logic forms the outputs

$$B_{i-1} = B_i \bar{Z}_i \lor T_i Z_i$$

and S4 is used to produce the assimilated result with bits

$$A_i = Z_i \oplus B_i$$
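Assimilation can be sketched as a serial borrow chain. The sketch below is an illustration under assumptions: it takes the borrow recurrence as $B_{i-1} = B_i \bar{Z}_i \lor T_i Z_i$ and the output bit as $A_i = Z_i \oplus B_i$, treats index 0 as the most significant position, and starts with no borrow into the least significant position. The function name is hypothetical, and a real unit would use look-ahead rather than this loop.

```python
def assimilate(t, z):
    """Convert SD-format digits (T sign bits, Z magnitude bits) to binary bits.

    Returns (a_bits, borrow_out); the value is sum(a) - borrow_out * 2**n.
    """
    n = len(z)
    a = [0] * n
    borrow = 0                                   # no borrow into the lowest position
    for i in range(n - 1, -1, -1):
        a[i] = z[i] ^ borrow                     # A_i = Z_i xor B_i
        borrow = (borrow & (1 - z[i])) | (t[i] & z[i])   # B_{i-1}
    return a, borrow

# Digits +1, -1, 0, -1 have the value 8 - 4 + 0 - 1 = 3.
bits, borrow_out = assimilate([0, 1, 0, 1], [1, 1, 0, 1])
```

A zero digit passes an incoming borrow along unchanged, a negative digit always generates one, and a positive digit absorbs it, which is the behaviour the recurrence encodes.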
 
 roi 
 
 The SDS is described in more detail in reference . 
 
 In the proposed scheme, four of the signed-digit subtractors 
 are cascaded to provide multiplication, radix 256, i.e. 8 bits of the 
 
 
multiplier are used simultaneously. The multiplicand is loaded from 
the V-BUS into M, the multiplier into UQ. The low order byte of UQ
 drives recoding logic which couples to the control lines in the shift 
 array. 
 
This recoding, suggested by Wallace [9], requires plus and
minus multiples of 128, 64, 32, 16, 8, 4, 2, and 1 times the multiplicand.
The multiples are formed by the shift array; the signs by the NEG
controls, i.e. by adding or subtracting the multiple. The MDY1 input is
used only for an ADD or SUBTRACT instruction, not for MULTIPLY.
 
After passing through the SDS cascade, the contents of
LS-LM and LH-LQ (partial product and multiplier) are shifted right 8
bits back into the US-UM and UQ Registers. This continues for 8
cycles; the 9th is an assimilation cycle. Here the product in SD
format is applied to the propagation logic, the output of the
propagation logic to S4, and consequently converted to conventional
representation.
 
 Admittedly the scheme just outlined is expensive and in many 
 cases may not be justified. The designer may wish to choose a similar 
 scheme but with fewer levels of cascade, i.e. smaller radix. Although 
 the division scheme to be designed is built upon this radix 256 multi- 
 plication scheme, the techniques and procedures should be easily 
 reducible to a lower radix case. 
 
 Before concluding this section, we must admit a slight 
 diversion from our design strategy. The reader may have noticed that 
 all four of the SDS in Figure 9 have been extended to the left one byte. 
 
 
Actually, if the multiples of M were added in the order 1, 2, 4, 8,
16, 32, 64, 128 rather than the way shown, only S4 would have to be
extended a full 8 bits. Since, however, quotient digits are formed
most significant first (the product is formed least significant first)
 and we wish to use this same shift array for divide, the arrangement 
 must be as shown. The extra SDS stages must be included and thus the 
 division scheme has, to some extent, infringed upon the design of the 
 multiplication scheme. 
 
 3.3 Design of Division Scheme 
3.3.1 General
 
 Now begins the task of embedding a division scheme within 
 the multiplication scheme described in the last section. Since the 
 SDS cascade will perform both addition and subtraction of the contents 
 of the M-Register and the number in SD format in the UM-US Registers, 
 the obvious extension is to place the divisor in M and the dividend 
 and subsequent partial remainders in UM-US. The quotient digits will 
 be produced in redundant form. In this case a logical choice would be 
 to produce quotient digits in SD format so that they may be assimilated 
 by the same circuits as used in multiplication. The contents of UH-UQ 
 may be gated to US-UM via UHQDUSM and then assimilated as in the final 
 cycle of multiplication. The quotient is thus stored in UH-UQ: the 
 sign bits in UH and magnitude bits in UQ. Furthermore, division with 
 the hardware will require an 8 bit shift from LS-LM to US-UM 
 (LSML8USM) and from LH-LQ to UH-UQ (LHQL8UHQ) . 
 
 
The full precision division is now generally defined. The 
 divisor is first stored in M, the dividend in UM and the sign of the 
dividend in all positions of US. Quotient digits are then formed by
a model division using $d$ and $\hat{rp}_j$. The quotient digits are stored in
SD format in UH-UQ and also used to set the multiples of the divisor
in M to be subtracted from the dividend. The next partial remainder is
formed in the SDS cascade (S1, S2, S3, S4), stored in LS-LM, and then
shifted left 8 bits into US-UM. These cycles continue until the full
precision quotient has been generated. The quotient is then gated
directly from UH-UQ into US-UM, assimilated, and gated into EM where it
is available to the central processing unit.
 
We must now design a model division to select the quotient
digits to be stored in UH-UQ and to be used to control the M-shift
array in forming a full precision partial remainder. Note that the
division scheme here is of the class with radix $r = 2^{2k}$,
$n = \tfrac{2}{3}(r - 1)$, as mentioned in Section 2.5.2. The number
of cascades, k, is 4 in this case. The value of n is the sum of the
maximum multiples of the divisor which may be formed at each stage of
the SDS cascade and here is 128 + 32 + 8 + 2 = 170. The radix point is
between the leftmost and next leftmost byte of the UM-US and LM-LS
Registers.
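
The relation between k, r, and n can be checked numerically. The short sketch below assumes that each of the k cascade stages contributes at most a radix-4 digit of magnitude 2 at its own digit position; the function name is illustrative.

```python
def max_multiple(k):
    """Sum of the largest divisor multiple formed at each of k cascade stages.

    Stage i (counting from the last stage) contributes 2 * 4**i, that is,
    a radix-4 digit of magnitude 2 at that digit position.
    """
    return sum(2 * 4 ** i for i in range(k))

k = 4
r = 2 ** (2 * k)       # radix 256
n = max_multiple(k)    # 2 + 8 + 32 + 128
```

With k = 4 this gives n = 170, and 3n = 2(r - 1) holds exactly, which is the n = 2/3 (r - 1) class named in the text.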
 
 3.3.2 An Arithmetic Model 
 
First considering an arithmetic model, we select case 3 of
Section 2.5.2 and calculate that for k = 4, $N_p$ = 12 bits and
$N_d$ = 13 bits. The first 12 bits of the shifted partial remainder
could therefore be assimilated into conventional form and divided by
the 13 high order bits of the divisor to produce 8 quotient bits. This
operation could be performed by a non-restoring scheme in auxiliary
hardware such as the exponent arithmetic unit. Since an exponent unit
normally does not perform division, some augmentation is required. The
minimum addition would be a left shift path from the secondary to the
primary rank of the accumulator. Also, since we have specified only a
7 bit exponent, the width of the exponent unit would require an
extension of 5 bits. These additions would, however, be relatively
inexpensive. The exponent unit, which normally sits idle during most
of the division operation, could be used more efficiently.
 
There is, however, a major disadvantage to the arithmetic
models: the necessity to round the quotient digits produced by the
model before they are used by the full precision mechanism. This
rounding was mentioned in Section 2.5.2 and is obligatory if the cost
criterion is to hold. Without this requirement the quotient bits
could be used sequentially as they are generated to set the gates of
the M-Shift array. In this case, the full precision divisor would be
formed in LS-LM very shortly after the last quotient bit was produced
by the model. Since, however, the rounding may affect the most
significant bit of the quotient returned from the model, the
propagation through the SDS array cannot begin until the model
division is complete. This restriction severely limits the feasibility
of the arithmetic models. Because of this rounding requirement, a
table look-up model will be used in the example developed here.
 
 
3.3.3 A Table Look-Up Model
 
As described in Section 2.6.3, the round-off problem does not
arise in a table look-up model. The major disadvantages here are the
hardware cost and the large fanout required of $\hat{d}$ and
$r\hat{p}_j$ to the selection logic. In the example arithmetic unit
being developed here, multiplication is radix 256. For compatibility
we would also like division to be radix 256 and, consequently, would
like a radix 256 table look-up model which would produce 8 bits of the
quotient in parallel. By considering a P-D plot for radix 256,
n = 170, or merely the fact
 
 that N = 12 bits and N, = h bits, the reader may quickly convince 
 p d 
 
 himself that the hardware requirements for such a scheme are prohibi- 
 tive, at least with conventional logic. 
 
A radix 16 table look-up is probably possible with integrated
circuitry, and perhaps with more conventional circuitry if the designer
is willing to pay the price: approximately 250 5-input NANDs, 160
8-input NANDs, 250 8-input NORs, and 160 drivers which will each drive
up to 50 NOR loads.
 
In this example we will adopt a more modest approach by
implementing a radix 4 table look-up and applying it successively at
four positions of the SDS cascade. In a sense, we have been forced to
reduce the radix 256 division to four radix 4 divisions.
 
From Section 2.5.3, a radix 4 table look-up model requires
$N_d = 4$, $N_p = 6$. The 6 bits of the partial remainder are supplied
sequentially from four stages of the full precision hardware labelled
"TO MD" in Figure 9. The first stage is the output of US-UM, the other
three the outputs of S1, S2, and S3. The high order bit supplied to
the model is displaced 2 bits right at each stage. Thus, if the
subscript 1 denotes the high order digital position, the first
$r\hat{p}_j$ to the model is $US_1, UM_1$ through $US_6, UM_6$. The
second input is the third through eighth output of S1, etc.
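
The window of digit positions seen by the model at each stage can be sketched as follows. The names `model_window` and `SOURCES` are illustrative; the only facts taken from the text are the 6-position width, the 2-position displacement per stage, and the four sources.

```python
def model_window(stage, width=6, step=2):
    """1-based digit positions fed to the model at a given stage (0 to 3)."""
    first = 1 + step * stage
    return list(range(first, first + width))

# Stage 0 reads the US-UM registers; stages 1 to 3 read the subtractor outputs.
SOURCES = ["US-UM", "S1", "S2", "S3"]
windows = {src: model_window(i) for i, src in enumerate(SOURCES)}
```

Under these assumptions the S1 window covers positions 3 through 8, matching the text, and the last window (from S3) covers positions 7 through 12.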
 
A block diagram of the proposed table look-up model is shown
in Figure 11 and described in Table 4. The P-D plot which is actually
implemented is shown in Figure 12. Table 5 explicitly illustrates the
quotient digit selection for each $r\hat{p}_j$ and $\hat{d}$. Note the
correspondence between the steps in the overlap regions of Figure 12
and the steps shown in the table.
 
 Before studying these figures and tables note the following 
 considerations which are incorporated in the design: 
 
 1. Only the first quadrant of the P-D plot is actually
    implemented. The approximations $\hat{d}$ and $r\hat{p}_j$ are
    considered to be positive and the real sign is computed as with
    a sign-magnitude representation. If $r\hat{p}_j$ is negative
    when presented to the model, it is made positive before
    assimilation by complementing the sign bits.

 2. The divisor, and thus the selected divisor interval, is a
    constant for a given division, so the speed of selecting the
    divisor interval is much less critical than that of forming
    the partial remainder interval.

 3. The QUOTIENT SELECT TABLE actually implements the ZERO and
    TWO regions of the P-D plot in Figure 12 and forms ONE as
    $\overline{\text{ZERO}} \cdot \overline{\text{TWO}}$. The TWO
    and ZERO regions are easier to implement than the ONE region
    since they are bounded on one side by the range restrictions
    on $r\hat{p}_j$.
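
The three considerations above can be illustrated with a small full-precision sketch of radix 4 digit selection. It works on magnitudes only (first quadrant), restores the sign afterwards, and derives ONE from ZERO and TWO. The thresholds d/2 and 3d/2 are assumptions chosen to lie inside the overlap regions for the range restriction |p| <= (2/3)d; they are not the stepped boundaries of Figure 12, and exact fractions stand in for the truncated approximations used by the hardware.

```python
from fractions import Fraction

def select_digit(rp, d):
    """Pick a radix-4 quotient digit in {-2..2} for shifted remainder rp."""
    negative = rp < 0
    mag = -rp if negative else rp      # work in the first quadrant only
    zero = mag < d / 2                 # assumed threshold in the 0/1 overlap
    two = mag >= 3 * d / 2             # assumed threshold in the 1/2 overlap
    one = (not zero) and (not two)     # ONE formed from ZERO and TWO
    digit = 2 if two else (1 if one else 0)
    return -digit if negative else digit

def srt_divide(x, d, steps):
    """Return (q, p) such that x = d*q + p * 4**-steps, for d > 0."""
    p, q = x, Fraction(0)
    for j in range(1, steps + 1):
        rp = 4 * p
        digit = select_digit(rp, d)
        p = rp - digit * d                     # recursive relationship
        assert abs(p) <= Fraction(2, 3) * d    # range restriction holds
        q += Fraction(digit, 4 ** j)
    return q, p

x, d = Fraction(1, 3), Fraction(3, 4)          # requires |x| <= (2/3) d
q, p = srt_divide(x, d, steps=8)
```

After 8 steps the quotient error is bounded by (2/3) * 4^-8, since the final remainder still satisfies the range restriction.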
 
The inputs to the model and the controls are supplied from
the full precision unit as shown in Figure 9 and are designated as
follows:

    i, j      = integer subscripts.
    US_j      = the true output of the j-th position of the US
                Register, containing the sign bits of the partial
                remainder.
    UM_j      = the true output of the j-th position of the UM
                Register, containing the magnitude bits of the
                partial remainder.
    T_{i,j}   = the j-th sign bit of the output of signed-digit
                subtractor Si.
    Z_{i,j}   = the j-th magnitude bit of the output of signed-digit
                subtractor Si.
    M_j       = the true output of the j-th position of the M
                Register, containing the divisor. One position of
                M holds the sign of the divisor.
    C_i       = sequence control signals.
    Σ         = logical summation (OR).
    Π         = logical product (AND).

The other symbols used in Figure 11 are defined in Table 4.
 
 
[Figure 11. Block diagram of the table look-up model]
 

[Table 4. Functions and Logical Description of Figure 11]
 
FIGURE 12. P-D PLOT FOR TABLE LOOK-UP MODEL
 
[Rows: $r\hat{p}_j$ from 00.0000 to 10.1100. Columns: divisor
$\hat{d}$ from .1000 to .1111. Entries: the quotient digit selected,
0, 1, or 2.]

Table 5. Quotient Select Table.
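
A table of this shape can be generated from the P-D plot constraints. The sketch below is a hypothetical construction and does not reproduce the entries of Table 5. It assumes that both approximations are truncations with error less than 1/16, that digit q is valid whenever (q - 2/3)d <= 4p <= (q + 2/3)d, and that the largest valid digit is chosen in each cell.

```python
from fractions import Fraction

SIXTEENTH = Fraction(1, 16)

def table_digit(rp_hat, d_hat):
    """Digit for a cell whose true values lie in
    [rp_hat, rp_hat + 1/16) and [d_hat, d_hat + 1/16)."""
    d_max = d_hat + SIXTEENTH
    if rp_hat >= Fraction(4, 3) * d_max:   # whole cell above 4p = (4/3)d
        return 2
    if rp_hat >= Fraction(1, 3) * d_max:   # whole cell above 4p = (1/3)d
        return 1
    return 0

def cell_is_safe(rp_hat, d_hat, digit):
    """Check that the cell stays below the line 4p = (digit + 2/3) d."""
    if digit == 2:
        return True   # bounded above by the range restriction on rp
    return rp_hat + SIXTEENTH <= (digit + Fraction(2, 3)) * d_hat

divisors = [Fraction(k, 16) for k in range(8, 16)]     # .1000 to .1111
remainders = [Fraction(k, 16) for k in range(0, 45)]   # 00.0000 to 10.1100
table = {(rp, d): table_digit(rp, d) for rp in remainders for d in divisors}
```

Every cell passes `cell_is_safe` under these assumptions, which is one way to see that 4 divisor bits and 6 partial remainder bits leave enough room in the overlap regions for a radix 4 selection.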
 
 
3.4 Estimate of Speed of Execution
 
 Although in this report we have described the division 
 scheme only at the block diagram level, a detailed simulation has been 
 programmed and will be available in . Based upon this simulation 
 and actual logic design of the arithmetic unit of Illiac III we can 
 estimate the execution time of this division scheme in terms of 
 transistor collector delays. The actual logic is of the direct coupled
 saturated DTL type.
 
Table 6 summarizes the number of transistor collector delays
associated with operation of each block of the model division, Figure
11, and with the relevant blocks of the complete arithmetic unit shown
in Figure 9. These figures are used in Table 7 in tracing the
operations involved in performing one division cycle, i.e., making one
pass through the SDS cascade and producing 8 quotient digits in SD
format. The final cycle assimilates the redundantly represented
quotient as described under ASSIMILATION.
 
To estimate the execution time in seconds we shall assume a
collector delay of 15 ns; thus 8 bits of quotient require
76 x 15 ns = 1.1 µsec. A 56 bit division such as proposed for
Illiac III therefore requires 7.7 µsec plus 0.3 µsec for assimilation,
or a total of 8 µsec. Initial and terminal shifting of operands have
not been included but represent a negligible time compared to the
execution time of the recursive operations.
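
The arithmetic behind this estimate can be reproduced in a few lines. The constant names are illustrative; the delay counts are the totals given in Table 7 and the 15 ns figure is the collector delay assumed in the text.

```python
COLLECTOR_DELAY_NS = 15      # assumed delay per transistor collector
DELAYS_PER_CYCLE = 76        # Table 7, quotient generation
DELAYS_ASSIMILATION = 18     # Table 7, assimilation
QUOTIENT_BITS = 56
BITS_PER_CYCLE = 8

cycles = QUOTIENT_BITS // BITS_PER_CYCLE
cycle_ns = DELAYS_PER_CYCLE * COLLECTOR_DELAY_NS
assimilation_ns = DELAYS_ASSIMILATION * COLLECTOR_DELAY_NS
total_ns = cycles * cycle_ns + assimilation_ns
```

Without intermediate rounding, each of the 7 cycles takes 1140 ns and assimilation takes 270 ns, for 8250 ns in all, which agrees with the rounded total of about 8 µsec.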
 
 
BLOCK                                             NUMBER OF
                                                  COLLECTOR DELAYS

Model Division (Figure 11)
  Input Gating                                           2
  Sign Detect                                            1
  Negate                                                 1
  Borrow Generate                                        3
  Quotient Select Table                                  2
  Quotient Storage and Shift Control                     3
  Total for Model per 2 Digits of Quotient              12

Full Precision Division (Figure 9)
  Signed-Digit Subtracter (Each) (S1, S2, S3, S4)        3
  M-Shift Gates (including Driver)                       3
  Register to Register Transfer                          2
  Propagation Logic                                      7

Table 6. Transistor Collector Delays of Blocks of the Division Scheme
 
 
Initial Conditions:

 Divisor in M-Register. Dividend in UM-Register.
 Sign of Dividend in All Positions of US-Register.

EVENT                                             NUMBER OF
                                                  COLLECTOR DELAYS

QUOTIENT GENERATION
  Perform Model Division                                12
  Set ML7Y1 or ML6Y1                                     3
  Perform Add/Subtract in S1                             3
  Perform Model Division                                12
  Set ML5Y2 or ML4Y2                                     3
  Perform Add/Subtract in S2                             3
  Perform Model Division                                12
  Set ML3Y3 or ML2Y3                                     3
  Perform Add/Subtract in S3                             3
  Perform Model Division                                12
  Set ML1Y4 or MDY4                                      3
  Perform Add/Subtract in S4                             3
  Store Result in LS-LM                                  2
  Left Shift via LSML8USM                                2
  Total Time per 8 Bits of Quotient                     76

ASSIMILATION
  Gate Quotient in UH-UQ to US-UM via UHQDUSM            2
  Direct through S1                                      4
  Generate Borrows in Propagation                        7
  Assimilate to Conventional Form in S4                  3
  Store in LM                                            2
  Total Time for Assimilation                           18

Table 7. Transistor Collector Delays in Execution of Division.
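
As a cross-check, the totals in Tables 6 and 7 can be recomputed from the individual entries. The dictionary and variable names below are illustrative; the delay counts are those listed in the tables.

```python
# Blocks of the model division (Table 6), in collector delays.
MODEL_BLOCKS = {
    "Input Gating": 2,
    "Sign Detect": 1,
    "Negate": 1,
    "Borrow Generate": 3,
    "Quotient Select Table": 2,
    "Quotient Storage and Shift Control": 3,
}
model = sum(MODEL_BLOCKS.values())

M_SHIFT_GATES = 3        # set the divisor multiple
ADD_SUBTRACT = 3         # one signed-digit subtracter
TRANSFER = 2             # register to register transfer

# Four stages, then store in LS-LM and shift left into US-UM.
stage = model + M_SHIFT_GATES + ADD_SUBTRACT
generation = 4 * stage + 2 * TRANSFER

# Gate quotient, direct through S1, propagate borrows, assimilate, store.
assimilation = 2 + 4 + 7 + 3 + 2
```

Since the model division accounts for 48 of the 76 delays per cycle, it is clearly the dominant term in the cycle time.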
 
 
4. SUMMARY AND CONCLUSION

4.1 Summary
 
 The first half of this report was largely a constructive 
 definition of SRT division. It introduced a recursive relationship 
 defining division, a representation of the quotient allowing both 
 positive and negative digits, and range restrictions on the partial 
 remainders. It was then shown that the consequence of this quotient 
 representation and range restriction was that correct quotient digits 
 could be selected by inspection of truncated versions of the divisor 
 and shifted partial remainders. The P-D plot was described and used 
 as a key tool in the development. 
 
Next, the report turned to the more specific task of
determining the number of bits necessary in these approximations. The
cost criterion was stated as the fundamental requirement on the
precision of inspection. Although this criterion is general, to obtain
numerical results the discussion was restricted to a radix of the form
$r = 2^{2k}$ and to models of the arithmetic or table look-up type.
The chapter concluded with a short discussion of the conversion of the
redundantly represented numbers to conventional form.
 
 The second major section of the report attempted to relate 
 the equations, graphs, and statements of the first section to real- 
 world problems of designing a digital arithmetic unit. It described 
 some general design considerations and pointed to compatibility of 
 division with multiplication as one of the most important. 
 
 
At this point, the discussion of division digressed to one of 
 proposing a multiplication scheme and to the block structure of an 
 arithmetic unit with which it could be realized. The focus then 
 returned to division where, after rejecting an arithmetic model, a 
 table look-up model division was proposed. 
 
 The model was described at the black-box level and some 
 estimate was given as to the expected operation time of such a scheme 
 implemented with conventional DTL. 
 
4.2 Conclusion
 
 To a large extent, this report has been directed to the 
 designer faced with the task of implementing digital division. The 
 mode of presentation, however, has not been intended to be of an 
 algorithmic style, but is rather aimed at a basic understanding of 
 SRT division in hopes that the designer will be able to adapt it to 
 his particular specifications and hardware. The chapter on imple- 
 mentation was included merely to indicate one way of applying SRT 
 division. 
 
The author also hopes that this report will support
exploration into development of higher radix quotient selection
models, e.g., a true radix 256 model which can select 8 quotient bits
in parallel. Note that the operating speed of the model in the example
implementation is by far the slowest link.
 
 
Much of the delay in quotient select is, however, chargeable
to the necessity for assimilating the redundantly represented
$r\hat{p}_j$. It would therefore appear appropriate to explore models
which could select quotients directly from a redundantly represented
partial remainder. Perhaps this could be accomplished with analog
techniques in which $r\hat{p}_j$ was converted to a voltage
proportional to the weighted sum of the bits. Such a converter could
handle both plus and minus weights. It may also be possible to
mitigate the round-off problem associated with the arithmetic models.
The P-D plot could then be implemented with analog-digital rather than
strictly digital circuits.
 
 Also note that the form of the quotient selected by the model 
 in the example implementation is by no means unique. In this case, the 
 SD format was selected so as to be compatible with the M-Shift Array 
 control signals and the assimilation circuitry used for multiplica- 
 tion. There may, however, be more efficient recodings. Perhaps the 
 goals could best be summarized as attempting to implement division so 
 that it is actually performed as the inverse of multiplication. 
 
 
LIST OF REFERENCES 
 
[1] Ivan Flores, The Logic of Computer Arithmetic, Englewood Cliffs,
    New Jersey, Prentice-Hall, Inc., 1963, pp. 246-347.

[2] O. L. MacSorley, "High Speed Arithmetic in Binary Computers,"
    Proceedings of the IRE, 49, January, 1961, pp. 80-91.

[3] J. E. Robertson, "A New Class of Digital Division Methods,"
    IRE Transactions on Electronic Computers, EC-7, No. 3, September,
    1958, pp. 218-222.

[4] J. E. Robertson, "Lecture Notes for Math/EE 394," University of
    Illinois, Urbana, Illinois, 1965.

[5] J. E. Robertson, "Methods of Selection of Quotient Digits During
    Digital Division," File No. 663, Department of Computer Science,
    University of Illinois, Urbana, Illinois, 1965.

[6] J. E. Robertson, Private Communication, September, 1966.

[7] Roger E. Wiegel, "Methods of Binary Addition," Report No. 195,
    Department of Computer Science, University of Illinois, Urbana,
    Illinois, 1966.

[8] D. E. Atkins, "Arithmetic Unit of Illiac III: Simulation and
    Logical Design-Part I," File No. 713, Department of Computer
    Science, University of Illinois, Urbana, Illinois, 1966.

[9] C. S. Wallace, "Suggested Design for a Very Fast Multiplier,"
    Report No. 133, Department of Computer Science, University of
    Illinois, Urbana, Illinois, 1963, pp. 8-9.

[10] D. E. Atkins, "Arithmetic Unit of Illiac III: Simulation and
     Logical Design-Part II," File Note in progress, Department of
     Computer Science, University of Illinois, Urbana, Illinois,
     1967.
 