- RASCEL -
A PROGRAMMABLE ANALOG COMPUTER BASED ON A REGULAR ARRAY OF STOCHASTIC COMPUTING ELEMENT LOGIC*

by

John W. Esch

June, 1969

Report No. 332

Department of Computer Science
University of Illinois
Urbana, Illinois 61801

* Submitted in partial fulfillment for the Doctor of Philosophy Degree in Electrical Engineering, at the University of Illinois, June, 1969.

Digitized by the Internet Archive in 2013
http://archive.org/details/rascelprogrammab332esch

John William Esch, Ph.D.
Department of Electrical Engineering
University of Illinois, June, 1969

RASCEL is a successful working programmable analog computer. Input numbers are converted inside the system to a mapped ratio representation which consists of two clocked random pulse sequences, one representing a numerator and the other the denominator. The machine value of one of these sequences is the probability that a logical one will occur during a clock period. This means of representing numbers was chosen because the probability of a logical one occurring can be a continuous time varying function and yet the basic arithmetic operations of addition, subtraction, multiplication and division can be performed with simple logic gates.
With this number representation, a programmable stochastic computing element was designed which can be programmed to perform on its two inputs, $a$ and $b$, any of the operations $a$, $b$, $a^2$, $b^2$, $a+b$, $a-b$, $a \times b$, $a/b$. These computing elements are permanently wired together in a tree structure which allows any function of this structure's inputs and the above mentioned operations to be implemented by the computer. Any other function can be computed by just reprogramming the computing elements.

The results indicate that large stochastic computing systems can be built which have a number range of three to four orders of magnitude, an accuracy of 1% and a 0.1 s slew rate. Because stochastic computers can be designed using only digital circuitry, they do not have the problems of conventional analog computers and can be built to almost any size and complexity. It appears that they are well suited for and very efficient at certain kinds of applications or computations.

ACKNOWLEDGMENT

The author is pleased to have the opportunity to express in writing his thanks to his advisor, Professor W. J. Poppelbaum, for the four rewarding years that the author has spent working for him. During that time Professor Poppelbaum's counsel and friendship were gratefully accepted and greatly appreciated.

To many of the other graduate assistants working for Professor Poppelbaum this author extends his thanks for their advice, help and, most important, their friendship, all of which have made his work easier and more pleasant.

To the shop personnel for their help in constructing RASCEL, to Carla Donaldson and Elinor Peterson for their typing and to Mark Goebel and Fred Hancock for their drafting, this author is greatly indebted and expresses his thanks to them for their help in getting RASCEL built and his thesis in publishable form.
The author would also like to thank the Department of Computer Science for the research assistantships granted him while he was engaged in his graduate work at the University of Illinois.

TABLE OF CONTENTS

ACKNOWLEDGMENT
LIST OF FIGURES
INTRODUCTION

CHAPTER

1 THE ENGINEERING APPLICATIONS OF STOCHASTIC COMPUTING
  1.1 Research Outside the University of Illinois
  1.2 Research at the University of Illinois
2 THE BASIS AND SCOPE OF RASCEL
  2.1 The Idea which Makes Stochastic Computing Interesting
  2.2 The Context and Scope of this Paper
3 A CLOCKED RANDOM PULSE SEQUENCE (CRPS)
  3.1 The Effects of a CRPS on an Oscilloscope or Voltmeter
  3.2 The Definition and Characterization of a CRPS
  3.3 The Generation of a CRPS
  3.4 The Estimate of a CRPS Value
  3.5 The Accuracy of an Estimate of a CRPS Value
4 THE ARITHMETIC OPERATIONS ON CRPSs
  4.1 The Effects of Logic Gates on CRPS Inputs
  4.2 The Effects of Delays and Logic Gates on CRPSs
  4.3 A Mapped Number Representation
  4.4 The Effects of Logic Gates on Mapped CRPS Inputs
5 A PROGRAMMABLE STOCHASTIC COMPUTING ELEMENT
  5.1 The Role of Programmable Computing Elements
  5.2 A CRPS Mapped Ratio Number Representation
  5.3 The Equations and Circuits for a Programmable Stochastic Computing Element
6 THE RASCEL SYSTEM
  6.1 The Using of the RASCEL System
  6.2 The Programming of the Array of Computing Elements
  6.3 An Example Function
7 THE DECODING OF A MAPPED RATIO NUMBER REPRESENTATION
  7.1 The Reason Decoding is Necessary
  7.2 The General Form of the Decoding Circuitry
  7.3 The Dual-Subtractor
  7.4 The Subbie
  7.5 The Divider
  7.6 The Scale Factor Generator
  7.7 The Display
8 THE EVALUATION OF RASCEL'S PERFORMANCE
  8.1 The Results of Input CRPS Generators
  8.2 The Performance of a Computing Element
  8.3 The Performance of the Dual-Subtractor
  8.4 The Performance of the Scaling Divider
  8.5 The Performance of the Array of Programmable Computing Elements
  8.6 The Attrition Problem
9 THE SUMMARY AND CONCLUSIONS
  9.1 A Summary of the RASCEL System
  9.2 A Summary of the Results Obtained from RASCEL
  9.3 The Advantages and Disadvantages of RASCEL
  9.4 Some Thoughts on RASCEL and Future Systems

APPENDIX
LIST OF REFERENCES
VITA

LIST OF FIGURES

Figure
2-1. A model of a computing system
2-2. A programmable array
3-1. The idea of analog to CRPS conversion
3-2. The dual CRPS generator card circuit
3-3. The binary to CRPS conversion
5-1. A programmable stochastic computing element
6-1. A block diagram of the RASCEL system
6-2. The appearance of the RASCEL system
6-3. A flow diagram for calculating operators
6-4. An example of a function implemented on RASCEL
7-1. Block diagram of decoding and display circuits
7-2. Block diagram of the dual-subtractor
7-3. Block diagram of a Subbie
7-4. Block diagram of the divider circuit
7-5. Block diagram of the scale factor generator
7-6. Block diagram of the display
8-1. The results of squaring a CRPS
8-2. The results of multiplying CRPSs
8-3. The results of adding and subtracting CRPSs
8-4. The results of the dual-subtractor
8-5. The results of the scaling divider
8-6. The results of two functions programmed into RASCEL
8-7. The results of different sample sizes
A-1. The dual CRPS generator card circuit
A-2. The effects of logic gates on CRPS inputs
A-3. The mapped CRPS ratio representation of numbers
A-4. The programmable stochastic computing element card circuit
A-5. The dual-subtractor card circuit
A-6. The dual five-bit synchronous up-down counter card circuit
A-7. The autoscaling and divider card circuit
A-8. The binary number magnitude comparison card circuit
A-9. The random binary number generator card circuit
A-10. The binary number randomizer card circuit
A-11. The display circuits

INTRODUCTION

This paper describes and analyzes the RASCEL system which was designed and built at the University of Illinois' Digital Computer Laboratory under the guidance of Professor W. J. Poppelbaum. RASCEL consists of an array of identical analog computing elements. Each element can be programmed to perform on its two inputs, $a$ and $b$, any of the operations ($a$, $b$, $a^2$, $b^2$, $a+b$, $a-b$, $ab$, $a/b$). All the computing elements are wired together in a particular regular pattern which allows the user to program the array to perform any function of the operations listed above on its independent analog inputs.

Inside the RASCEL system numbers are represented by a random sequence of binary voltage levels corresponding to logical "0" and "1". The number represented this way is the probability that the wire will have a voltage on it corresponding to a logical "1". Because numbers are represented by these random binary sequences, the computing elements mentioned above are composed of digital circuitry only. But the probability that a logical "1" will occur is a continuous time varying function, so the RASCEL system is an analog computer which uses digital circuitry to perform arithmetic operations.

CHAPTER 1

THE ENGINEERING APPLICATIONS OF STOCHASTIC COMPUTING

1.1 Research Outside the University of Illinois

Today, theoretical and hardware research in stochastic computing have advanced to the point where many diverse applications are being contemplated or tried. For instance, Olavaria and Schugurensky are using random pulse generators to simulate the random oscillations of molecules of an enzyme system. Since their hardware model closely approximated the theoretical behavior, they are planning to extend stochastic computing techniques to more complicated models and systems.

Professor Brian Gaines, now at the University of Essex, has been instrumental in the design and construction of the STELLA learning machine.
The basic component of this machine is the ADDIE, a stochastic summing integrator with random pulse inputs and random pulse output proportional to the integral of the sum.

Brian Gaines' latest paper, "Stochastic Computing Systems," [13] is highly recommended to persons interested in pursuing the subject of stochastic computing. In this paper he describes many number representations (i.e. different mappings of numbers onto the range [0,1]) and gives circuits which perform the basic arithmetic operations and integration for each representation. Other applications which he discussed include: Elimination of Round-off Error in Analog to Digital Converters, Linearization of Polarity Coincidence Correlators, Adaptive Elements for Learning Machines (because variables are continuous, convergence is assured), Adaptive Threshold Logic, Gradient Techniques for the Identification of Linear Systems, Bayes Predictors for Binary Inputs and Networks of Stochastic Computing Elements which include a solution to Laplace's Equation and 'Neural Nets'.

These are just a few of the many ways other people feel Stochastic Computing might be applied.

1.2 Research at the University of Illinois

Under the guidance of Professor Poppelbaum, the Department of Computer Science's Circuits and Systems Research Group has been involved in developing Stochastic Computing theory and hardware since 1963. At that time Chushin Afuso, who received his Doctorate from the University of Illinois in 1968, began investigating the noise characteristics of diodes in plasma breakdown. After a successful attempt to transmit audio signals with the average frequency of the noise as the information carrier, Afuso developed the Random Pulse Sequence (RPS) system of stochastic computing.
The basic concept of all such systems is that, if one input to a logical AND gate has a probability $P_1$ of being a "1" and the other input a probability $P_2$ of being a "1", then the output has a probability $P_1 P_2$ of being a "1" if the inputs are independent. This is a very simple way to multiply. The RPS system culminated in a discrete component circuit which, with proper logical combinations of control signals, could add, subtract, multiply or divide.

Since this system required a local random pulse generator for each such programmable computing element, a better system was developed by Afuso, called the Synchronous Random Pulse Sequence (SRPS) system. When a pulse occurred in the SRPS system, it was forced to occur in synchronism with the system clock. This permitted, with the use of a small buffer memory, all arithmetic operations to be performed using [...]. This was a big step forward, but the use of a small buffer introduced memory into the pulse sequences, which seriously limited the system when computing elements were cascaded to form more complex arithmetic expressions, i.e. buffering leads to unrandomizing.

In February of 1967 this author joined Afuso and worked with him to extend the SRPS system to a [...]-magnitude number representation and to develop a programmable arithmetic computing element. The new system developed, the Clocked Random Pulse Sequence (CRPS) system, utilized a mapping of numbers in the range [-1,1] onto the range [0,1] which permitted all arithmetic operations to be performed with only combinational circuitry, thus preserving probability distributions. The CRPS system of number representation and an application of it in a "large" system is described in the remaining sections of this paper.

"Bundle processing", as it is being called at the University of Illinois, maps a sequence of n successive pulses in time onto n wires in space. In other words, a number is now represented by the ratio of the number of wires with logical "1" to the total number of wires.
Dave Ring is in the process of building a 100 wire bundle processing system. He is confident that the failsoft nature of his system will be easily demonstrated by cutting a few wires and showing that this has almost no effect on the results.

Currently ideas about a "bundle processor" which would be failsafe are being discussed. This is accomplished by having any failure or any break in a wire, which would affect a computation, be propagated along to the result. The bundle which represents the result can have all wires indicating failures or breaks eliminated and the remaining ones used with absolute assurance that the result of the calculation so represented is correct.

A more ambitious system, called Transformatrix, is being undertaken by Orin Marvel, Larry Ryan and Yiu Wo. This system will be able to digitize on a grid of 32 x 32 photo resistors a slide or T.V. monitor picture and produce on-line at its output a linear or Fourier Transformation of that 32 x 32 input pattern. The many multiplications which are necessary are going to be done in parallel using stochastic computing techniques.

Besides thinking about using stochastic computing for calculating phases in a phased array radar, we have thought of applying it to solve many of the needs that have crossed our paths. Like many new ideas, stochastic computing will have to go through the process of being compared with other ways of satisfying a computing need before it finds its own niche in the engineer's bag of tools.

CHAPTER 2

THE BASIS AND SCOPE OF RASCEL

2.1 The Idea which Makes Stochastic Computing Interesting

Men have tried for many years to perform analog type calculations with digital circuitry. The digital computer is a common example, while the Digital Differential Analyzer, less well known, is also used.
Both of these computers require a rather large, complex amount of circuitry to add and multiply, whereas in the stochastic computer single quadrature n variable multiplication can be performed by a single n input AND gate and four quadrature n variable multiplication can be performed by approximately log(n) two input exclusive-or gates. Besides this large difference in circuit complexity, two other factors stand out. (1) Other digital systems use discrete approximations of the analog quantities, which show up in round-off errors etc. (2) On-line real time problem solving becomes very difficult for sequential rather than parallel structured computers. These problems do not arise in a stochastic computer because the variables are continuous functions of time, and the output of one computing element can be used directly as the input to other computing elements. Put more succinctly, stochastic computing techniques lend themselves nicely to parallel processors of analog functions of time.

This is accomplished by representing numbers in a stochastic computer by the probability of a Boolean variable being a logical "1". If we have two such variables $A$ and $B$ which are independent and have probabilities of being a logical "1" of $a$ and $b$ respectively, $C = A \cdot B$ has a probability of being a logical "1" of $c = ab$. This is one of the most fundamental properties of stochastic computing. The variables are continuous since probabilities are, and they can easily be made functions of time. In addition the output of one AND gate can be connected directly to the input of other AND gates.

The goal then is to find number representations (i.e. mappings of numbers onto the probability space [0,1]) and Boolean functions whose probabilities of being a logical "1" are useful mathematical expressions of the probabilities of the input variables in the number representation chosen.
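The product property $c = ab$ of the AND gate can be illustrated with a short simulation. The following is only a sketch: the pulse sequences are produced by a software pseudo-random generator rather than by the noise hardware described in this paper, and the names `crps` and `estimate`, the seed and the sample size are assumptions chosen for illustration.

```python
import random


def crps(p, n, rng):
    """Return n independent 0/1 samples, each equal to 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def estimate(bits):
    """Fraction of clock periods in which the signal was a logical 1."""
    return sum(bits) / len(bits)


rng = random.Random(1969)   # fixed seed so the run is repeatable
n = 200_000                 # number of simulated clock periods
a, b = 0.6, 0.3             # values carried by the two input sequences

A = crps(a, n, rng)
B = crps(b, n, rng)
C = [x & y for x, y in zip(A, B)]   # one AND gate, applied clock by clock

c_est = estimate(C)
print(f"estimated c = {c_est:.4f}, product ab = {a * b:.4f}")
```

The estimate approaches $ab$ only because the two sequences are generated independently; feeding the same sequence to both inputs would return $a$ rather than $a^2$.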
2.2 The Context and Scope of this Paper

A hybrid computing system, Figure 2-1, might be thought of as a programmable analog computer with a digital computer programming the analog computer to perform a certain operation. This can be done by having the digital computer analyze the structure of the function to be solved. It then sends to the analog computer inputs, operations to be performed and interconnections among inputs and operations.

In this paper, the family of operations considered are addition, subtraction, multiplication, division and squaring. However, as will be shown later, the structure of the chosen programmable analog computer, shown in Figure 2-2, is independent of the family of operations selected. The inputs, digital or analog, to the programmable analog computer are converted to stochastic computer variables, operated on in the tree structure of programmable analog elements and reconverted at the output.

[Figure 2-1. A model of a computing system. Block labels: expressions, variables, compiler, operators, signal routing, programmable elements, results.]

[Figure 2-2. A programmable array of CONVERT and COMPUTE blocks. The legend identifies variables (V), operators and results (R).]

Also discussed in the remainder of this paper are: Chapter 3 the electrical representation of numbers inside RASCEL, Chapter 4 the implementation of arithmetic operations with logic gates, Chapter 5 the design of a programmable computing element, Chapter 6 the design and use of the RASCEL system, Chapter 7 the decoding of RASCEL's number representation, Chapter 8 the performance of RASCEL, Chapter 9 the summary and conclusions and Appendix the notations, equations and circuits used in parts of RASCEL.
CHAPTER 3

A CLOCKED RANDOM PULSE SEQUENCE (CRPS)

3.1 The Effects of a CRPS on an Oscilloscope or Voltmeter

If an oscilloscope probe is attached to a wire which has a CRPS on it, the most prominent attribute of the signal appearing on that wire would be its random changing. This would soon be very obvious to anyone familiar with oscilloscopes because he would not be able to synchronize the oscilloscope. However he can ascertain two things of importance:

(1) Because the signal is changing, the output of the logic element from which it comes is working properly.

(2) If the intensity of the oscilloscope is set properly, the observer can obtain a rough estimate of the relative amount of time the signal is in the logical "1" state. Since the signal is randomly a logical "1" or "0", this is actually an estimate of the probability that the signal is a logical "1".

This is usually everything the operator needs to know, i.e. is the source of the signal working, and what is the probability of a logical "1" occurring? If he uses an averaging voltmeter instead of the oscilloscope, he would also get a better estimate of the probability.

Going back to the example of the 2-input AND gate, suppose that the voltmeter indicated that inputs $A$ and $B$ with output $C = AB$ had respective probabilities $a$, $b$ and $c$ of being a logical "1". Then, if $c$ is zero, either $a$ or $b$ is zero or the gate is not working properly; if $c$ is one, either $a$ and $b$ are both 1 or the gate is not working properly; if $c$ is not $ab$ and the gate is working properly, the inputs are not independent; and if $c = ab$, everything is working properly. A simple tool then, the voltmeter, is all that is necessary to check a stochastic computer multiplier.

It is important to note that it is not necessary to worry about frequency response, offset voltages, tolerances, regulation, rejection ratios, thermal drift, ground noise and a host of other problems present in conventional analog computers.
Pulse shape is important but this too can be eliminated by using a sampling counter instead of a voltmeter. A sampling counter also allows one to adjust the sampling interval and thus control the accuracy of the estimate.

3.2 The Definition and Characterization of a CRPS

Intuitively, time is quantized by a clock and during each unit of time (i.e. clock period) a voltage occurs which randomly takes on the logical values "0" or "1". The intrinsic value of the CRPS is the probability of a logical "1" occurring during a clock period.

Definition: A CRPS-X with value $x$ is a sequence $X_1, X_2, \ldots, X_i, \ldots$ of independent identically distributed random variables with mean $x$ where $X_i$ can take on the values 0 or 1.

The notation used for the expectation operator will be $\mathrm{EXP}\{X_i\}$ and according to the definition above $\mathrm{EXP}\{X_i\} = x$ for all $i$. In practice the random variable

$$\mathrm{EST}_n\{X\} = \frac{1}{n}\sum_{i=1}^{n} X_i \tag{3-1}$$

is calculated to get an estimate of the value of CRPS-X and is called the estimated value. By the weak law of large numbers, the limit of $\mathrm{EST}_n\{X\}$ as $n$ approaches infinity is the value of CRPS-X, or $x$. This can also be shown by using the fact that the expectation operator is linear because

$$\mathrm{EXP}\{\mathrm{EST}_n\{X\}\} = \mathrm{EXP}\Big\{\frac{1}{n}\sum_{i=1}^{n} X_i\Big\} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{EXP}\{X_i\} = x \tag{3-2}$$

Since the expectation of the estimated value of a CRPS is the value of that CRPS, no confusion should arise by using the notation that the expectation of a CRPS is the value of that CRPS, or for the above case

$$\mathrm{EXP}\{X\} = x \tag{3-3}$$

3.3 The Generation of a CRPS

3.3.1 The Conversion of an Analog Voltage to a CRPS

The basic idea here is to compare a noise signal to an analog detection level as shown in Figure 3-1. A detection circuit makes its output a logical "1" whenever the noise is larger in amplitude than the detection level and makes its output "0" otherwise. Clearly, one can control, with the analog input detection level, the probability of the output of the detection circuit being a logical "1".
This output signal is applied to the input of a sampling flip-flop which transfers, at each "0" to "1" transition of a clock, its input to its output and stores that value until the next clock "0" to "1" transition. Since the definition of a CRPS requires that successive samples be independent, the maximum clock frequency is limited by the bandwidth of the noise and detection circuit.

[Figure 3-1. The idea of analog to CRPS conversion. Blocks: noise source, level detection, sampling. Inputs: analog input (detection level) and clock. Output: CRPS.]

In practice, three things are wrong with directly using this approach in a system: (1) a positive increment of the detection level results in a negative increment in the probability that the output will be a logical "1", (2) for a fixed input detection level the output drifts with temperature and supply voltages and (3) the transfer function from analog input detection level to CRPS value is non-linear. The solution to these problems, shown in Figure 3-2, is to use a feedback integrator which estimates the probability of a logical "1" occurring at the output, compares that estimate with the analog input voltage and generates an error signal which is used as the detection level.

3.3.2 The Conversion of a Binary Number to a CRPS

The circuit to perform this operation has three basic components: a register to store the input binary number, a random binary number generator and a circuit which compares the relative magnitudes of the two binary numbers. Their interconnection is shown in Figure 3-3. The output of the comparison circuit is a logical "1" whenever the input binary number is larger than the random binary number. The probability that the output of the comparison circuit will be a "1" is the probability that the input number is larger than the random number. The distribution of the output is a function of the distribution of the random binary number.
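The register, random number and comparator arrangement just described can be sketched in software. This is an illustration only, not the card circuit of Figure 3-3: the function name `binary_to_crps`, the 8-bit word length and the seed are assumptions, and the random binary number is taken to be uniformly distributed (every bit an independent fair coin).

```python
import random


def binary_to_crps(r_int, n_bits, n_clocks, rng):
    """Compare a fixed n-bit input with a fresh random n-bit number each clock.

    The output bit is 1 whenever the input number is larger than the
    random number, as in the comparison circuit described above.
    """
    out = []
    for _ in range(n_clocks):
        x = rng.getrandbits(n_bits)       # uniform integer in [0, 2**n_bits - 1]
        out.append(1 if r_int > x else 0)
    return out


rng = random.Random(332)
n_bits = 8
r_int = 77                                # stored input, read as the fraction 77/256
bits = binary_to_crps(r_int, n_bits, 100_000, rng)

est = sum(bits) / len(bits)
print(f"estimated value = {est:.4f}, input fraction = {r_int / 2**n_bits:.4f}")
```

With a uniform random number there are exactly `r_int` values smaller than the input, so the probability of a "1" is `r_int / 2**n_bits`. A non-uniform random number would change this transfer function.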
If each bit of the random binary number has a probability of one-half of being a logical "1" and all bits are independent, then the random binary number has a flat distribution. In this case the output of the comparison circuit is a CRPS with mean equal to the input binary number. More will be said about this circuit later.

[Figure 3-2. The dual CRPS generator card circuit.]

[Figure 3-3. The binary to CRPS conversion. An n-bit binary number $r$, with $0 \le r \le 1 - 2^{-n}$, and the output $x$ of an n-bit random binary number generator feed a binary number size comparator whose output is CRPS-R, with $R = 1$ if $r > x$.]

3.4 The Estimate of a CRPS Value

3.4.1 The Conversion of a CRPS to an Analog Voltage

In section 3.2 it was mentioned that the usual way to estimate the probability that $X_i$ equals a logical "1" is to calculate

$$\mathrm{EST}_n\{X\} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

which is just the average of the $X_i$ values. This can be obtained by a long time constant RC circuit for which the time constant and the clock frequency determine the value of $n$. Instead of an RC circuit, an RL circuit can be used as in the voltmeter mentioned earlier.

In his thesis, "Analog Computations with Random Pulse Sequences," [3] Chushin Afuso derives an expression for the output voltage of an RC circuit with a CRPS as an input. His results and our experience indicate:

(1) The error in the estimated value decreases as $1/\sqrt{n}$ and the fluctuations of the voltage decrease as $1/n$, and

(2) The pulse shape going into the RC circuit is the prime source of error.

3.4.2 The Conversion of a CRPS to a Decimal Number

Since all CRPSs have a time reference, namely the clock, any CRPS can be sampled by that clock.
If a decimal counter is incremented only when the sample is "1", and if every $n$ counts the contents of the decimal counter is transferred to a decimal display and the counter is reset to a count of zero, then the contents of the decimal display, with the decimal point placed according to $n$, is the current value of the estimate. In practice a frequency counter with built in time base can be used with good results. Chapter 7 will explore in depth the possibilities which arise when a binary counter is used.

3.5 The Accuracy of an Estimate of a CRPS Value

3.5.1 The Variance of an Estimated Value of a CRPS

To be sure that no confusion arises, the following distinction is made again: the value of a CRPS is the probability that a "1" will occur during an arbitrary clock period and the estimated value of a CRPS is

$$\mathrm{EST}_n\{X\} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

where $X_i$ is the actual value of the CRPS during the $i$-th clock period.

Since we have no way of directly measuring the value of a CRPS, the usual practice is to estimate it by calculating $\mathrm{EST}_n\{X\}$. However, we are dealing with a random process and therefore it is useful to know the probability density distribution of the estimated value. From this it is useful to get an idea of how good an estimate $\mathrm{EST}_n\{X\}$ is of $x$. The normal parameter for measuring this is the variance of the estimated value.

The distribution of the $X_i$ is binomial because a CRPS consists of a sequence of independent trials each with probability $x$ of being a logical "1", and by the central limit theorem [15] the distribution of $\mathrm{EST}_n\{X\}$ is shown to be Gaussian in the limit. However, to calculate the variance this fact is not needed.

The variance of the random variables $X_i$ about the mean is given by

$$\mathrm{VAR}\{X_i\} = \mathrm{EXP}\{(X_i - \mathrm{EXP}\{X_i\})^2\} = \mathrm{EXP}\{X_i^2 - 2X_i\,\mathrm{EXP}\{X_i\} + \mathrm{EXP}^2\{X_i\}\} = \mathrm{EXP}\{X_i^2\} - \mathrm{EXP}^2\{X_i\}$$

Since $X_i$ is either 0 or 1, $X_i^2 = X_i$ and

$$\mathrm{VAR}\{X_i\} = \mathrm{EXP}\{X_i\} - \mathrm{EXP}^2\{X_i\} = x - x^2 = x(1 - x) \tag{3-4}$$

The variance of the estimated value about the expected value $x$ is given by

$$\mathrm{VAR}\{\mathrm{EST}_n\{X\}\} = \mathrm{EXP}\{(\mathrm{EST}_n\{X\} - \mathrm{EXP}\{X\})^2\} = \mathrm{EXP}\Big\{\Big(\frac{1}{n}\sum_{i=1}^{n} X_i - x\Big)^2\Big\} = \mathrm{EXP}\Big\{\Big(\frac{1}{n}\sum_{i=1}^{n} (X_i - x)\Big)^2\Big\} = \frac{1}{n^2}\,\mathrm{EXP}\Big\{\sum_{i=1}^{n}\sum_{j=1}^{n} (X_i - x)(X_j - x)\Big\}$$

This double summation can be separated into two parts, one where $i = j$ and one where $i \ne j$:

$$\mathrm{VAR}\{\mathrm{EST}_n\{X\}\} = \frac{1}{n^2}\,\mathrm{EXP}\Big\{\sum_{i=1}^{n} (X_i - x)^2 + \sum_{i=1}^{n}\sum_{j \ne i} (X_i - x)(X_j - x)\Big\} = \frac{1}{n^2}\Big(\sum_{i=1}^{n} \mathrm{EXP}\{(X_i - x)^2\} + \sum_{i=1}^{n}\sum_{j \ne i} \mathrm{EXP}\{(X_i - x)(X_j - x)\}\Big)$$

Because $\mathrm{EXP}\{X_i X_j\} = \mathrm{EXP}\{X_i\}\,\mathrm{EXP}\{X_j\} = x^2$ for $i \ne j$, $\mathrm{EXP}\{(X_i - x)(X_j - x)\} = 0$ for $i \ne j$, which simplifies $\mathrm{VAR}\{\mathrm{EST}_n\{X\}\}$ to

$$\mathrm{VAR}\{\mathrm{EST}_n\{X\}\} = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{EXP}\{(X_i - x)^2\} = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{VAR}\{X_i\}$$

$$\mathrm{VAR}\{\mathrm{EST}_n\{X\}\} = \mathrm{VAR}\{X_i\}/n \tag{3-5}$$

This means that the variance of an n-term average, the estimated value, is one n-th as large as that of a 1-term average $X_i$, i.e. the variance of the estimated value of a CRPS decreases as $1/n$.

3.5.2 The Precision and Accuracy of the Estimated Mean of a CRPS

Imagine calculating the estimated value using a frequency counter. Successive calculations of the estimated value will appear and they will in general all be slightly different because of the variance of the estimated value. It will also be apparent that only the first $d$ most significant digits are relatively stable while the rest appear to be changing randomly.

Define the estimated mean $m$ to be determined by the first $d + 1$ digits of a frequency counter rounded off to $d$ digits, where the frequency counter is calculating the estimated value. If the radix is $r$, then the precision PRE of $m$ is

$$\mathrm{PRE} = r^{-d} \tag{3-6}$$

For example, if $r = 10$ and $d = 3$, then $m$ can only be resolved to a precision of 0.001.
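Equation (3-5) can be checked numerically by repeating the n-term average many times and measuring the spread of the results. The sketch below does this with a software pseudo-random generator; the function name `estimated_value` and the particular values of `x`, `n` and `trials` are assumptions chosen only to keep the run short.

```python
import random
import statistics


def estimated_value(x, n, rng):
    """One n-term average of a simulated CRPS with value x."""
    return sum(1 for _ in range(n) if rng.random() < x) / n


rng = random.Random(35)
x = 0.3          # value of the simulated CRPS
n = 100          # clock periods per estimate
trials = 20_000  # number of independent estimates

estimates = [estimated_value(x, n, rng) for _ in range(trials)]

observed_var = statistics.pvariance(estimates)
predicted_var = x * (1 - x) / n          # equations (3-4) and (3-5)

print(f"observed variance  = {observed_var:.6f}")
print(f"predicted variance = {predicted_var:.6f}")
```

Raising `n` by a factor of four should roughly quarter the observed variance, which is the $1/n$ behaviour derived above.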
In this paper, the answer to the question, "What is the accuracy of the estimated mean?", is P% where P is 100 times the probability that the error in the estimated value is less than $r^{-d}/2$ in magnitude, i.e.,

$$P = 100\,\mathrm{PROB}\{|\mathrm{EST}_n\{X\} - x| < r^{-d}/2\} \tag{3-7}$$

Because the probability density distribution for the estimated value is Gaussian for large $n$, it can be expressed as

$$f(\mathrm{EST}_n\{X\}) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(\mathrm{EST}_n\{X\} - x)^2/(2\sigma^2)}$$

Rearranging, equation (3-7) becomes

$$P = 100\,\mathrm{PROB}\Big\{x - \frac{r^{-d}}{2} < \mathrm{EST}_n\{X\} < x + \frac{r^{-d}}{2}\Big\}$$

Because the probability that the estimated value lies in some interval is given by the integral of $f(\mathrm{EST}_n\{X\})$ over that interval, P becomes

$$P = 100 \int_{x - r^{-d}/2}^{x + r^{-d}/2} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(\mathrm{EST}_n\{X\} - x)^2/(2\sigma^2)}\, d\,\mathrm{EST}_n\{X\}$$

After making the change of variable

$$z = \frac{\mathrm{EST}_n\{X\} - x}{\sqrt{2}\,\sigma}$$

and using symmetry, P becomes

$$P = \frac{200}{\sqrt{\pi}} \int_{0}^{r^{-d}/(2\sqrt{2}\,\sigma)} e^{-z^2}\, dz$$

One definition of the error function is

$$\mathrm{ERF}\{w\} = \frac{2}{\sqrt{\pi}} \int_{0}^{w} e^{-z^2}\, dz$$

Using this definition, P becomes

$$P = 100\,\mathrm{ERF}\{r^{-d}/(2\sqrt{2}\,\sigma)\}$$

Because the error function cannot be evaluated in closed form, it is usually given, as below, in tabular form in terms of $K$ and $\mathrm{ERF}\{K/\sqrt{2}\}$, where $K = r^{-d}/(2\sigma)$.

K      ERF{K/√2}
0.5    0.383
1.0    0.683
2.0    0.955
3.0    0.997
4.0    0.99994

As an example, if $K = 4.0$, then P is 99.994%. If $K$ is chosen to be 3.0, this means that

$$\frac{r^{-d}}{2\sigma} \ge 3 \tag{3-8}$$

will assure a system which is at least 99.7% accurate. From section 3.5.1, the variance is equal to $x(1 - x)/n$, which has a maximum at $x = 1/2$. Taking the maximum value of $\sigma^2$ only makes equation (3-8) harder to satisfy, so letting $\sigma^2 = 1/(4n)$ or $\sigma = 1/(2\sqrt{n})$ gives

$$\sqrt{n} \ge 3r^{d} \quad\text{or}\quad n \ge 9r^{2d} \tag{3-9}$$

If $r = 10$, the minimum acceptable value of $n$ is approximately

$$n \approx 10^{2d+1}$$

If $r = 2$, the minimum acceptable value of $n$ is approximately

$$n \approx 2^{2d+3}$$

For example, for a precision of two decimal digits and an accuracy of at least 99%, $n$ must be at least $10^5$.
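The sample-size rule of equation (3-9) and the accuracy expression above can be evaluated directly. The following sketch uses the standard library error function; the helper names `min_samples` and `accuracy_percent` are hypothetical, and the worst case $x = 1/2$ is assumed, as in the derivation.

```python
import math


def min_samples(radix, digits, k=3.0):
    """Smallest n with sqrt(n) >= k * r**d, i.e. n >= k**2 * r**(2d)."""
    return math.ceil(k ** 2 * radix ** (2 * digits))


def accuracy_percent(n, radix, digits, x=0.5):
    """P = 100 ERF{ r**-d / (2 sqrt(2) sigma) } with sigma**2 = x(1 - x)/n."""
    sigma = math.sqrt(x * (1 - x) / n)
    return 100.0 * math.erf(radix ** (-digits) / (2.0 * math.sqrt(2.0) * sigma))


n_min = min_samples(10, 2)                 # two decimal digits, K = 3
acc_min = accuracy_percent(n_min, 10, 2)   # accuracy at the bound of (3-9)
acc_1e5 = accuracy_percent(10 ** 5, 10, 2) # accuracy at the rounded value 10**5

print(f"n >= {n_min}")
print(f"accuracy at n = {n_min}: {acc_min:.2f}%")
print(f"accuracy at n = 100000: {acc_1e5:.2f}%")
```

For $r = 10$ and $d = 2$ the bound is $9 \times 10^4$, which the text rounds up to $10^5$; rounding up can only improve the accuracy.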
If a one megahertz clock is being used, then the minimum sample time to obtain a precision of two decimal digits with an accuracy of 99% is a tenth of one second.

It should be emphasized that this precision and accuracy arises only because we are trying to estimate a probability. Inside a stochastic computer the accuracy of the computation performed by any computing element is independent of that element (provided all the gates are working) and depends only on how well the probabilities of the inputs and output are estimated.

CHAPTER 4

THE ARITHMETIC OPERATIONS ON CRPSs

4.1 The Effects of Logic Gates on CRPS Inputs

In order for the reader to more fully understand the power of stochastic computing, this section will analyze in detail the results of applying CRPSs to the inputs of simple logic elements.

Because of the basic assumption of a CRPS that $X_i$ and $X_j$ are independent for $i \ne j$, the effects of logic gates need only be analyzed during one clock period. Consequently, unless confusion may arise, the subscript of $X$ will be dropped.

4.1.1 NOT Element

Assume that CRPS-X is the input and that $\mathrm{EXP}\{X\} = x$. The output of the NOT element, $\bar{X}$, has a value given by

$$\mathrm{EXP}\{\bar{X}\} = \mathrm{EXP}\{1 - X\} = 1 - \mathrm{EXP}\{X\} = 1 - x \tag{4-1}$$

This is as expected since the probability that $\bar{X}$ has a logical "1" is the probability that $X$ will have a logical "0".

4.1.2 AND Gate

If CRPS-A and CRPS-B with $\mathrm{EXP}\{A\} = a$ and $\mathrm{EXP}\{B\} = b$ are the inputs to an AND gate, then as we have seen before

$$\mathrm{EXP}\{AB\} = \mathrm{EXP}\{A\}\,\mathrm{EXP}\{B\} = ab \tag{4-2}$$

if $A$ and $B$ are independent. It should be noted that this result generalizes directly to $n$ independent CRPSs.

4.1.3 OR Gate with Disjoint Inputs

If $A$ and $B$ are CRPSs with values $a$ and $b$ respectively and if $AB = 0$, then

$$\mathrm{EXP}\{A \vee_{\mathrm{disj}} B\} = \mathrm{EXP}\{\overline{\bar{A}\bar{B}}\} = \mathrm{EXP}\{1 - \bar{A}\bar{B}\} = 1 - \mathrm{EXP}\{\bar{A}\bar{B}\} = 1 - \mathrm{EXP}\{(1 - A)(1 - B)\} = 1 - \mathrm{EXP}\{1 - A - B + AB\} = 1 - \mathrm{EXP}\{1 - A - B\}$$

$$\mathrm{EXP}\{A \vee_{\mathrm{disj}} B\} = \mathrm{EXP}\{A\} + \mathrm{EXP}\{B\} = a + b \tag{4-3}$$

Note that by using an inductive argument, this result can be generalized to $n$ disjoint CRPSs.
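Equations (4-1) and (4-3) can be illustrated with a short simulation. This is a sketch under stated assumptions: the disjoint pair is built in software by carving one uniform random draw into non-overlapping intervals, which is merely a convenient way to guarantee $AB = 0$ and is not a circuit described in this paper. The values of `a`, `b`, the seed and the sample size are arbitrary.

```python
import random

rng = random.Random(41)
n = 200_000          # simulated clock periods
a, b = 0.25, 0.40    # values of the two sequences, with a + b <= 1

A, B = [], []
for _ in range(n):
    u = rng.random()
    # Non-overlapping intervals of u make A and B disjoint: never both 1.
    A.append(1 if u < a else 0)
    B.append(1 if a <= u < a + b else 0)

not_a = [1 - x for x in A]                 # NOT element, equation (4-1)
a_or_b = [x | y for x, y in zip(A, B)]     # OR gate with disjoint inputs, (4-3)

est_not = sum(not_a) / n
est_or = sum(a_or_b) / n

print(f"NOT: {est_not:.4f}  (1 - a = {1 - a:.4f})")
print(f"OR : {est_or:.4f}  (a + b = {a + b:.4f})")
```

If the two inputs were independent instead of disjoint, the OR output would settle near $a + b - ab$, the case treated in section 4.1.4.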
4.1.4 An Arbitrary Boolean Function of CRPSs

Let a Boolean function F (expressed by the OR of disjoint minterms) be written as

$F = \bigvee_{i \in \theta} M_i$

where $\theta$ is the set of minterm indices which compose F. Also let $\mathrm{EXP}\{M_i\} = m_i$ for all $i \in \theta$.

Theorem 4-1: $\mathrm{EXP}\{F\} = \sum_{i \in \theta} m_i = f$ and $\mathrm{VAR}\{F\} = f(1 - f)$.

Proof: $\mathrm{EXP}\{F\}$ equals $\mathrm{EXP}\{\bigvee_{i \in \theta} M_i\}$ which, because the $M_i$ are disjoint, is equal to $\sum_{i \in \theta} \mathrm{EXP}\{M_i\}$ by section 4.1.3, and by the assumption $\mathrm{EXP}\{M_i\} = m_i$, $\mathrm{EXP}\{F\} = \sum_{i \in \theta} m_i = f$. The proof that $\mathrm{VAR}\{F\} = f(1 - f)$ follows directly from equation 3-4, $\mathrm{VAR}\{X\} = \mathrm{EXP}\{X\} - \mathrm{EXP}^2\{X\}$, which is completely general and applies to any CRPS, in particular to F.

To illustrate the implications of Theorem 4-1, the EXP and VAR functions will be calculated for the function $F = A \vee B$. Expressed in minterm form, $F = A\overline{B} \vee \overline{A}B \vee AB$. Then

$\mathrm{EXP}\{A\overline{B}\} = \mathrm{EXP}\{A - AB\} = \mathrm{EXP}\{A\} - \mathrm{EXP}\{AB\}$,
$\mathrm{EXP}\{\overline{A}B\} = \mathrm{EXP}\{B\} - \mathrm{EXP}\{AB\}$, and
$\mathrm{EXP}\{F\} = \mathrm{EXP}\{A\} + \mathrm{EXP}\{B\} - \mathrm{EXP}\{AB\}$.

If A and B are independent, $\mathrm{EXP}\{F\} = a + b - ab$ and $\mathrm{VAR}\{F\} = (a + b - ab)(1 - a - b + ab)$. If some of the inputs to F are not independent, this is taken care of when the EXP functions of the individual minterms are calculated. For example, if in $F = A \vee B$, AB = 0, then $\mathrm{EXP}\{F\} = \mathrm{EXP}\{A\} + \mathrm{EXP}\{B\}$ because $\mathrm{EXP}\{AB\} = 0$, and $f = a + b$ as in 4.1.3.

4.1.5 Linear Combination

If $M_i$, for $i = 0, \ldots, n-1$, are the minterms of a modulo n counter and if $F = \bigvee_{i=0}^{n-1} M_i A_i X_i$, then because the $M_i$ are all disjoint and by Theorem 4-1, $\mathrm{EXP}\{F\} = \sum_{i=0}^{n-1} \mathrm{EXP}\{M_i A_i X_i\}$. If the $M_i$, $A_i$ and $X_i$ are independent for $i = 0, 1, \ldots, n-1$, then

$\mathrm{EXP}\{F\} = \frac{1}{n} \sum_{i=0}^{n-1} a_i x_i$   (4-4)

This is a weighted linear summation or linear combination of the $X_i$ values. Of particular importance is the special case where $a_i = 1$ for all i and n = 2, which gives $\mathrm{EXP}\{F\} = (x_1 + x_2)/2$ for $F = M X_1 \vee \overline{M} X_2$. To avoid the introduction of any periodic or non-random effects, which would be the case for the minterms of a modulo n counter, let H be a CRPS with $\mathrm{EXP}\{H\} = 1/2$.
Then form $F = H X_1 \vee \overline{H} X_2$, which has

$\mathrm{EXP}\{F\} = (x_1 + x_2)/2$   (4-5)

In this way, by using a CRPS with value 1/2, it is possible to form a scaled sum of two other CRPSs.

4.1.6 Multiplexing

In most analog systems, the need arises to be able to switch one of many analog signals onto a common bus. In more conventional analog computers this operation requires digitally controlled analog gates, but in a stochastic computer only simple logic gates are needed. If we want f to be a when K = 0 and f to be b when K = 1, then $F = \overline{K}A \vee KB$.

4.2 The Effects of Delays and Logic Gates on CRPSs

One of the most useful and powerful techniques available in stochastic computing is the ability to delay a CRPS one clock period and use this delayed version along with the original in some calculation. Because of the fundamental assumption that different clock periods of a CRPS are independent, the original and delayed version of a CRPS are independent.

4.2.1 Delay and AND to Calculate the Square

The notation that $\Delta^i A$ is the i-fold delay of A, and that $\Delta^1 A$ and $\Delta A$ are the same, is very useful. Since $\Delta A$ and A have the same value and are independent, an AND gate can be used to multiply them. $F = A\,\Delta A$ has value

$\mathrm{EXP}\{F\} = \mathrm{EXP}\{A\,\Delta A\} = \mathrm{EXP}\{A\}\,\mathrm{EXP}\{\Delta A\} = \mathrm{EXP}^2\{A\} = a^2$   (4-6)

This result is interesting because it implies the ability to square; however, it is important because it implies the ability to check any CRPS to see if it satisfies the assumption that different clock periods are independent.

4.2.2 Delay and AND to Calculate the Variance

From equation 3-4 the variance is given by $x(1 - x)$ and from equation 4-1 the value of the NOT with x as the input value is $1 - x$. In order to multiply x by $1 - x$ an independent CRPS is needed, but such a sequence can be obtained by delaying.
If X has value x and $V = X\,\overline{\Delta X}$, then the value of V is

$\mathrm{EXP}\{V\} = \mathrm{EXP}\{X\,\overline{\Delta X}\} = \mathrm{EXP}\{X\}\,\mathrm{EXP}\{\overline{\Delta X}\} = x(1 - x)$   (4-7)

Again, as in the previous section, if the value of V is not equal to $x(1 - x)$, then the sequence X is not a true CRPS, because in that case X and $\Delta X$ are not independent.

A very interesting aspect of the equation $V = X\,\overline{\Delta X}$ is that V is a 1 each time X makes a 0 to 1 transition. This implies that the variance of a CRPS is determined by the number of changes in that CRPS, which intuitively makes a lot of sense.

4.3 A Mapped Number Representation

In all computers there is some absolute range of machine values onto which the variable values of the user must be mapped. In a stochastic computer this range of machine values is identical to the range of probabilities, i.e. [0,1]. In order to represent negative numbers or numbers larger than 1 in magnitude, it is necessary to map them onto the range [0,1]. This section deals with mapping negative numbers in the range [-1,1] onto the range [0,1], while section 5.2 deals with numbers larger in magnitude than 1.

4.3.1 Possible Linear Mappings

Once a mapping is chosen, a reasonable question is: "If I have a CRPS with variable value x, how do I obtain a CRPS with variable value -x?". In this paper only those mappings will be considered for which the NOT element acts as a negator of the variable value. The constraints on the mapping are that g maps the range of numbers [-1,1] onto the range [0,1] such that if X is a CRPS with variable value x then $\overline{X}$ has variable value -x. More concisely,

$g: [-1,1] \to [0,1]$ such that $g(-x) = 1 - g(+x)$   (4-8)

The following notation is used throughout the remainder of this paper. A CRPS-X will be written $X = [x, x']$ where the machine value is $x' = g(x)$ and the variable value is $x = g^{-1}(x')$. Because of the restrictions on g there are only four linear mappings possible.
(1) $x' = g_1(x) = |x|/2$ if $0 \le x < 1$; $\;1 - |x|/2$ if $-1 < x < 0$
(2) $x' = g_2(x) = |x|/2$ if $-1 < x < 0$; $\;1 - |x|/2$ if $0 < x \le 1$
(3) $x' = g_3(x) = 1/2 + x/2$
(4) $x' = g_4(x) = 1/2 - x/2$

The next reasonable restriction is that the circuits to multiply and add should be as simple as possible. Several attempts to find such circuits for $g_1$ and $g_2$ proved unsuccessful; however, circuits for $g_3$ and $g_4$ are easily obtained. Since others, among them Gaines [13], have discussed $g_3$, this paper will concentrate on $g_4$ and from this point on will call it g, i.e.

$x' = g(x) = 1/2 - x/2$ and $x = g^{-1}(x') = 1 - 2x'$   (4-9)

Also define the function which yields the variable value of a CRPS $X = [x, x']$ as

$W\{X\} = g^{-1}(\mathrm{EXP}\{X\}) = g^{-1}(x') = x$   (4-10)

4.4 The Effects of Logic Gates on Mapped CRPS Inputs

This section parallels section 4.1 in that each subsection will analyze the effects of a logic gate or circuit on mapped CRPSs.

4.4.1 NOT Element

If $X = [x, x']$ is a CRPS connected to the input of a NOT element, then $W\{X\} = g^{-1}(x') = x$ and

$W\{\overline{X}\} = g^{-1}(\mathrm{EXP}\{\overline{X}\}) = g^{-1}(1 - x') = 1 - 2(1 - x') = 2x' - 1$

Substituting for x' gives

$W\{\overline{X}\} = 2(1 - x)/2 - 1 = -x$
$W\{\overline{X}\} = -W\{X\}$   (4-11)

4.4.2 AND Gate

If $A = [a, a']$ and $B = [b, b']$ are the inputs to an AND gate then, if A and B are independent,

$W\{AB\} = g^{-1}(\mathrm{EXP}\{AB\}) = g^{-1}(a'b') = 1 - 2a'b' = 1 - 2(1 - a)(1 - b)/4$
$W\{AB\} = 1 - (1 - a)(1 - b)/2$   (4-12)

4.4.3 OR Gate

If $A = [a, a']$ and $B = [b, b']$ are the inputs to an OR gate, then if A and B are independent

$W\{A \vee B\} = g^{-1}(\mathrm{EXP}\{A \vee B\}) = g^{-1}(a' + b' - a'b') = g^{-1}(1 - (1 - a')(1 - b'))$
$= 1 - 2(1 - (1 - a')(1 - b')) = 2(1 - a')(1 - b') - 1$
$= 2(1 - (1 - a)/2)(1 - (1 - b)/2) - 1$
$W\{A \vee B\} = (1 + a)(1 + b)/2 - 1$   (4-13)

4.4.4 OR of Disjoint Inputs

With A and B as disjoint CRPS inputs,

$W\{A \vee_{\mathrm{disj}} B\} = g^{-1}(\mathrm{EXP}\{A \vee_{\mathrm{disj}} B\}) = g^{-1}(a' + b') = 1 - 2(a' + b') = 1 - 2(1 - a + 1 - b)/2$
$W\{A \vee_{\mathrm{disj}} B\} = a + b - 1 = W\{A\} + W\{B\} - 1$   (4-14)

4.4.5 AND-OR of Disjoint Inputs

If $A = [a, a']$, $B = [b, b']$ and $C = [c, c']$ are independent CRPSs,
then $W\{\overline{A}B \vee AC\} = W\{\overline{A}B\} + W\{AC\} - 1$ by equation (4-14), because $\overline{A}B$ and $AC$ are disjoint. By equations (4-11) and (4-12), $W\{\overline{A}B\} = 1 - (1 + a)(1 - b)/2$ and $W\{AC\} = 1 - (1 - a)(1 - c)/2$, so

$W\{\overline{A}B \vee AC\} = 1 - (1 + a)(1 - b)/2 - (1 - a)(1 - c)/2$   (4-15)

This result in itself is not useful until special cases are taken. For example, if $C = \overline{B}$, then

$W\{\overline{A}B \vee A\overline{B}\} = W\{A \oplus B\} = 1 - (1 + a)(1 - b)/2 - (1 - a)(1 + b)/2$
$W\{A \oplus B\} = ab = W\{A\} \times W\{B\}$   (4-16)

which means that under the mapping g, the exclusive-or operation has the effect of multiplying the variable values of the inputs.

Another special case is when a = 0. Let Z be a CRPS with machine value 1/2; then $W\{Z\} = 0$ and equation (4-15) becomes

$W\{\overline{Z}B \vee ZC\} = 1 - (1 - b)/2 - (1 - c)/2$
$W\{\overline{Z}B \vee ZC\} = (b + c)/2 = (W\{B\} + W\{C\})/2$   (4-17)

This means that under the mapping g, the AND-OR circuit can also be used to give a scaled summation of the input variable values. It is worth noting that both equation (4-16) for multiplication and equation (4-17) for summation can be generalized to n input CRPSs. In the appendix there is a summary page which, for easy reference, lists all these results.

CHAPTER 5
A PROGRAMMABLE STOCHASTIC COMPUTING ELEMENT

5.1 The Role of Programmable Computing Elements

As mentioned earlier, in a general hybrid computer the analog section can be a regular array of programmable analog computing elements. These elements can be programmed by control signals to perform on their inputs any one of some fixed set of operations. The aspect of a stochastic computer which makes it so attractive is the ability to design such a computing element using only digital circuitry.

5.2 A CRPS Mapped Ratio Number Representation

Since division cannot be done easily, a number in the RASCEL system is represented by the ratio of a numerator and denominator. This also eliminates the problem of mapping a number greater than 1 in magnitude onto the range [0,1].
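The mapped-value identities that the ratio representation builds on, namely negation by NOT (4-11), multiplication by exclusive-or (4-16) and scaled summation by the AND-OR circuit (4-17), can be checked with a short simulation. This is a sketch under stated assumptions: each CRPS is modelled as independent Bernoulli bits, and the helper names `g`, `g_inv`, `crps_for` and `w` are introduced here for illustration only.

```python
import random

def g(x):
    """Mapping (4-9): variable value in [-1, 1] to machine value in [0, 1]."""
    return 0.5 - x / 2.0

def g_inv(x_prime):
    """Inverse mapping (4-9): machine value back to variable value."""
    return 1.0 - 2.0 * x_prime

def crps_for(x, n, rng):
    """Bits whose probability of a logical 1 is the machine value g(x)."""
    p = g(x)
    return [1 if rng.random() < p else 0 for _ in range(n)]

def w(bits):
    """Estimate of W{X}: decode the estimated machine value."""
    return g_inv(sum(bits) / len(bits))

rng = random.Random(332)
n = 200_000
a, b = -0.5, 0.8
A = crps_for(a, n, rng)
B = crps_for(b, n, rng)
Z = crps_for(0.0, n, rng)                  # machine value 1/2, W{Z} = 0

neg_a = [1 - x for x in A]                              # (4-11): -a
prod = [x ^ y for x, y in zip(A, B)]                    # (4-16): a * b
half_sum = [((1 - z) & x) | (z & y)                     # (4-17): (a + b) / 2
            for z, x, y in zip(Z, A, B)]

print("W{NOT A}         ~", w(neg_a))
print("W{A XOR B}       ~", w(prod))
print("W{~Z A  v  Z B}  ~", w(half_sum))
```

Because decoding doubles the estimation error of the machine value, the mapped estimates fluctuate about twice as much as the raw probabilities for the same number of samples.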
It means that each number in the RASCEL system has many representations, a fact which becomes more meaningful when the performance of the system is evaluated. However, it does have the advantage that the circuitry to perform any basic arithmetic operation is rather simple.

5.2.1 Notation for Numbers in the RASCEL System

If a is a rational number expressed by $a_n/a_d$, then the CRPSs which represent the numerator and denominator are $A_n = [a_n, a_n']$ and $A_d = [a_d, a_d']$. For example, if a = -5, then one of the possible values for $a_n/a_d$ is $a_n = -.5$, $a_d = .1$. In this case, $A_n = [-.5, .75]$ and $A_d = [.1, .45]$.

5.2.2 Expressions for Arithmetic Operations

If a and b are two rational numbers and $p = a \times b$, $q = a/b$, $s = a + b$, and $d = a - b$, then the object of this section is to give Boolean functions of the RASCEL representations of a and b whose variable values correspond to p, q, s and d. Let $a = a_n/a_d$, $b = b_n/b_d$, $p = p_n/p_d$, $q = q_n/q_d$, $s = s_n/s_d$ and $d = d_n/d_d$ be expressed in RASCEL by $A_n = [a_n, a_n']$, $A_d = [a_d, a_d']$, $B_n = [b_n, b_n']$, $B_d = [b_d, b_d']$, $P_n = [p_n, p_n']$, $P_d = [p_d, p_d']$, $Q_n = [q_n, q_n']$, $Q_d = [q_d, q_d']$, $S_n = [s_n, s_n']$, $S_d = [s_d, s_d']$, $D_n = [d_n, d_n']$, $D_d = [d_d, d_d']$.

To find Boolean expressions for $P_n$ and $P_d$, note that $p = p_n/p_d = (a_n/a_d)(b_n/b_d)$ implies that $p_n = a_n b_n$ and $p_d = a_d b_d$. Equation (4-16) indicates that $W\{A_n \oplus B_n\} = a_n b_n$ and $W\{A_d \oplus B_d\} = a_d b_d$, so

$P_n = A_n \oplus B_n$ and $P_d = A_d \oplus B_d$   (5-1)

Division can be accomplished by switching the roles of the numerator and denominator CRPSs of b to obtain

$Q_n = A_n \oplus B_d$ and $Q_d = A_d \oplus B_n$   (5-2)

The sum of a and b can be expressed as

$a + b = \frac{a_n}{a_d} + \frac{b_n}{b_d} = \frac{a_n b_d + a_d b_n}{a_d b_d} = \frac{q_n + q_d}{p_d}$

Utilizing equation (4-17) to form the sum $q_n + q_d$ gives

$W\{\overline{Z} Q_n \vee Z Q_d\} = (q_n + q_d)/2$

Since the numerator is scaled, it is also necessary to scale the denominator by multiplying $p_d$ by 1/2.
A CRPS which has a variable value of 1/2 is $Z_1 Z_2$, where $\mathrm{EXP}\{Z_1\} = \mathrm{EXP}\{Z_2\} = 1/2$, because $W\{Z_1 Z_2\} = g^{-1}(\mathrm{EXP}\{Z_1 Z_2\}) = g^{-1}(1/4) = 1/2$. Using the above results yields

$S_n = \overline{Z}_1 Q_n \vee Z_1 Q_d$ and $S_d = P_d \oplus (Z_1 Z_2)$   (5-3)

Subtraction is a simple modification of addition with $Q_d$ replaced by $\overline{Q}_d$, because $W\{\overline{Q}_d\} = -W\{Q_d\}$. The Boolean expressions for subtraction become

$D_n = \overline{Z}_1 Q_n \vee Z_1 \overline{Q}_d$ and $D_d = S_d$   (5-4)

The results of this section are also listed in the appendix.

5.3 The Equations and Circuits for a Programmable Stochastic Computing Element

The set of operations selected for the programmable stochastic computing element to perform is $a$, $b$, $a^2$, $b^2$, $a+b$, $a-b$, $a \times b$ and $a/b$. Since the equations for the result numerator and denominator CRPSs are very similar, the notation that $\epsilon$ = n or d will be used. If a and b are the inputs to, and r the result of, the computing element, then let $A_\epsilon$, $B_\epsilon$ and $R_\epsilon$ respectively be their representations in the RASCEL system. The switch names and table of corresponding operations are given below.

    F O I    r
    0 0 0    a
    0 0 1    b
    0 1 0    a^2
    0 1 1    b^2
    1 0 0    a + b
    1 0 1    a - b
    1 1 0    a x b
    1 1 1    a / b

If $\mathrm{EXP}\{U\} = a$ or $b$, $\mathrm{EXP}\{U^2\} = a^2$ or $b^2$, $\mathrm{EXP}\{+\} = a + b$ and $\mathrm{EXP}\{\times\} = ab$ or $a/b$, then the Boolean expression for the result can be written from the above table as:

$R_\epsilon = (\overline{F}\,\overline{O})\,U_\epsilon \vee (\overline{F}\,O)\,U^2_\epsilon \vee (F\,\overline{O})\,+_\epsilon \vee (F\,O)\,\times_\epsilon$   (5-5)

In other words, U is a function of one variable, $U^2$ is that function squared, + is the additive binary function and $\times$ is the multiplicative binary function. These functions can be expanded to give

$U_\epsilon = \overline{I} A_\epsilon \vee I B_\epsilon$ and $U^2_\epsilon = U_\epsilon \oplus \Delta U_\epsilon$   (5-6)

where $\Delta$, the delay operator, gives an independent CRPS with the same value as $U_\epsilon$.

From section 5.2.2 above the expressions for $P_n$, $P_d$, $Q_n$ and $Q_d$ can be rewritten as

$P_\epsilon = A_\epsilon \oplus B_\epsilon$ and $Q_\epsilon = A_\epsilon \oplus B_{\overline{\epsilon}}$   (5-7)

which can be combined to form $\times_\epsilon$ given by

$\times_\epsilon = \overline{I} P_\epsilon \vee I Q_\epsilon$   (5-8)

Also from section 5.2.2 the expressions for $S_n$ and $D_n$ can be combined to form $+_n$ given by

$+_n = Z_1 (I \oplus Q_d) \vee \overline{Z}_1 Q_n$   (5-9)

and the expression for $S_d = D_d$ can be rewritten as

$+_d = Z_1 Z_2 \vee \overline{Z}_1 P_d$   (5-10)

These equations and actual circuits to implement them are shown in Figure 5-1. In those circuits clocked flip-flops are used to buffer the $Z_1$ and $Z_2$ inputs and the $R_n$ and $R_d$ outputs to insure correct timing.

CHAPTER 6
THE RASCEL SYSTEM

6.1 The Using of the RASCEL System

A person who wishes to use RASCEL merely adopts the role of the compiler in Figure 1-1. His first step is to parse the arithmetic expression to be implemented and then program the array of computing elements (Figure 1-2) to calculate that expression (see next section). His second task is to set the inputs to their proper values. Now other parts of RASCEL, which are shown in block diagram form in Figure 6-1, become very important. The circuits which decode (i.e. perform the $g^{-1}$ operation), divide, and scale have outputs which drive a display whose format is of the form $\pm Ma \times 10^{\pm Ch}$ where $0 \le Ma \le 1$ and $0 \le Ch \le 3$. There is also an array of indicators, each of which has written next to it the input or computing element it corresponds to and on which printed circuit card that input or element can be found. This latter piece of information is important because the cable which goes to the decoding circuits has a connector on it, and each input or computing element printed circuit card has a set of test fingers which match that connector. These can be seen in Figure 6-2.

The procedure then is first, to look at the array of indicators to find the input to be set and where it is located; and second, to slide the decoder cable connector onto that printed circuit card's test fingers. One of the indicators will then light, telling the user which card he put the connector on. If he found the correct card, then the indicator of the input he wants to set will be lit.
Figure 6-1. A block diagram of the RASCEL system. (Blocks: sampling counter and display; clock control circuits and PRPS generator; array of indicators; cable; inputs and array of computing elements; decoding, dividing, smoothing and scaling circuits; power.)

Figure 6-2. The appearance of the RASCEL system.

To continue his task of setting the inputs, the user now looks at the display and adjusts the numerator and denominator potentiometers until he has the desired input number. He then locates the next input, slides the decoding cable connector onto that input printed circuit card, checks the indicator to see that he has the correct input card, sets the input number's value while watching the display and goes on to the next input.

After all inputs are set he can locate any computing element in the array, slide the decoding cable connector onto that card and check its output value on the display. In particular he would want to observe the value of the last element of the array, which always has as its output the value of the arithmetic expression which was programmed into the array of computing elements.

In actual applied use, the input values would be adjusted by some physical variable value and the programming of the array of computing elements would be handled by a computer.

6.2 The Programming of the Array of Computing Elements

The array of computing elements has the structure of a binary tree as shown in Figure 1-2. Every element has two inputs and one output. Consequently, to parse an arithmetic expression f, it must be written in the form of nested binary operations. For example, f = (a + b + c) must be written as f = ((a + b) + c) or f = (a + (b + c)). Once this is done, f has the form f = ...
where $\mathrm{Up} = B\overline{A}$ and $\mathrm{Down} = A\overline{B}$. If A and B are both 1 or both 0, the counter does not change. If A = 1 and B = 0, then the counter counts down, and if A = 0 and B = 1, the counter counts up. The counter is wired so that it does not count down once it reaches a zero count, and the difference is taken from the borrow output of the counter. If the carry output of the counter ever becomes a 1, then B is larger than A in magnitude. In this case, the inputs A and B can be switched and the sign of the difference changed. A block diagram of the dual-subtractor is drawn in Figure 7-2. In that drawing $C_\epsilon$ and $B_\epsilon$ stand for Up and Down and $C_n$ and $B_n$ for the Carry and Borrow signals.

The conversion from a sign-magnitude number representation to the mapped number representation used in RASCEL, $g(r) = 1/2 - r/2 = r'$ for CRPS-R = $[r, r']$, is also possible and is easier. If $r = (-1)^{Rs} \times |r|$, $\mathrm{EXP}\{R'\} = |r|$, $\mathrm{EXP}\{H\} = 1/2$, and $R = \overline{Rs} \oplus (H \vee \overline{H}R')$, then

$\mathrm{EXP}\{R\} = \mathrm{EXP}\{\overline{Rs} \oplus (H \vee \overline{H}R')\} = 1/2 + (-1)^{\overline{Rs}} \times |r|/2 = g(r)$
$\mathrm{EXP}\{R\} = r'$ and $W\{R\} = g^{-1}(\mathrm{EXP}\{R\}) = g^{-1}(r')$
$W\{R\} = (-1)^{Rs} \times |r| = r$

Figure 7-2. Block diagram of the dual-subtractor.

7.4 The Subbie

7.4.1 The Basic Subbie

This device is based on the Addie, which was invented by Brian Gaines [13]. Basically the binary number to CRPS converter of Figure 3-3 is modified so that the binary input number is the content of a counter. If a CRPS is fed into this counter, then the counter will integrate the input CRPS; consequently the output will be the integral of the value of the input CRPS.
If the up-down counter of a subtractor is used for the input binary number, then the output of the converter is the integral of the difference of the input CRPSs. Such a circuit is shown in Figure 7-3.

It is interesting to note that even though A and B are random variables with means and variances of their own, R is still a CRPS. This is because the counter integrates out some of the variance but more so because, even if the input binary number has a mean and variance, the sampling by an evenly distributed random process gives the output the mean of the input binary number and independence of distinct clock periods.

The time constant of the integrator is determined by how many stages the counter has and the clock frequency. In other words, to count from zero to $2^n$ at a frequency of $f_c$ takes $2^n/f_c$ seconds, or $T = 2^n/f_c$.

7.4.2 The Subbie with Feedback

A Subbie with the output R connected to the minus input B can be thought of as a difference integrator with its output fed back to the minus input. This circuit gives an exponential estimate of the plus input. To show this, let

$d(t) = a - r(t)$ and $r(t) = \frac{1}{T}\int_0^t d(t)\,dt$

Taking Laplace transforms yields

$d(s) = a/s - r(s)$ and $r(s) = d(s)/(sT)$

Combining and simplifying gives

$sT\,r(s) = a/s - r(s)$; $\quad s^2 T\,r(s) + s\,r(s) = a$

$r(s) = \frac{a}{s^2 T + s} = \frac{a/T}{s(s + 1/T)}$

Taking inverse Laplace transforms gives

$r(t) = a(1 - e^{-t/T})$

Figure 7-3. Block diagram of a Subbie.

7.4.3 Generating a Random Binary Number

In the RASCEL system, a feedback shift register [4, 16] generates pseudo random pulse sequences which are used as CRPSs having a probability of a logical one occurring of almost exactly one-half.
The circuit of the random binary generator card is shown in the appendix, Figure A-9. Based on Peterson's tables, a thirty-four bit shift register is implemented where the input equals the modulo two sum of the outputs of the seventh and the thirty-second, thirty-third and thirty-fourth stages of the shift register.

To obtain CRPSs with values of one-half which were not successive delayed versions of one another, the output of the first stage of the shift register was chosen as a reference. Other "independent" sequences were generated from that reference sequence on a randomizing card, Figure A-10, by delaying the reference and using the exclusive-or of the delayed reference and the reference as the "independent" sequence. In RASCEL this was done up to and including the seventy-fourth delayed version of the reference with no correlation showing up.

To check these results, a Subbie was built which had a ten stage up-down counter and a clock frequency of 100 Hz. A step function was put into the Subbie and the time for its output to reach its 1/e point was measured. Twenty samples of this time produced a mean of 10.425 seconds. The time to start and stop the watch was approximately 0.15 seconds. This gives a measured time constant of 10.275 seconds. The theoretical value with $f_c$ = 100 Hz and n = 10 is 10.24 seconds. Considering that the samples were rounded to the nearest hundredth of a second, these results agree very well.

The Subbie with feedback is analogous to an averaging voltage follower, but instead of voltages we are dealing with probabilities. The value of the output CRPS of a Subbie with direct feedback follows the value of the input CRPS with time constant $T = 2^n/f_c$.

7.5 The Divider

The feedback path of a Subbie can be modified by inserting some function in that path. For example, if the output is multiplied by a CRPS with value c and the result of that multiplication is fed to the minus input, then r = a/c.
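The claim r = a/c, and the voltage-follower behaviour of direct feedback, can be illustrated with a behavioural simulation of a Subbie. This is a sketch under assumptions, not the RASCEL circuit: the counter is modelled as an integer limited to 0 .. 2^n - 1, the random binary number is drawn from Python's generator rather than a shift register, the feedback is taken within the same clock period, and the function name `subbie` and its parameters are hypothetical.

```python
import random

def subbie(a, c, n_bits, clocks, rng):
    """Simulate a Subbie whose minus input is (C AND R).

    a, c   : machine values of the plus input A and the feedback multiplier C
    returns: estimate of EXP{R} over the second half of the run
    With c = 1.0 this is a Subbie with direct feedback.
    """
    full = 1 << n_bits
    count = 0
    ones = 0
    start = clocks // 2                      # skip the settling transient
    for t in range(clocks):
        # Comparator: R is 1 when the random binary number is below the count.
        r_bit = 1 if rng.randrange(full) < count else 0
        a_bit = 1 if rng.random() < a else 0
        c_bit = 1 if rng.random() < c else 0
        b_bit = c_bit & r_bit                # feedback path through an AND gate
        if a_bit and not b_bit and count < full - 1:
            count += 1                       # up count, no overflow
        elif b_bit and not a_bit and count > 0:
            count -= 1                       # down count, no underflow
        if t >= start:
            ones += r_bit
    return ones / (clocks - start)

rng = random.Random(75)
follower = subbie(a=0.3, c=1.0, n_bits=8, clocks=200_000, rng=rng)
quotient = subbie(a=0.3, c=0.6, n_bits=8, clocks=200_000, rng=rng)
print("direct feedback, expect about 0.3:", follower)
print("divider, expect about 0.3 / 0.6  :", quotient)
```

A short counter (8 stages here) settles quickly but lets the output wander more; lengthening it trades response time for smoothness, which is the compromise discussed for the scale factor generator.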
Because of the negative feedback the output of the Subbie tends to a value which will make the number of up counts equal the number of down counts. When that happens the mean value of the contents of the counter is constant. If the B input is CR where $\mathrm{EXP}\{C\} = c$, $\mathrm{EXP}\{R\} = r$ and $\mathrm{EXP}\{A\} = a$, then $\mathrm{EXP}\{\mathrm{up}\} = a - acr$ and $\mathrm{EXP}\{\mathrm{down}\} = cr - acr$. If these are equal, a = cr and r = a/c. This analysis can be extended to many other functions, such as squaring r before it is fed back to make $r = \sqrt{a}$.

To test the hypothesis that the large fluctuations of the quotient were due to the variations of the divisor, an alternate form of division was tried. This method is based on the work of C. Afuso and J. Esch [7] and is open loop. That is, there is no feedback path of the quotient back to the input as there is when a Subbie is used to divide. In this method, after each numerator pulse, the output is made a logical one for the entire next denominator period. A small buffer keeps track of excess numerator pulses while waiting for corresponding denominator periods to occur. In the few comparison cases tried, the magnitude of the fluctuations of the quotient seemed to be independent of which means of dividing was used.

The divider circuit which RASCEL uses has one other variation from a basic Subbie with simple feedback. This is premultiplication of the numerator or denominator by some power of ten to insure a condition similar to .1 < r < 1. These circuits are shown in block diagram form in Figure 7-4.

Figure 7-4. Block diagram of the divider circuit.

7.6 The Scale Factor Generator

The scale factor generator (SFG) is not incorporated in the divider because of response time.
The response time of the divider is determined not only by T but also by the magnitude of the inputs. When one input is very small the other gets scaled down so that the quotient is approximately in the range .1 < r < 1. When Ch = 3, this means that the up and down inputs of the counter will be getting less than $10^3$ counts per second if $f_c$ = 1 MHz. Consequently the true response time of the divider is $T = 2^n 10^{Ch}/f_c$. For n = 10, $f_c$ = 1 MHz and Ch = 3 (its maximum value in RASCEL), the time constant becomes approximately one second. This is the first reason that a separate Subbie is used for scaling. The second is that when the divider circuit was used for scaling, the variance of the mean value of the counter's contents was large. Consequently reasonable values at which scaling would occur could not be obtained.

The SFG Subbie has direct feedback from its output to the minus input, and the plus input is driven by the output of the divider. Because the output of the divider circuit is rarely outside the range .1 to 1, the time constant of the SFG Subbie can be longer. The value of n = 17 was chosen by experimentation to be small enough not to make the response time of the divider and SFG together much worse than the divider alone, yet large enough to decrease the variance of the value of the most significant bits of the SFG Subbie. The reason this is important is that these bits are wired through logic gates which look for two specific values. When the upper limit value is reached, the scale factor ($10^{Ch}$) is decreased by a factor of ten to decrease r, which in turn decreases Ma, the output of the SFG Subbie. When the lower limit value is reached, the scale factor is increased by a factor of ten. Again, because of the variance about the mean of Ma, the limits have to be far enough apart so that after scaling the opposite limit won't be reached, causing oscillations.
For example, with a 1 MHz clock rate and a one second sample time, the number of samples n is $10^6$ and the fluctuations are theoretically about $10^{-3}$. Consequently only the third significant digit should fluctuate. Since the scale factor is ten and the Subbie output is binary, it is necessary to use at least the first four most significant bits of the Subbie. If this is done, scaling down occurs at approximately .94 when a number is increasing in size, and scaling up occurs at approximately .06 when a number is decreasing. Because of the fluctuations, scaling down occurs from approximately .92 to .93 and up from .09 to .07, depending on the magnitude of those fluctuations.

Using these results, the lower limit was chosen to be when the four most significant bits of Ma were zeros and the upper limit when the four most significant bits were ones. To avoid the problem of multiple scaling when a limit is reached, the Ma Subbie is forced to a value of approximately one-half by changing just the most significant bit of the Ma Subbie whenever a scale factor change occurs. Figure 7-5 gives a block diagram for the SFG Subbie and associated circuitry.

Note that the basic building blocks of the circuits discussed in this chapter are the dual-subtractor, up-down counter, comparator, random binary number generator and autoscaling and divider cards. Circuit diagrams of these cards can be found in the appendix.

7.7 The Display

The display has two major components, (1) the display panel with its NIXIE tubes and sample time control and (2) the sampling counter circuits. These are shown in Figure 7-6.

The reset and store signals occur every second unless the sampling time is ten seconds. The sampling time switch's primary function is to control the length of time the gate signal is on. This signal goes on immediately after the reset signal and lasts for the length of time set by the sampling time switch.
The sampling counter consists of cascaded ripple BCD counters, and the number in the cascade is determined by the sampling time switch. The contents of its three most significant stages are transferred to storage flip-flops each time the counter is reset. These flip-flops drive the NIXIE tube decoding circuits located with the NIXIE tubes. The remainder of the display's indicators are driven directly from the dual-subtractor and scale factor generator circuits. The complete circuits for the display can be found in the appendix.

Figure 7-5. Block diagram of the SFG Subbie and associated circuitry.

Figure 7-6. Block diagram of the display.

CHAPTER 8
THE EVALUATION OF PERFORMANCE

8.1 The Results of Input CRPS Generators

Section 3.3.1 described an analog to CRPS conversion technique. The basic idea was to compare an analog input voltage to a noise voltage, detect which was larger and sample the result with a clock signal. If the bandwidth of the noise is sufficiently larger than that of the clock, the logical value of the output during any given clock period will be independent of the logical value occurring during all or any other clock periods. Recalling that $\Delta A$ is CRPS-A delayed one clock period, equation (4-6)

$\mathrm{EXP}\{A\,\Delta A\} = \mathrm{EXP}^2\{A\}$   (4-6)

can be used to verify independence. Figure 8-1 shows in the top graph the plot for all eighteen inputs to verify that each does generate a CRPS. On the x-axis is plotted an estimate of the input CRPS, and on the y-axis an estimate of the square of the input. The bottom graph is a plot of one good noise source and several bad noise sources. As is clearly shown, the characteristics of the diodes which were used as noise sources can vary significantly, implying that it is necessary to select those diodes that give satisfactory results.
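The independence check based on equation (4-6) can be sketched in software. The example below is an illustration under assumptions: the "good" source is a sequence of independent bits, and the "bad" source is a hypothetical sticky sequence that repeats its previous bit with probability `hold`, standing in for a noise source whose bandwidth is too low. Neither models the actual diode circuits. The check uses machine values, as in equation (4-6).

```python
import random

def estimate(bits):
    """Fraction of ones: estimate of EXP{A}."""
    return sum(bits) / len(bits)

def delay_and(bits):
    """Estimate of EXP{A AND (A delayed one clock period)}."""
    pairs = zip(bits[1:], bits[:-1])
    return sum(x & y for x, y in pairs) / (len(bits) - 1)

def good_source(p, n, rng):
    """Independent bits, as a true CRPS requires."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sticky_source(p, n, rng, hold=0.5):
    """Hypothetical bad source: successive clock periods are correlated."""
    bits = [1 if rng.random() < p else 0]
    for _ in range(n - 1):
        if rng.random() < hold:
            bits.append(bits[-1])            # repeat the previous bit
        else:
            bits.append(1 if rng.random() < p else 0)
    return bits

rng = random.Random(81)
good = good_source(0.5, 200_000, rng)
bad = sticky_source(0.5, 200_000, rng)

# For a true CRPS the gap EXP{A delta-A} - EXP^2{A} should be near zero.
good_gap = delay_and(good) - estimate(good) ** 2
bad_gap = delay_and(bad) - estimate(bad) ** 2
print("good source gap:", good_gap)
print("bad source gap :", bad_gap)
```

A source that passes this test at one delay may still be correlated at longer delays, so the same check can be repeated with the sequence delayed by more clock periods.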
Figure 8-1. The results of squaring a CRPS. (Top graph: $u^2 = \mathrm{EST}_n\{U \oplus \Delta U\}$ plotted against $u = \mathrm{EST}_n\{U\}$, 18 inputs superimposed. Bottom graph: the same plot for 1 good and 6 bad noise sources. Both axes run from -1 to +1.)

8.2 The Performance of a Computing Element

One of the nice aspects of stochastic computing is that once the correct Boolean equations have been determined and circuits designed to implement them, it is not necessary to test the circuits in terms of numbers. It is only necessary to see if they are working properly, i.e. to check that they satisfy the Boolean equations they were intended to implement. However, to test the theories of stochastic computing, the performance of the computing elements in terms of input and output numbers is necessary.

Chapters 4 and 5 developed the theoretical results of logic gates with CRPS inputs for both machine variables and mapped number representations. Specifically, Boolean equations of CRPS inputs were derived whose theoretical values corresponded to multiplication, division, addition, subtraction and squaring of the input values. These equations are listed below for convenience, where $\epsilon$ = n or d (numerator or denominator) and $\Delta$ is the delay operator.

$U_\epsilon = \overline{I} A_\epsilon \vee I B_\epsilon$ and $U^2_\epsilon = U_\epsilon \oplus \Delta U_\epsilon$   (5-6)
$P_\epsilon = A_\epsilon \oplus B_\epsilon$ and $Q_\epsilon = A_\epsilon \oplus B_{\overline{\epsilon}}$   (5-7)
$\times_\epsilon = \overline{I} P_\epsilon \vee I Q_\epsilon$   (5-8)
$+_n = Z_1 (I \oplus Q_d) \vee \overline{Z}_1 Q_n$   (5-9)
$+_d = Z_1 Z_2 \vee \overline{Z}_1 P_d$   (5-10)

8.2.1 The Results of Multiplication and Division

Figure 8-1 already gives the results of squaring (i.e. estimating $U^2_\epsilon$) for all eighteen inputs and really also demonstrates that multiplication works. In the top graph of Figure 8-2 only one of the possible combinations implied by equation (5-7) is plotted.
This is sufficient to check multiplication since we know that all inputs are independent and that all computing elements implement the correct Boolean equations. Because the denominator for addition and subtraction also involves multiplication, its results are shown in the bottom plot of Figure 8-2.

[Figure 8-2. The results of multiplying CRPSs. Top: straight multiplication. Bottom: scaled multiplication, $a_d b_d/2 = \mathrm{EST}_n\{\pm_d\}$ plotted against $a_d = \mathrm{EST}_n\{A_d\}$ for several values of $b_d$ from -1 to +1.]

8.2.2 The Results of Addition and Subtraction

The Boolean equations for addition and subtraction are more complicated because a ratio representation of numbers is used. This means that to add or subtract, it is first necessary to cross-multiply. Consequently, the numerator CRPS for addition or subtraction is composed of four parameters. To plot a graph of the sum or difference, the a numerator CRPS was chosen as the x-axis, the b numerator was held constant at +1, and $\pm_n$ was plotted on the y-axis for several values of the denominators, as shown in Figure 8-3.

[Figure 8-3. The results of adding and subtracting CRPSs. Top: addition, $(a_n b_d + a_d b_n)/2 = \mathrm{EST}_n\{\pm_n\}$. Bottom: subtraction, $(a_n b_d - a_d b_n)/2 = \mathrm{EST}_n\{\pm_n\}$. Both are plotted against $a_n = \mathrm{EST}_n\{A_n\}$ with $b_n = 1$, for $a_d = b_d$ = +1, +.5, +.1, +.01, -.01, -.1 and -1.]

8.3 The Performance of the Dual-Subtractor

The theoretical operation of a machine value subtractor was discussed in Section 7.3. Its input is a mapped CRPS and its output is a sign magnitude number representation. Figure 8-4 shows a plot of the input CRPS value versus the magnitude of the output of the subtractor. The mapping g is $g(x) = 1/2 - x/2 = x'$. Consequently, the output of the subtractor is $|x' - 1/2| = |x|/2$ as shown.

[Figure 8-4. The results of the dual-subtractor.]

8.4 The Performance of the Scaling Divider

In the previous chapter, Sections 7.5 and 7.6, dividing and automatic scaling are discussed. Together these function as a scaling divider whose output is a mantissa and a characteristic giving a power of ten. Scaling occurs whenever the mantissa reaches an upper or lower bound. These bounds must be separated by at least a factor of ten to avoid oscillations, i.e., scaling back and forth because first one limit is reached and, after scaling, the other limit is reached. Consequently, the scaling divider must exhibit hysteresis. To demonstrate the performance of this circuit its output was plotted first as a function of the numerator with the denominator held at +1, and second as a function of the denominator with the numerator held at +1. These graphs are shown in Figure 8-5. Note that the hysteresis is large because otherwise the random fluctuations of the variables about their means can cause unwanted scalings.
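The hysteresis argument can be illustrated with a small numerical model. The sketch below is an assumption-laden toy, not the Chapter 7 circuit: the class name, the method name and the bounds 0.05 and 1.0 are all hypothetical, chosen only so that the bounds differ by more than a factor of ten.

```python
class ScalingDivider:
    """Toy model: quotient -> (mantissa, power of ten) with hysteresis.

    Assumptions (not from the thesis): the bounds 0.05 and 1.0 and all
    names here are illustrative; only the ratio of the bounds matters.
    """

    def __init__(self, lower=0.05, upper=1.0):
        if upper / lower < 10.0:
            raise ValueError("bounds closer than a factor of ten would oscillate")
        self.lower = lower
        self.upper = upper
        self.exponent = 0   # characteristic: current power of ten

    def update(self, numerator, denominator):
        """Return (mantissa, exponent) for |numerator / denominator|."""
        mantissa = abs(numerator / denominator) / 10.0 ** self.exponent
        while mantissa >= self.upper:          # upper bound reached: scale down
            self.exponent += 1
            mantissa /= 10.0
        while 0.0 < mantissa < self.lower:     # lower bound reached: scale up
            self.exponent -= 1
            mantissa *= 10.0
        return mantissa, self.exponent

# A quotient that jitters around 1.0 causes a single scaling, not one per sample.
divider = ScalingDivider()
exponents = [divider.update(q, 1.0)[1] for q in (0.98, 1.02, 0.97, 1.03)]
print(exponents)
```

Because the band between the bounds is wider than one decade, a value that has just been rescaled lands inside the band rather than on the opposite bound, so small random fluctuations do not toggle the scale factor.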
8.5 The Performance of the Array of Programmable Computing Elements

In order to evaluate the entire RASCEL system several functions were selected and the system programmed to compute them. In Figure 8-6 the top graph is a plot of $y = 3x - .5$ and the bottom of $y = x - x^3/3! \approx \sin(x)$. To avoid unnecessary scaling the scale factor for $y = 3x - .5$ was limited to zero and plus one and for sin(x) to zero. The most important aspect of these plots is the amount of noise which is present. All previous graphs were much smoother and less noisy. There is a very good reason for this (discussed in the next section) which points again to what I feel is the fundamental problem of stochastic computing.

[Figure 8-5. The results of the scaling divider. Top: output $\mathrm{EST}_n\{M_{a2}'\}$ plotted against the numerator $r_n = \mathrm{EST}_n\{R_n\}$ with $r_d = +1$. Bottom: output plotted against the denominator $r_d = \mathrm{EST}_n\{R_d\}$ with $r_n = +1$. The scale factor in effect is marked along each curve.]

[Figure 8-6. The results of two functions programmed into RASCEL. Top: $y = 3x - .5 = \mathrm{EST}_n\{M_{a2}'\}$ plotted against $x_n = \mathrm{EST}_n\{X_n\}$. Bottom: $y = x - x^3/3! = \mathrm{EST}_n\{M_{a2}'\}$, approximating $y = \sin(x)$, plotted against $x_n = x/90° = \mathrm{EST}_n\{X_n\}$.]
8.6 The Attrition Problem

On a very simple scale, when machine values are multiplied using an AND gate and added using a scaled summation technique, the number of pulses emerging on an output wire from a network of such elements is always equal to or less than the number on the input with the most pulses. Consequently, as networks get larger and larger, the number of pulses on an output gets smaller and smaller. This is called attrition. In the RASCEL system an attempt was made to overcome this problem by representing a number by a ratio. Thus, as both numerator and denominator had fewer pulses, the value of the ratio would remain constant. The question of interest is, "Would the division performed by the scaling divider be influenced by attrition?". Looking at Figure 8-6, the answer is obviously yes. The reason is that when the denominator is small any variation about its mean will result in a large change in the ratio.

Figure 8-7 shows the same parabola plotted for two different sample sizes. The sample size is determined by the clock frequency (1 MHz in RASCEL) and the time constant (.1 s in the top and 1 s in the bottom graph) of the RC averaging circuit which drives the plotter. The significant difference is that the increased sampling time smooths out the fluctuations but does not substantially reduce the magnitude of those fluctuations. This means that the distribution of the ratio of two random variables is not the same as the distribution of the numerator and denominator random variables. Consequently, contrary to equation (3-5),

$\mathrm{VAR}\{\mathrm{EST}_n\{M_{a2}'\}\} \neq \mathrm{VAR}\{M_{a2}'\}/n$   (8-1)

This indicates that any circuit which attempts to solve the attrition problem by directly increasing the number of pulses by some constant will not yield a CRPS.
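The sensitivity of the quotient to a small denominator can be seen in a short simulation. This is a sketch under stated assumptions rather than a model of the RASCEL decoder: attrition is imitated simply by shrinking both variable values by the same factor, each value is estimated by a plain average over n independent clock periods (not an RC circuit), and the function names and the sample counts are hypothetical.

```python
import random
import statistics

def estimate_value(x, n, rng):
    """EST_n of a CRPS with variable value x (machine value 1/2 - x/2)."""
    p = 0.5 - x / 2.0
    ones = sum(1 for _ in range(n) if rng.random() < p)
    return 1.0 - 2.0 * ones / n

def ratio_spread(num, den, n, trials, rng):
    """Standard deviation of EST(numerator) / EST(denominator) over repeated runs."""
    ratios = []
    for _ in range(trials):
        d = estimate_value(den, n, rng)
        if d != 0.0:   # skip the rare run whose denominator estimate is exactly zero
            ratios.append(estimate_value(num, n, rng) / d)
    return statistics.pstdev(ratios)

rng = random.Random(8)
# Both cases represent the ratio 1; only the size of the two values differs.
healthy = ratio_spread(0.5, 0.5, 2_000, 200, rng)
attrited = ratio_spread(0.05, 0.05, 2_000, 200, rng)
print(f"spread of quotient, values 0.5/0.5:   {healthy:.3f}")
print(f"spread of quotient, values 0.05/0.05: {attrited:.3f}")
```

The numerator and denominator estimates fluctuate by a similar absolute amount in both cases, but once the denominator is comparable in size to its own fluctuation the quotient swings widely, which is the behavior the text attributes to attrition.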
[Figure 8-7. The results of different sample sizes. Both graphs plot $.75(x - .3)^2 - .8 = \mathrm{EST}_n\{M_{a2}'\}$ against $x_n = \mathrm{EST}_n\{X_n\}$, each for a different number of samples n.]

CHAPTER 9 THE SUMMARY AND CONCLUSIONS

9.1 A Summary of the RASCEL System

Basically RASCEL is a programmable analog computer. Its inputs consist of constants, variable values, and the arithmetic operations which combine them. The basic element is a programmable computing element which can perform on command any one of a specified set of operations on its two inputs. To obtain complete generality, the inputs and outputs of these computing elements are permanently wired together to form a tree structure. Figure 2-2 shows this tree structure and Figure 6-3 gives a flow table for specifying the operations. The system is general enough (because of the tree structure) to be able to calculate any function of its inputs and the set of operations that each element can perform.

So far there is nothing unusual in this approach. The unusual aspect of RASCEL is the way chosen to implement the programmable computing elements. Their design was based on the then current ideas of stochastic computing. Those ideas were expanded upon to obtain a number representation which (1) could represent positive and negative numbers with a range of several orders of magnitude and (2) would allow programmable computing elements to be designed whose outputs could be used directly as inputs to other computing elements, ensuring that arbitrarily large networks of these elements could be constructed.

In a conventional digital computer the machine variables are binary numbers and the problem variables are mapped into them by the mapping associated with floating point, fixed point or sign magnitude number representations.
In RASCEL the machine variables are clocked random pulse sequences (CRPSs). The number represented by a CRPS is the probability that a logical "1" will occur during a clock period. Circuits to generate such CRPSs are shown in the appendix, Figure A-1. Its absolute range of values, as for a fixed point binary number, is [0,1], and as in a radix complement number representation the range of variable values $-1 \le x \le 1$ is mapped to the range of machine values with the mapping g given by $g(x) = 1/2 - x/2 = x'$. Using the notation for CRPS-X that $W\{X\} = x$, the analogy to radix complement techniques can be carried further. In those techniques the negative of a variable value is obtained by bitwise complementation, and in RASCEL $-x = W\{\bar{X}\}$; i.e., a CRPS-B whose value is the negative of the value of CRPS-A is formed by complementing A, or $B = \bar{A}$. The effects of logic gates on machine and variable values are summarized in the appendix, Figure A-2.

To obtain a range of numbers with several orders of magnitude, a rational number $x = x_n/x_d$ is represented by two CRPSs $X_n$ and $X_d$ where $W\{X_n\} = x_n$ and $W\{X_d\} = x_d$. Figure A-3 in the appendix summarizes the mapped ratio number representation, as the above is called, and gives Boolean equations to perform all the arithmetic operations. These equations are implemented on a programmable stochastic computing element printed circuit card whose circuit diagram is shown in Figure A-4.

Because the Boolean expressions for the arithmetic operations require only combinational circuitry, the outputs of the computing element cards are, by Theorem 4-1, CRPSs if the inputs are independent CRPSs. Thus the mapped ratio number representation allows a programmable element to be designed and built which satisfies the above mentioned two criteria of a number range of several orders of magnitude and system compatibility.

Because numbers are represented as a ratio, eventually it is necessary to divide to find out what the number represented really is.
This would be no problem if it were not for attrition, which, as more and more operations are performed, causes the ratio to approach 0/0. Thus the small variations in the denominator due to its random nature cause large fluctuations in the quotient. The remaining figures in the appendix and Chapter 7 give the details of how this division is done through the use of a Subbie. This element is based on the ADDIE of Brian Gaines [13]; these are probably the most interesting elements in stochastic computing from a theoretical and practical systems point of view.

A user of RASCEL looks at its NIXIE tube display, which gives $\pm Ma \times 10^{\pm Ch}$, to determine the value of the number being decoded. To decode some other number the decoder cable connector is connected to the test fingers of any input or computing element card. Chapter 6 gives a more detailed explanation of the procedure for using this programmable analog computer.

9.2 A Summary of the Results Obtained from RASCEL

The idea of building a programmable computing element based on stochastic computing techniques has proved to be a good one. The mapped ratio number representation selected for RASCEL allowed a programmable stochastic computing element to be designed with sixteen dual-in-line digital integrated circuit packages on one printed circuit card. Figures 8-1 through 8-3 show that the results of an individual element are good and agree with theoretical calculations. Those plots also indicate that generating a CRPS by comparing an analog voltage to the noise produced by a reverse biased PN junction works very well. It is easy to select diodes by plotting their square, and CRPSs generated by different diodes are independent.

The algorithm selected to program the array of computing elements has to date worked well and closely approximates the way an experienced user of RASCEL programs the array based on his experience.
Although only sample functions were plotted in Chapter 6, all functions that have been tried are computed correctly. The only catch is that due to attrition the results of more complicated functions had even larger fluctuations than those shown in Chapter 8.

One problem that was not anticipated (but should have been) was that of forming higher powers than just $x^2$. For example, if $W\{X\} = x$, then $W\{X \oplus \Delta X\} = x^2$, but $W\{(X \oplus \Delta X) \oplus \Delta(X \oplus \Delta X)\} = W\{X \oplus \Delta^2 X\} = x^2$, because the two copies of $\Delta X$ cancel under the exclusive-or. In other words, the square of a sequence that already is a square doesn't gain you much. Consequently it may not be possible to build a tree-array which can be programmed for arbitrary squaring or higher powers. It is possible, however, to build circuits which can form arbitrary powers.

The last large section of RASCEL is the decoder and display. The performance of the system is limited by attrition, not by the decoding circuitry. However, the time constants of that circuitry can make a great deal of difference in how well the decoding is done. The original decoder had the divider and scaler using the same Subbie. However, because the time constant of its response was so long due to small inputs, the number of stages in its counter had to be reduced. This in turn caused greater fluctuations because the integration time was shorter. Consequently, the divider has ten stages and a faster response time (about 1 ms) while the scaler has seventeen stages and a longer response time (about 128 ms) and thus more smoothing. The scaler is not affected by small inputs because the output of the divider is always relatively large. In other words, the divider and scaler look like a bang-bang circuit followed by a smoothing circuit. The combined circuits can go from one end of the scale factor range to the other in six seconds, or about 1 s per order of magnitude change in input.
If the magnitudes of numerator and denominator are kept within the range [.01,1], then the decoder can resolve any number in the range [.01,100] to an accuracy of 1%.

9.3 The Advantages and Disadvantages of RASCEL

Most of the disadvantages have been touched upon already. Starting at the input, numbers must be converted from either analog or digital to CRPSs. This will in almost all cases require a few to ten transistors or integrated circuits per input. When the time comes to reconvert, the circuitry is again of the same order of magnitude as the input conversion. In addition, the product of the sampling time and clock frequency increases as the square of the desired accuracy. This at present severely limits the areas of application of stochastic computing to the .1 s slewrate and 1% accuracy range.

The fundamental problem is that of attrition. The analog computer would have a similar problem if it only had available for use amplifiers with long time constants. In other words, they could only integrate, not amplify. The Subbie wired as a divider can act as an amplifier with a time constant but not purely as an amplifier.

In favor of RASCEL are several interesting properties, of which the most prominent is its digital nature. Because there are no analog system problems such as noise, offset voltages, linearity and a host of others, a stochastic system could conceivably be built as large as desired. In addition there are no cumulative errors due to round-off, word size and other numerical analysis problems to limit size. The price paid for these advantages, as mentioned earlier, is response time and accuracy.

The disadvantage of having to convert to a CRPS can actually be helpful. If the input signals are noisy, the conversion to a CRPS has the effect of averaging out the noise, which could be a tremendous help in some applications or environments. Note that many input signals are inherently noisy or pick up system noise.
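The remark earlier in this section, that the product of sampling time and clock frequency grows as the square of the desired accuracy, follows from ordinary sampling statistics and can be sketched numerically. The sketch assumes independent clock periods and a plain average over n periods (an RC averaging circuit would behave somewhat differently); the function names are hypothetical, and "accuracy" is taken here to mean the worst-case standard deviation of the estimated variable value.

```python
import math

def worst_case_std(n):
    """Largest standard deviation of EST_n for a variable value in [-1, 1].

    A machine value p estimated from n independent clock periods has
    variance p(1 - p)/n <= 1/(4n); the variable value x = 1 - 2p doubles
    the deviation, giving at most 1/sqrt(n).
    """
    return 1.0 / math.sqrt(n)

def samples_needed(std):
    """Clock periods needed so that the worst-case deviation is at most std."""
    return math.ceil((1.0 / std) ** 2)

clock_hz = 1_000_000   # clock frequency quoted for RASCEL
for window_s in (0.1, 1.0):
    n = round(clock_hz * window_s)
    print(f"{window_s:>4} s -> n = {n:>9,d}, worst-case std = {worst_case_std(n):.4f}")

# Ten times finer resolution costs about one hundred times more clock periods.
print(samples_needed(0.01), samples_needed(0.001))
```

The quadratic cost is the design constraint: at a fixed clock frequency, each additional decimal digit of resolution multiplies the required averaging time by about one hundred.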
One of the other interesting properties, which would be very significant in a commercial environment, is that of testing elements. Because all the gates are operating only on logic levels, not on numbers or probabilities, the proper operation of some stochastic computing elements is dependent only on the logic. Thus the elements need only be checked or tested to see if they satisfy the proper Boolean equations. An analogy is that of being able to test an amplifier under d.c. conditions and know that it will work properly under a.c. conditions.

Another interesting property, related to that mentioned above, is that the computations performed by a stochastic computing element are independent of accuracy. During each clock period the element produces logical ones and/or zeros according to the Boolean equations it implements and its inputs. The element doesn't care or know how many clock periods the user is averaging over to get an estimate of the probability.

9.4 Some Thoughts on RASCEL and Future Systems

The reason RASCEL has as many problems with attrition as it does is the nature of the project, a general purpose programmable analog computer, or a lack of a specific application. Being so general, RASCEL can compute any function of its inputs and set of operations, but for a specific function it is not necessarily as efficient as is possible with a special purpose stochastic computer. If the system also had to fit into some larger system it would be possible to design to specific input and output specifications. In this case some other number representation might be used for which attrition is not so great a problem.

One such number representation is a floating point representation where the mantissa is a CRPS and the characteristic is a CRPS or binary number. The scaling which is sometimes necessary for summation can be accomplished by shifting the most significant digit of a Subbie's up-down counter with respect to the comparator.
These ideas have not been thought out by the author, but seem to have enough merit to be worth investigating.

A good example of the above is the design and construction of a machine to compute a polynomial. RASCEL is particularly bad at this because the inputs are physically separated. This means that if two or more inputs are to have the same value, they have to be adjusted independently. In a special purpose stochastic machine one could just be a delayed version of the other. Depending on the range of values of the variable of the polynomial, any of several number representations (including floating point) could be used with some advantages over the mapped ratio number representation used in RASCEL.

One aspect of RASCEL which became apparent to the author is that without squaring, a clock is not needed until decoding and displaying are done. If the machine value of a number is the probability of having a logical "1" at any instant of time, then the mapping g can still be used, complementation will still cause negation of the variable value, the exclusive-or operation will still cause multiplication of the variable values, and addition can still be done with the same circuits. In a large system, eliminating the flip-flops used to maintain a time reference could yield appreciable savings in complexity, size and cost.

APPENDIX

The figures which follow summarize many of the concepts, notations, operations and equations found in the RASCEL system as well as providing detailed circuit drawings of some of the printed circuit cards used in RASCEL.

[Figure A-1. Circuits to generate CRPSs.]

NOTATION: A CRPS IS WRITTEN A = [a, a'] WHERE MACHINE VALUE a' = (1-a)/2 AND VARIABLE VALUE a = 1-2a'.
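The notation A = [a, a'] and the gate effects mentioned above (complementation negates, exclusive-or multiplies, a randomly switched selection adds with a scale of one half) can be checked directly on probabilities. The sketch below works with exact probabilities rather than pulse sequences and assumes independent inputs; the function names are hypothetical, and the selection signal is assumed to have machine value 1/2.

```python
def g(x):
    """Map a variable value x in [-1, 1] to a machine value (probability of a 1)."""
    return 0.5 - x / 2.0

def g_inverse(p):
    """Map a machine value back to its variable value, x = 1 - 2x'."""
    return 1.0 - 2.0 * p

def p_not(pa):
    """Probability of a 1 after complementing the sequence."""
    return 1.0 - pa

def p_xor(pa, pb):
    """Probability of a 1 at an exclusive-or output with independent inputs."""
    return pa * (1.0 - pb) + (1.0 - pa) * pb

def p_select(pb, pc, pz=0.5):
    """Probability of a 1 when an independent signal Z picks B (Z = 1) or C (Z = 0)."""
    return pz * pb + (1.0 - pz) * pc

a, b = 0.5, -0.4
negated = g_inverse(p_not(g(a)))               # expected -a
product = g_inverse(p_xor(g(a), g(b)))         # expected a * b
half_sum = g_inverse(p_select(g(a), g(b)))     # expected (a + b) / 2
print(negated, product, half_sum)
```

None of these relations depends on a clock: they hold for the probability of a logical "1" at any instant, which is the point made in the closing paragraph of Chapter 9.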
Inputs: A = [a, a'], B = [b, b'], C = [c, c'], and Z, a CRPS with machine value 1/2.

Gate                      Output                     [variable value, machine value]
NOT                       $\bar{A}$                  [-a, 1 - a']
AND                       $AB$                       [1 - (1/2)(1-a)(1-b), a'b']
OR                        $A \vee B$                 [(1/2)(1+a)(1+b) - 1, a' + b' - a'b']
OR (disjoint inputs)      $A \vee B$                 [a + b - 1, a' + b']
AND-OR (disjoint)         $AB \vee \bar{A}C$         [1 - (1/2)(1-a)(1-b) - (1/2)(1+a)(1-c), a'b' + (1-a')c']
EXOR                      $A \oplus B$               [ab, a'(1-b') + (1-a')b']
AND-OR sum                $ZB \vee \bar{Z}C$         [(1/2)(b+c), (1/2)(b'+c')]

A-2. The effects of logic gates on CRPS inputs.

RATIO REPRESENTATION OF NUMBERS

NOTATION: A RATIONAL NUMBER $a = a_n/a_d$ IS REPRESENTED BY A NUMERATOR CRPS $A_n = [a_n, a_n']$ AND A DENOMINATOR CRPS $A_d = [a_d, a_d']$.

[Figure A-3. The mapped ratio number representation and the Boolean equations for the arithmetic operations.]

[Figure A-4. The programmable stochastic computing element card circuit.]

[Figure A-6. The dual five-bit synchronous up-down counter card circuit.]

[Figure A-7. The autoscaling and divider card circuit.]

[Figure A-10. The binary number randomizer card circuit.]

LIST OF REFERENCES

1. Afuso, C., "Quarterly Technical Progress Reports", Circuit Research Section Part I, Department of Computer Science, University of Illinois, January 1965 to September 1966.
2. Afuso, C. and Esch, J. W., "Quarterly Technical Progress Report", Circuit Research Section Part I, Department of Computer Science, University of Illinois, October 1966 to December 1967.
3. Afuso, C., "Analog Computation With Random-Pulse Sequences", University of Illinois, February 1968.
4. Ash, R. B., Information Theory, Wiley and Sons, New York, 1965, section 5.1.
5. Esch, J. W., "Quarterly Technical Progress Report", Circuit Research Section Part I, Department of Computer Science, University of Illinois, January 1968 to June 1969.
6. Esch, J. W., "Stochastic Computing: What is it? What can it be used for?", Electronic Communicator, Vol. 3, No. 2, March 1968, p. 4.
7. Esch, J. W., "A Display for Demonstrating Analog Computations with Random Pulse Sequences", Report 312, Department of Computer Science, University of Illinois, March 1969.
8. Gaines, B. R. and Andreae, J. H., "A Learning Machine in the Context of the General Control Problem", Proceedings 3rd International Congress IFAC, London, 1966.
9. Gaines, B. R., "Stochastic Computing", AFIPS Proceedings, 1967 SJCC, Vol. 30, pp. 149-156.
10. Gaines, B. R., "Techniques of Identification with the Stochastic Computer", IFAC Symposium, June 12, 1967, Prague.
11. Gaines, B. R., "Stochastic Computer Thrives on Noise", Electronics, Vol. 40, No. 14, July 10, 1967, pp. 72-79.
12. Gaines, B. R., "Stochastic Computing", Encyclopedia of Information Linguistics and Control, Pergamon Press, 1968, pp. 766-781.
13. Gaines, B. R., "Stochastic Computing Systems", to appear in Advances in Information Systems and Science, Plenum Press, Ed. by Tou, J. T.
14. Gilstrap, L. O., Cook, H. J. and Armstrong, C. W., "Study of Large Neuromime Networks", Adaptronics, Interim Engineering Report No. 1 to the Air Force Avionics Laboratory, USAF, Wright-Patterson A.F.B., Ohio, August 1966, pp. 73-116.
15. Papoulis, A., Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1965, sections 8-2, 8-3 and 8-5.
16. Peterson, W. W., Error-Correcting Codes, M.I.T. Press and Wiley and Sons, New York, 1961.
17. Poppelbaum, W. J., Afuso, C. and Esch, J. W., "Stochastic Computing Elements and Systems", AFIPS Proceedings, 1967 FJCC, Vol. 31, pp. 635-644.
18. Poppelbaum, W. J., "What is Next in Computer Technology", Advances in Computers, Vol. 9, 1968.
19. Ribeiro, S. T., "Comments on Pulse-Data Hybrid Computers", IEEE Transactions on Electronic Computers, Vol. EC-13, October 1964, pp. 640 and 641.
20. Ribeiro, S. T., "Random-Pulse Machines", IEEE Transactions on Electronic Computers, Vol. EC-16, June 1967, pp. 261-276.
21. Schugurensky, C. M. and Olaravria, J. M., "Direct Simulation of Enzyme Systems: First Results with a Direct Simulation Basic Element", Universidad Nacional de Tucuman, Republica Argentina.

VITA

John William Esch was born on January 15, 1942, in Madison, Wisconsin. While attending the University of Wisconsin at Madison as an undergraduate, he worked for a Dean in the College of Letters and Science as a programmer and for the Space and Astronomy Laboratory as an electronic technician. While at Wisconsin, he was elected to three honorary societies: Eta Kappa Nu, Tau Beta Pi and Phi Kappa Phi. With the aid of a NASA grant, he participated in the 1964 Summer Institute of Space Physics at Columbia University.

After receiving his Bachelor of Science Degree in Electrical Engineering in June of 1965 from the University of Wisconsin, Mr. Esch began his graduate studies in September at the University of Illinois in the Department of Electrical Engineering with a research assistantship from the Department of Computer Science. He received his Master of Science degree in February of 1967 and his Doctor of Philosophy degree in June of 1969 in Electrical Engineering from the University of Illinois. As a result of research at the University's Digital Computer Laboratory in the area of stochastic computing, he presented a paper which he co-authored at the 1967 Fall Joint Computer Conference at Anaheim, California. Mr. Esch is also a member of IEEE.
DOCUMENT CONTROL DATA - R&D (DD Form 1473)

Security classification: Unclassified
1. Originating activity: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
2a. Report security classification: Unclassified
3. Report title: RASCEL - A PROGRAMMABLE ANALOG COMPUTER BASED ON A REGULAR ARRAY OF STOCHASTIC COMPUTING ELEMENT LOGIC
4. Descriptive notes: Technical Report, Ph.D. Thesis, June, 1969
5. Author: Esch, John W.
6. Report date: June, 1969
7a. Total no. of pages: 108
7b. No. of refs: 21
8a. Contract or grant no.: N00014-67-A-0305-0007
12. Sponsoring military activity: Office of Naval Research, 219 South Dearborn Street, Chicago, Illinois 60604
14. Key words: Programmable Analog Computer; Stochastic Computing; Numbers Represented by Probability; Continuous Variables; Random Pulse Sequences; Mapped Ratio Number Representation; Digital Circuitry; Array of Programmable Computing Elements