UIUCDCS-R-73-556
COO-1469-0216

APE MACHINE
A NOVEL STOCHASTIC COMPUTER BASED ON A SET OF AUTONOMOUS PROCESSING ELEMENTS

by
YIU KWAN WO

February, 1973

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported in part by Contract No. US AEC AT(11-1)1469 and was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering, February, 1973.

ACKNOWLEDGEMENT

The author wishes to express his sincerest gratitude to his advisor, Professor W. J. Poppelbaum, for suggesting this thesis topic and for his continuous guidance, friendship, and invaluable support throughout the entire course of his graduate study. He is also grateful to his co-worker D. L. Olson, who built part of the APE machine. Furthermore, he would like to thank all other members of the Hardware and Systems Research Group for their friendship and many valued discussions over the past few years. He is also indebted to all the personnel in the fabrication group and machine shop under Frank Serio for the excellent work they contributed toward the building of the APE machine. Special thanks go to Jerry Fiscus, who did an outstanding job on some difficult high-density circuit layouts. All of the part-time technicians, Lloyd Ravlin, Mike Selander, Jerry Menchhoff, and Bert Semmelink, who worked on the APE project at various times during the past two years, are thanked for their help. Thanks are also due to Barbara Bunting for typing this thesis, to Mark Goebel for the drawings, and to Bernard Tse for typing the first draft. Last, but not least, he would like to thank his wife, Catherine, for her support, understanding, and encouragement throughout the project.

TABLE OF CONTENTS

1. INTRODUCTION
2. SYSTEM DESCRIPTION OF THE APE MACHINE
   2.1 General Description
   2.2 The Modulation Scheme Employed by the APE Machine for Data Communication
   2.3 Flexibility in Forming the Set of APEs and Fault Tolerant Capability of the APE Machine
   2.4 The Structure of an APE
   2.5 Power Supply of the APE
3. STOCHASTIC DATA PROCESSING METHODS OF THE APE MACHINE AND THEIR STOCHASTIC PROPERTIES
   3.1 Number Representation in Stochastic Data Processing
   3.2 Some Simple Boolean Operations on SRPSs
   3.3 Representation of Negative Numbers by SRPSs
   3.4 Addition and Subtraction Operations in Terms of Machine Variables
   3.5 Operation of Multiplication in Terms of Machine Variables
   3.6 Division in Terms of Machine Variables
   3.7 Statistical Properties of Stochastic Processing Operations of the APE
       3.7.1 Statistical Properties of an SRPS
           3.7.1.1 Statistical Parameters and Sampling Statistics of an SRPS
           3.7.1.2 The Relation between the Confidence Level, Error, and the Integration Time
       3.7.2 Statistical Properties of the Result of Addition and Subtraction by an APE
       3.7.3 Statistical Properties of the Product of the Multiplication Operation
       3.7.4 Statistical Properties of the Quotient in Division Operations
       3.7.5 Statistical Properties of the Result of Cascading Two Operations
       3.7.6 Statistical Properties of the Results of Differential and Integral Operations
4. THE GENERATION OF THE SYNCHRONOUS RANDOM PULSE SEQUENCES
   4.1 General Requirements on the SRPSs Used in the APEs
   4.2 Conversion of a Number into a SRPS
   4.3 Generation of a 10-Bit Pseudorandom Binary Number with Uniform Distribution
   4.4 The Singular State of a Linear Feedback Shift Register
   4.5 The Binary-Number-to-SRPS Converter of the APE
5. CIRCUIT DESCRIPTION OF THE APEs
   5.1 Special Design Considerations for the APEs
   5.2 Block Diagram of the APEs
   5.3 The Timing Circuit of the APEs
       5.3.1 Digital Synchronizing Signal Separator
       5.3.2 The Delayed Reset Signal and the Preset Signal Generator
   5.4 Mode Control Signal Detector
   5.5 The Function-Decoder and Channel Multiplexer
   5.6 Input Duty-Cycle Decoder
   5.7 The Stochastic Processor
   5.8 The SRPS Integrator and the Output Encoder
   5.9 The Communication Subsystem of the APE
       5.9.1 The Remotely Tunable Data Receiver
       5.9.2 The 42.5 MHz Clock Receiver
       5.9.3 The Switching Transmitter
6. THE APE CONTROL UNIT
   6.1 Block Diagram of the APE Control Unit
   6.2 The Instruction Code Generator
   6.3 The 9-Channel Instruction Transmitter
   6.4 The Clock Signal Transmitter
   6.5 The BCD/Machine Number Converter
   6.6 The All-Channel Receiver
   6.7 The Machine Number to BCD Converter and the 1997/1023 Scaler
7. THE APE SENSORS AND THE REMOTE POWER SUPPLY OF THE APEs
   7.1 The APE Sensor
   7.2 The Remote Power Supply for the APE
8. CONCLUSION AND OUTLOOK
LIST OF REFERENCES
VITA

LIST OF FIGURES

2.1 The APE Machine
2.2 A Simple Program to Compute a·b + c²
2.3 The Functional Diagram of the APE Machine
3.1 A Sample of a Synchronous Random Pulse Sequence
3.2 Circuits to Perform Addition and Subtraction
3.3 An EXCLUSIVE-OR Gate to Perform Multiplication
3.4 An EXCLUSIVE-OR Gate to Compute the Absolute Difference of Two Machine Variable Values
3.5 The Division Operation in Terms of Machine Variables
4.1 Digital to SRPS Converter
4.2 A General Linear Feedback Shift Register Arrangement
4.3 The Arrangement for the Generation of a 20-Bit Maximum Length Sequence
4.4 The Protective Circuit to Prevent the Feedback Shift Register from Staying in the All-Zero State
4.5 Conversion of Two Numbers into Statistically Independent SRPSs
5.1 Block Diagram of an Autonomous Processing Element
5.2 The Composite Clock Waveform
5.3 Synchronizing Signal Separator and its Waveform
5.4 The Delayed Reset and Preset Signals Generator
5.5 The Mode Control Signal Detector
5.6 The Decoder for Functions and the Multiplexer
5.7 The Input Data Decoder
5.8 SRPS Integrator/Output Encoder
5.9 Some Common Spurious Responses of a Heterodyne Receiver
5.10 Remotely Tunable Heterodyne Receiver
5.11 The 42.5 MHz Clock Receiver
5.12 The Switching Transmitter of the APE
6.1 A Block Diagram of the APE Control Unit
6.2 A Block Diagram of the Instruction Code Generator
6.3 Schematic Diagram of the 32-Bit Instruction Code Generator
6.4 An Instruction Word and the Corresponding Pulse Width Modulated Signal
6.5 The Instruction Transmitter for an APE Channel
6.6 The BCD/Machine Number Converter
6.7 The All-Channel Receiver of the APE Control Unit
6.8 The Block Diagram of the 1997/1023 Scaler
6.9 The Block Diagram of the Machine-Number-to-BCD Converter
7.1 A Block Diagram of a Light Sensor
7.2 The Light Intensity to Pulse Width Converter
7.3 Typical Terminal Characteristics of a Photocell
7.4 Spectral Response of Photocells

1. INTRODUCTION

Rapid development in integrated circuit technology, together with the increasingly widespread use of computers in recent years, has brought about many significant changes in computer design philosophy. ICs can now be mass-produced at very low cost. The resulting drop in their cost, relative to the other components and to the construction and operation costs of a computer, leads to a new set of criteria for determining an optimum system organization and for the design of the detailed circuits. These new criteria inevitably emphasize extensive use of mass-producible items such as LSI chips and avoid, as much as possible, components which are not mass-producible and construction processes which are difficult to automate. On the other hand, extensive use of computers by people from all walks of life calls for new types of computers with more flexible system organization, higher reliability and simpler operating procedures.

Under these new conditions, a novel stochastic computer with a highly flexible structure, originally suggested by Professor Poppelbaum, is being built. It takes advantage of the latest advances in component technology and leads to new standards of computer structure and performance. This computer is dubbed the APE machine, as its basic building blocks are a set of Autonomous Processing Elements. These basic building blocks consist of absolutely identical circuits (except for transmitter frequencies), making the structure of the APE machine highly homogeneous. This homogeneous structure lends itself exceedingly well to mass production by integrated circuit technology. Each basic building block is a small processor in its own right; however, the blocks can be grouped together to form a more powerful system without any actual physical hook-ups. The freedom and ease of incrementing its computing power by simply adding more APEs makes the APE machine a futuristic design in keeping with the changing needs of computer users. At the present time, an increase in the computing power of a computer system usually requires the replacement of the system or of a major processing unit. (For a small but increasing number of recent models, provision is made to allow additional processors to be connected to the existing facilities to boost the computing capability.
But in all cases the user has no control over the size of the increment and has little influence in organizing the computer system in a problem-dependent way in order to obtain better performance.) For the APE machine, a bigger computing capability simply means using more basic building blocks, and these building blocks can be organized in various ways to suit particular applications.

Fault tolerance is a highly desirable feature for complicated machines such as computers, and even more so for computers handling critical jobs. Research in this area has grown rapidly over the past decade (2)(3). It is quite clear at this point that fault tolerance is going to be one of the most important features of future computers. Should the need arise, additional circuitry could easily be added to the APE machine so that various fault tolerant features (such as static self-checking, dynamic self-checking, and even self-repair) may be incorporated.

The processing method of the APE machine is based on the stochastic processing principles discussed in Chapter 3. Stochastic processing is used because the APE machine project represents an effort to explore new approaches to data processing and to develop new processing techniques and new hardware. Stochastic processing has been investigated both in the U.S. and abroad. Research conducted in the Hardware and Systems Research Group under Professor Poppelbaum has made significant contributions to the theory and practice in this area, and several complex computing machines based on stochastic computation principles have been successfully constructed over the past few years by this research group. The APE machine is yet another attempt to further investigate this area.

In the following chapters, the system structure and the design of the APE machine are discussed. Some detailed analysis is given of its stochastic operations. Its limitations, as well as their possible solutions, are also presented.

2. SYSTEM DESCRIPTION OF THE APE MACHINE

2.1 General Description

The APE machine is an on-line, real-time stochastic computer with a reconfigurable structure. It consists of a set of Autonomous Processing Elements known as APEs, a set of sensors to acquire raw input data for processing, a remote power supply and a program control unit. The system diagram of the APE machine is depicted in Figure 2.1. Data communication between the various constituents of the APE machine is carried out through radio frequency channels only. Power is also transmitted remotely into the APEs in the form of light energy. Therefore the APE machine does not require assembly into a physically wired network. For communication with other parts of the computer, each APE has two tunable data input channels, a fixed instruction channel, and an output transmitting channel operated at the same fixed frequency but on a time-multiplexed basis with the instruction channel. The two data input channels can be tuned to establish RF linkages with the output of any other APEs or sensors. The APE can perform on its two input data any of the operations of addition, subtraction, multiplication, division, differentiation, integration and storage. The tuning of the data input channels, as well as the setting for an APE to perform a specific operation, is done remotely through the instruction channel linking with the control unit.
The sensors provide the system interface with the outside world: they convert the input variables, such as temperature, light intensity, etc., into machine variables compatible with the data input channels of the APEs.

In Figure 2.1, the APEs are symbolized by triangles, with i and j denoting the tunable data input channels and k denoting the fixed output channel. A specific output channel k is identified by its channel frequency ν_k.

[Figure 2.1: The APE machine]

There are N different output channels available to the system. As mentioned before, the computational power of the APE machine is incrementable; however, the increment is not without bound. The number N is a parameter indicating the maximum computational power which a particular APE machine can attain. It should be noted that the number of APEs in a specific APE machine might be greater or smaller than N. In other words, it is allowed to have redundant APEs operating in parallel, or completely missing APEs, for any particular channel. This will be discussed in more detail later on. A specific operation instruction for an APE is denoted by θ; it can be any one of the seven operations mentioned above. The sensors are denoted by circles in Figure 2.1. Their outputs are transmitted through channels other than those for the APEs; their channel frequencies are denoted by ν_{N+i}.

A simple example of organizing the APEs and sensors to compute a·b + c² is illustrated in Figure 2.2. In this example, three input variables a, b and c are provided by sensors S1, S2 and S3. Their values are encoded and transmitted from their respective output channels. The input channels i and j of the APEs belonging to the channels denoted by l and m are tuned to receive input data from S1, S2 and S3 as shown. APEs in channels l and m are set to perform multiplication. The APE or APEs operating in the channel denoted by n, with their data input channels tuned to receive input data from the output of APE channel l and APE channel m, are set to perform addition. The final result a·b + c² is then transmitted at frequency ν_n. The program control unit is equipped with an all-channel receiver. It can be set to receive the final result a·b + c² from APE channel n, as well as outputs from any other APE channels and sensor channels. The display of the result is done by means of a Nixie-tube display panel; for more permanent outputs, provision is made to interface with a teletype for printing the results.

Programming of the APE machine is carried out by selecting two APE channels i and j for the two tunable data input receivers and assigning a specific operation to each of the APEs involved, as symbolized by the i, j and k dials together with a row of buttons for the different operations in Figure 2.1. The clock is used for synchronization, while the 'Alive' and 'Set' buttons indicate part of the fault tolerant features and are used to test an APE channel before it is put into operation. More detail about synchronization and testing will be given later. The topology table of the control unit shown in the figure simply serves symbolically as a convenient place to keep a record of the program of the APE machine.
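As an illustration of the kind of record the topology table keeps, the short sketch below (ordinary software, not part of the APE hardware; the channel names, the program dictionary and the evaluate() helper are hypothetical) wires up the three sensors and three APE channels of the Figure 2.2 example and checks that the wiring indeed computes a·b + c².

```python
# Hypothetical software model of a topology table; the APE machine itself is
# programmed with dials and buttons, not with code.
sensors = {"S1": 0.6, "S2": 0.5, "S3": 0.3}        # example values of a, b, c

program = {
    "l": ("mul", "S1", "S2"),    # APE channel l: a * b
    "m": ("mul", "S3", "S3"),    # APE channel m: c * c
    "n": ("add", "l", "m"),      # APE channel n: a*b + c^2
}

def evaluate(channel):
    """Follow the RF linkages back to the sensors and compute a channel's output."""
    if channel in sensors:
        return sensors[channel]
    op, src_i, src_j = program[channel]
    x, y = evaluate(src_i), evaluate(src_j)
    return x * y if op == "mul" else x + y

a, b, c = sensors["S1"], sensors["S2"], sensors["S3"]
print(evaluate("n"), a * b + c * c)    # both 0.39
```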
The APE machine has three different modes of operation. In the test mode, test signals are sent out to check all APE channels involved in the program to be executed. In the programming mode, the APEs and sensors are organized into a suitable structure according to the program. During the execution mode, data transmission and data processing are done by the APEs in a synchronous manner: all APEs communicate with each other during the communication period of the computing cycle and process the data during the remaining part of the computing cycle.

2.2 The Modulation Scheme Employed by the APE Machine for Data Communication

Special consideration is required in choosing the proper modulation scheme so that more than one APE having an identical output channel frequency is allowed.

[Figure 2.2: A simple program to compute a·b + c²]

In the case of several APEs having the same output channel, they are said to be of the same type. All APEs of the same type operate in parallel in every respect: their corresponding input data channels are tuned to receive data from the same source, and they are set to perform the same operation. In other words, the set of APEs actually consists of many different types of APEs. A specific type of APE has the same specific fixed frequency channel for instruction input and for its data output. Whenever an instruction is sent out from the program control unit, all APEs of the same type will be programmed exactly the same way. To guarantee proper reception of data from the same type of APEs by the APEs in the following stage, a synchronous pulse width modulation scheme is employed for data communication. All APEs and sensors are synchronized for data transmission and reception: at exactly the same time, each of the APEs and sensors begins to transmit an RF pulse whose width carries the information. If there are several APEs in one particular channel, they stop transmission at about the same time, because the data sent out from each of them are equal (to within the accuracy of the processor of the APEs). With this kind of modulation scheme, it is evident that the data transmission is not seriously degraded by the existence of redundant APEs.

2.3 Flexibility in Forming the Set of APEs and Fault Tolerant Capability of the APE Machine

As indicated in Figure 2.1, the program control unit can send out a test signal to every APE channel. Depending on the complexity of the test, the operator can find out from the response how well the APE or APEs of that channel are functioning before he tries to put that type of APE into operation. Only a simple existence test for an APE in a specific channel is implemented in our machine: a negative response from this test indicates either that there is no APE in the set belonging to this type or that none of the APEs of this channel are functioning properly. That channel is then simply not put into operation.

The freedom to have redundant APEs of the same type operating in parallel, as well as to have no APE or no properly functioning APE in a particular channel, leads directly to some important features of the APE machine. First, the set of APEs can be grouped together untested and in a random fashion. In the imminent LSI era, an APE built as a single LSI chip becomes a realistic possibility. The fact that testing is not needed helps to simplify the production of these LSI chips.
Ultimately, it is quite conceivable that the APEs could be mass produced at low cost: whenever a set of APEs is needed to set up an APE machine, they could simply be picked at random from a large stock of untested LSI chips. If too many of them are redundant or do not function, the operator just throws more LSI chips into the set of APEs until he has a sufficient number of different types of APEs (i.e. different frequencies) at his disposal for the job. Secondly, the redundant APEs increase the reliability of the APE machine. Of course there are many possible ways a malfunction can occur, but all those malfunctions which result in sending shorter output pulses have no significant effect upon the APE machine as long as at least one APE of each type is functioning properly.

2.4 The Structure of an APE

A functional diagram of an APE is given in Figure 2.3. It has an instruction receiver tuned permanently to channel ν_k for the type-k APE elements. During the programming mode the transmitter of every APE is shut off, and the instruction receiver is activated to receive programming instructions from the APE control unit. The program instruction carries the information of how to tune the data receivers A and B as well as what specific operation to perform. This information is stored and decoded by the decoder for function. After completing the programming of all APEs involved in a specific program, the APE machine is switched to the execution mode. The two tunable data receivers A and B are now tuned according to the instruction. The data which they receive are in duty cycle modulation; they are converted into binary numbers and stored by two duty cycle decoders as shown in the figure. These binary numbers x and y are then fed into the stochastic computing element for a specific type of processing according to the value of θ from the decoder for function. The processing result z is in binary and is again encoded into duty cycle representation before it is transmitted. The clock receiver provides the synchronization for the transmission of the output data, and it also carries the mode control signal. The 'alive' reply signal control is incorporated to answer a test signal from the control unit; the answer is also sent out through the output transmitter. The details of the structure and the operation of the APE will be discussed in Chapter 5.

2.5 Power Supply of the APE

The APEs are powered by four solar cells illuminated by incandescent lamps. The details of this part of the project are given in Chapter 7.

[Figure 2.3: The functional diagram of the APE machine]

3. STOCHASTIC DATA PROCESSING METHODS OF THE APE MACHINE AND THEIR STOCHASTIC PROPERTIES

3.1 Number Representation in Stochastic Data Processing

During the past decade, considerable interest has arisen in a new way of representing numbers on electronic computers. Unlike the common methods of representing a number by a continuous electrical quantity or by a temporal sequence of discrete electrical quantities, the new method represents a number by the statistical average of a random variable in a discrete stochastic process. Although this is not a very efficient way to represent numbers from the information-encoding point of view, extremely simple digital hardware is sufficient to process data in this representation.
The most striking advantage of this representation of numbers lies in the fact that a Boolean operation on the binary variables corresponds to an arithmetic operation on the numbers represented by sequences of these binary variables. Therefore a simple logic gate can perform a complicated arithmetic operation which would require scores of gates if the numbers were in common representation.

Several different types of stochastic number representations have been developed. The type employed for the APE machine is called the synchronous random pulse sequence, or simply SRPS. In this case, a number is represented by the probability of occurrence of a pulse in any time slot of a clocked pulse sequence. An example of an SRPS is depicted in Figure 3.1. In mathematical language, an SRPS is a discrete stochastic process denoted by a sequence of statistically independent random variables X_n, where X_n is a Boolean variable: for a specific n it is either '1' or '0'. The subscript n ranges over the positive integers. The m-th order joint distribution function of X_n is given by P(X_1, X_2, ..., X_m). Because X_i and X_j with i ≠ j are independent random variables, it follows that

    P(X_1, X_2, ..., X_m) = Π_{n=1}^{m} P(X_n)                       (3.1)

[Figure 3.1: A sample of a synchronous random pulse sequence]

To map a number into an SRPS, one must first of all normalize the number and express it as a fraction of the full range. Then an SRPS is produced such that the probability of occurrence of a pulse in any clock period equals that fraction. In general, the SRPS is a non-stationary and therefore non-ergodic process, because the number which it represents is generally a function of time. However, if the time average of an SRPS is computed over a period in which the number it represents is held fixed, as is precisely the case for the APE, the SRPS can be treated as an ergodic process. It follows that the time average of any sample function of an SRPS is identical to the statistical average. This property allows the conversion of a number from stochastic representation back to the common analog representation simply by integration over a sampling time period.

3.2 Some Simple Boolean Operations on SRPSs

Suppose two SRPSs X_n and Y_n represent two numbers x and y respectively. If an AND gate is fed by these two SRPSs, the output is visibly a new SRPS with the probability of having a pulse in any time slot given by P(X_n = 1, Y_n = 1). Let the resulting SRPS be denoted by Z_n and the number it represents by z. Then the statistical mean of Z_n is given by

    Exp{Z_n} = P(Z_n = 1) = P(X_n = 1, Y_n = 1)                      (3.2)

If X_n and Y_n are statistically independent, Equation 3.2 becomes

    Exp{Z_n} = P(X_n = 1) · P(Y_n = 1)                               (3.3)

Recalling that

    Exp{Z_n} = P(Z_n = 1) = z                                        (3.4)
    Exp{X_n} = P(X_n = 1) = x                                        (3.5)
    Exp{Y_n} = P(Y_n = 1) = y                                        (3.6)

Equation 3.3 becomes

    z = x·y                                                          (3.7)

This means that a simple AND gate is all that is needed to perform multiplication on numbers in SRPS form. Similarly, it can be shown that if two SRPSs X_n and Y_n are mutually exclusive, their Boolean OR operation corresponds to the addition operation.
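As a quick software check of Equation 3.7 (a simulation only, not the APE circuitry; the srps() helper and the sample values are illustrative), two independent SRPSs can be generated and ANDed slot by slot:

```python
import random

def srps(value, k, rng):
    """k clock slots of an SRPS: a pulse occurs in any slot with probability 'value'."""
    return [1 if rng.random() < value else 0 for _ in range(k)]

rng = random.Random(1)
x, y, K = 0.7, 0.4, 100_000
X = srps(x, K, rng)
Y = srps(y, K, rng)                       # generated independently of X

Z = [xi & yi for xi, yi in zip(X, Y)]     # one AND gate per clock period
print(sum(Z) / K, x * y)                  # both close to 0.28
```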
3.3 Representation of Negative Numbers by SRPSs

Since the range of a probability is from zero to one, a suitable transformation is needed before a negative number can be represented by an SRPS. Without loss of generality, one only needs to examine the transformation of a variable in the closed interval [-1, +1] into another variable in the closed interval [0, 1]; variable values outside the range [-1, +1] can always be normalized to within this range. To avoid confusion, it is helpful to keep in mind the distinction between three different notations, all representing the same quantity but in different representations:

    X_e : external variable with range [-1, +1]
    x   : machine variable with range [0, 1]
    X_n : SRPS representing x

For an APE, the transformation between X_e and x is linear and is given by

    X_e = 1 - 2x                                                     (3.8)

With this type of transformation, it can be shown that the SRPS representing x is converted to one representing -x simply by passing it through an inverter.

3.4 Addition and Subtraction Operations in Terms of Machine Variables

Addition in terms of external variables is given by

    Z_e = X_e + Y_e                                                  (3.9)

with X_e and Y_e being the input data and Z_e being the sum. The summation in terms of machine variables can be obtained readily. Let x, y, and z be the machine variables corresponding to X_e, Y_e, and Z_e respectively. According to Equation 3.8,

    Z_e = 1 - 2z
    X_e = 1 - 2x                                                     (3.10)
    Y_e = 1 - 2y

Substituting Equation 3.10 into Equation 3.9 we get

    z = x + y - 1/2                                                  (3.11)

Similarly for subtraction, given by Z_e = X_e - Y_e, the operation in terms of machine variables can be expressed by

    z = x - y + 1/2                                                  (3.12)

The hardware circuit implementing the addition and subtraction operations is shown in Figure 3.2. The processor itself consists of only two AND gates, one OR gate and an inverter.

[Figure 3.2: Circuits to perform addition and subtraction]

The variables to be processed, namely X_n and Y_n, are in SRPS representation. They are gated alternately through the AND gates by the complementary gating clock signals; this insures that the two operand SRPSs are mutually exclusive. The gating clock C has a duty cycle of precisely 50%. The output from the OR gate therefore consists of half of each input pulse sequence X_n and Y_n. This mixed sequence is subsequently gated into the second least significant bit of the output counter. The function of the counter is to convert its SRPS input into a binary number. This is done by counting the number of pulses in the SRPS over as many clock periods as the maximum count of the counter; the final count is therefore an estimate of the time average of the input SRPS. In other words, the counter converts the value of a variable from SRPS representation into binary representation. When the input is fed into the 2nd bit of the counter, it amounts to multiplying the input by a factor of two. Because only half of each input SRPS X_n and Y_n has been gated into the counter, the multiplying factor of two makes the final count correspond to the sum of the two inputs X_n and Y_n. At the beginning of each integrating period, the output counter is preset to half full. Since the counter operates in a modulo fashion, this means subtracting 1/2 from the result. Therefore the final count of the output counter is the estimate of x + y - 1/2, which is the sum of the two input variables in terms of machine variables as expressed in Equation 3.11. As mentioned earlier, inverting an SRPS corresponds to inverting the sign of the number it represents. Therefore, subtraction merely amounts to inverting the subtrahend before doing the addition, as indicated in Figure 3.2.
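The behaviour of this adder can be sketched in software as follows (a behavioural model only, not the circuit; the function name and sample values are illustrative). Even clock periods gate a sample of X_n and odd periods a sample of Y_n into the second bit of a modulo-K counter preset to half full, and the final reading approximates x + y - 1/2; inverting the Y samples before gating would give the subtraction of Equation 3.12.

```python
import random

def stochastic_add(x, y, K=2**16, rng=random.Random(2)):
    """Estimate z = x + y - 1/2 the way the APE adder does."""
    count = K // 2                                    # counter preset to half full
    for n in range(K):
        # 50% duty gating clock: even slots sample X_n, odd slots sample Y_n
        pulse = (rng.random() < x) if n % 2 == 0 else (rng.random() < y)
        if pulse:
            count = (count + 2) % K                   # pulses enter the 2nd LSB of a modulo counter
    return count / K                                  # final count read as a binary fraction

x, y = 0.3, 0.6
print(stochastic_add(x, y), x + y - 0.5)              # both close to 0.4
```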
3.5 Operation of Multiplication in Terms of Machine Variables

As in the case of addition or subtraction, multiplication in terms of machine variables also changes its form under the transformation of Equation 3.8. In terms of external variables, multiplication is given by

    Z_e = X_e · Y_e                                                  (3.13)

where X_e, Y_e and Z_e are the multiplicand, the multiplier and the product respectively. Upon transforming these variables into machine variables, the multiplication operation is given by

    z = x + y - 2x·y                                                 (3.14)

The hardware required to implement Equation 3.14 turns out to be a simple EXCLUSIVE-OR gate, as shown in Figure 3.3.

[Figure 3.3: An EXCLUSIVE-OR gate to perform multiplication]
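A software check of Equation 3.14 (a simulation, not the gate itself; sample values are illustrative) can be run as follows; the last line also verifies that the machine-variable result corresponds to the product of the external variables under the transformation of Equation 3.8.

```python
import random

rng = random.Random(3)
x, y, K = 0.8, 0.3, 200_000
external = lambda m: 1 - 2 * m            # machine variable -> external variable

hits = 0
for _ in range(K):
    xn = rng.random() < x                 # one slot of the SRPS for x
    yn = rng.random() < y                 # one slot of the SRPS for y
    hits += xn ^ yn                       # one EXCLUSIVE-OR gate
z = hits / K

print(z, x + y - 2 * x * y)                      # both close to 0.62  (Equation 3.14)
print(external(z), external(x) * external(y))    # both close to -0.24 = X_e * Y_e
```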
3.6 Division in Terms of Machine Variables

Like the other operations discussed above, the division operation in terms of external variables is different from the one in terms of machine variables. Let Z_e, Y_e and X_e be the quotient, the numerator and the denominator respectively. The relation of their corresponding machine variables can be obtained by applying Equation 3.8 to the expression for the division operation. The result is

    z = (y - x) / (1 - 2x)                                           (3.15)

For the implementation of this expression, somewhat more circuitry is involved. To simplify the hardware design, Equation 3.15 has to be rewritten to match the hardware characteristics. Since z is necessarily non-negative, z can also be expressed as

    z = |x - y| / |1 - 2x|                                           (3.16)

In absolute value, the numerator and the denominator can be computed more easily. As mentioned in Chapter 2, the machine variables are transmitted and received in duty cycle modulated form, i.e. the width of the transmitted pulse carries the information about the value of the machine variable. Happily enough, a simple EXCLUSIVE-OR gate is all that is needed to compute the absolute value of the difference of two machine variables in duty cycle modulated form, as depicted in Figure 3.4.

[Figure 3.4: An EXCLUSIVE-OR gate to compute the absolute difference of two machine variable values]

To compute the denominator |1 - 2x|, the first step is to feed the input decoding counter of the duty cycle decoder (which will be discussed in more detail later on) through the 2nd least significant bit instead of the least significant bit. This corresponds to entering 2x into the input decoding counter. This counter is originally 10 bits long to match the range of the numbers it handles. In order to compute the denominator of Equation 3.16, one more bit is added at the most significant position. This extra bit is referred to as the control bit and is used to control a TRUE/COMPLEMENT gate, as shown in Figure 3.5. The remaining 10 bits of the decoding counter represent the data as a binary fraction. If x is less than one half, or 2x < 1, the control bit is a '0'; the TRUE/COMPLEMENT gate then gates the complement of the 10-bit binary fraction, i.e. 1 - 2x, to its output. On the other hand, if 2x > 1 the control bit is '1' and the content of the 10 data bits equals 2x - 1; these 10 bits are gated directly to the output of the TRUE/COMPLEMENT gate. Combining these two cases, the output of the TRUE/COMPLEMENT gate is |1 - 2x|.

After obtaining the numerator |x - y| and the denominator |1 - 2x|, it is necessary to find the quotient. In order to explain the operation clearly, it is best to examine the following case first. Suppose a number a is represented by an SRPS A_n. Let this SRPS A_n be gated into a counter, and let the output reading of the bits of the counter represent a binary fraction. Then the time it takes for this counter to accumulate enough counts to make the output reading equal to a given binary fraction η is a random variable with mean value η/a. This relation can be applied to perform the division operation. To do so, let the binary fraction η be equal to |x - y| and let the number a, which the SRPS A_n represents, equal |1 - 2x|. Then the quotient |x - y| / |1 - 2x| is obtained by counting the SRPS representing |1 - 2x| until the counter reading, as a binary fraction, equals |x - y|. The time it takes to accomplish this is the quotient, already in duty cycle modulation form. The hardware required for the division operation is actually not nearly as complicated as it appears in Figure 3.5, because the decoding, encoding and storage circuits are also included in the figure.

[Figure 3.5: The division operation in terms of machine variables]
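The counting method can be sketched in software as follows (a behavioural model of the scheme of Figure 3.5, not the circuit; the function name and sample values are illustrative): pulses of an SRPS representing |1 - 2x| are counted until the counter reading reaches |x - y|, and the elapsed number of clock periods, normalized by K, is the quotient.

```python
import random

def stochastic_divide(x, y, K=2**16, rng=random.Random(4)):
    """Estimate z = |x - y| / |1 - 2x| by the counting method of Section 3.6."""
    target = round(abs(x - y) * K)        # numerator, expressed as a counter reading
    d = abs(1 - 2 * x)                    # value represented by the denominator SRPS
    pulses, t = 0, 0
    while pulses < target and t < K:      # the result saturates at K clock periods
        t += 1
        if rng.random() < d:
            pulses += 1
    return t / K                          # elapsed time = quotient in duty-cycle form

x, y = 0.9, 0.6
print(stochastic_divide(x, y), abs(x - y) / abs(1 - 2 * x))    # both close to 0.375
```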
3.7 Statistical Properties of Stochastic Processing Operations of the APE

In this section, a statistical analysis is given of 1) the SRPS, 2) the result of arithmetic operations on numbers represented by SRPSs, and 3) the result obtained by cascading several APEs together.

3.7.1 Statistical Properties of an SRPS

As mentioned before, an SRPS corresponds in general to a non-stationary stochastic process. However, it can be treated as an ergodic process if the number it represents is held fixed for each computing cycle. Since this is the case for the APE machine, the SRPS is treated as ergodic throughout the remainder of this paper.

3.7.1.1 Statistical Parameters and Sampling Statistics of an SRPS

An SRPS X_n representing a number x has the distribution function P(X_i = 1) = x for all i. If an SRPS X_n is given, it is impossible to find out exactly what the number x is in a finite period of time. The best that can be done in a finite period of time is to find an estimate of x. The estimate is obtained by integrating, or (in our case) counting, over a finite number K of clock periods. The result is called the sampling mean and is given by

    (X_n)_K = (1/K) Σ_{i=1}^{K} X_i                                  (3.16a)

This sampling mean is itself a random variable, with mean μ_(X_n)_K and standard deviation σ_(X_n)_K; the object of this section is to derive these parameters. Let the probability of having L 1's in K clock periods of an SRPS X_n be P(K, L). Since the SRPS X_n is a sequence of independent random variables with distribution function P(X_i = 1) = x, P(K, L) is binomially distributed and is given by

    P(K, L) = [K! / (L!(K - L)!)] x^L (1 - x)^(K-L)                  (3.17)

By comparing their definitions, it is not difficult to see that P(K, L) is actually the distribution function of (X_n)_K, with

    (X_n)_K = L/K                                                    (3.18)

Since (X_n)_K is binomially distributed, its mean and standard deviation are well known and given by

    μ_(X_n)_K = x                                                    (3.19)
    σ_(X_n)_K = √( x(1 - x) / K )                                    (3.20)

3.7.1.2 The Relation between the Confidence Level, Error, and the Integration Time

The most general approach to this problem uses Tchebycheff's inequality, which states that for a random variable X having an arbitrary distribution with mean μ and standard deviation σ, the following inequality holds for every ε > 0:

    P(|X - μ| ≥ ε) ≤ σ²/ε²                                           (3.21)

The confidence level that the random variable X lies within the maximum allowable error ε of its mean μ is given by

    α = P(|X - μ| < ε)                                               (3.22)

Because of the fact that

    P(|X - μ| < ε) + P(|X - μ| ≥ ε) = 1                              (3.23)

the confidence level α can also be expressed as

    α = 1 - P(|X - μ| ≥ ε)                                           (3.24)

Now if the random variable is (X_n)_K, Equation 3.21 becomes

    P( |(X_n)_K - μ_(X_n)_K| ≥ ε ) ≤ σ²_(X_n)_K / ε²

By combining Equations 3.24, 3.23 and 3.20 one can obtain the relation between the confidence level α, the maximum allowable error ε, and the minimum number of clock periods K over which the integration must be taken:

    K ≥ x(1 - x) / (ε²(1 - α))                                       (3.25)

However, recalling that 0 ≤ x ≤ 1,

    x(1 - x) ≤ 1/4                                                   (3.26)

By substituting Equation 3.26 into Equation 3.25, one gets

    K ≥ 1 / (4ε²(1 - α))                                             (3.27)

For example, by taking the average of an SRPS over 5 × 10^4 clock periods, one has at least 95% confidence that the reading is within 1% of the exactly correct result. That this is at least 95% confidence, rather than exactly 95% confidence, is due to the 'less than or equal to' relation in Tchebycheff's inequality.

Equation 3.27 is derived from Tchebycheff's inequality, which is applicable to random variables with any distribution function. Consequently, Equation 3.27 represents a requirement that is somewhat unnecessarily restrictive for the binomially distributed (X_n)_K. A less general approach, which leads to a weaker requirement than that of Equation 3.27, takes into account the specific distribution of (X_n)_K. Based on the central limit theorem of statistics, this approach approximates the binomial distribution of (X_n)_K by a normal distribution:

    P((X_n)_K) = (1 / (√(2π) σ_(X_n)_K)) exp( -[(X_n)_K - x]² / (2σ²_(X_n)_K) )      (3.28)

The confidence level α with maximum allowable error ε is given by

    α = ∫_{x-ε}^{x+ε} P((X_n)_K) d(X_n)_K                            (3.29)

By defining a new variable

    Z = ((X_n)_K - x) / σ_(X_n)_K                                    (3.30)

and substituting Equation 3.28 into 3.29, it becomes

    α = (1/√(2π)) ∫_{-W}^{W} e^(-Z²/2) dZ                            (3.31)

where

    W = ε / σ_(X_n)_K                                                (3.32)

In terms of the error function ERF, Equation 3.31 becomes

    α = ERF( ε / (√2 σ_(X_n)_K) )                                    (3.33)

Substituting Equations 3.20 and 3.26 into Equation 3.33, one can write

    α = ERF( ε √(2K) )                                               (3.34)

This gives the relation between α, ε, and K. The validity of the approximation made in this approach requires a large K. Since in the case of the APE machine K is over 10^4, Equation 3.34 gives a good approximation* to the relation between the parameters involved. For the example of 95% confidence with 1% maximum allowable error, the integration must be taken over approximately K = 10^4 periods. Compared with the result K = 5 × 10^4 obtained from the previous approach, the two differ by a factor of 5. However, they are still consistent, because what the previous approach concludes is that for K = 5 × 10^4 the reading provides at least 95% confidence and is accurate to within 1%. The result of the second approach obviously agrees with this.

* According to the discussion given in Reference (20), the error produced by this approximation for the case in which K = 10^4 and Exp[(X_n)_K] = 1/2 is negligibly small.
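The two requirements on K can be checked numerically (a small calculation with the standard error function, not taken from the thesis; variable names are illustrative):

```python
from math import erf, sqrt, ceil

alpha, eps = 0.95, 0.01

# Tchebycheff bound, Equation 3.27
k_tcheb = ceil(1 / (4 * eps**2 * (1 - alpha)))
print(k_tcheb)                                   # 50000 clock periods

# Normal approximation, Equation 3.34: smallest K with ERF(eps * sqrt(2K)) >= alpha
K = 1
while erf(eps * sqrt(2 * K)) < alpha:
    K += 1
print(K)                                         # about 9.6e3, i.e. roughly 10^4
```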
3.7.2 Statistical Properties of the Result of Addition and Subtraction by an APE

A close examination of the operation of addition reveals that integrating the sum over K clock periods corresponds to integrating the addend and the augend over H = K/2 clock periods each. Let x and y be the addend and the augend respectively. The sum in terms of machine numbers is z = x + y - 1/2, as shown before. According to Equations 3.19 and 3.20, the sampling statistics for a sample size of H = K/2 from X_n and Y_n are given by

    μ_(X_n)_H = Exp{ (1/H) Σ_{i=1}^{H} X_i } = x                     (3.35)
    σ_(X_n)_H = √( x(1 - x) / H )                                    (3.36)
    μ_(Y_n)_H = Exp{ (1/H) Σ_{i=1}^{H} Y_i } = y                     (3.37)
    σ_(Y_n)_H = √( y(1 - y) / H )                                    (3.38)

It can be shown that the sampling mean of the sum is given by

    (Z)_K = (X_n)_H + (Y_n)_H - 1/2                                  (3.39)

By substituting Equations 3.35 and 3.37 into Equation 3.39, one can write

    μ_(Z)_K = μ_(X_n)_H + μ_(Y_n)_H - 1/2 = x + y - 1/2 = z          (3.40)

Since (X_n)_H and (Y_n)_H are independent, the standard deviation of the sum follows from Equations 3.36 and 3.38:

    σ_(Z)_K = √( [x(1 - x) + y(1 - y)] / H ) = √( 2[x(1 - x) + y(1 - y)] / K )      (3.41)

which, by Equation 3.26, is bounded by

    σ_(Z)_K ≤ 1/√K                                                   (3.42)

Combining this bound with Tchebycheff's inequality in the same way as before gives

    K ≥ 1 / (ε²(1 - α))                                              (3.43)

This is the relation between the confidence level α, the maximum allowable error ε and the sampling size K based on the first approach. The result based on the second approach is obtained by replacing σ_(X_n)_K in Equation 3.33 with σ_(Z)_K; the result is

    α = ERF( ε √(K/2) )                                              (3.44)

The subtraction operation has the same statistical properties as the addition operation, because subtraction is done exactly the same way as addition, except that the subtrahend SRPS is inverted before it is added to the minuend. This only amounts to changing the value x of the subtrahend SRPS X_n to (1 - x) before the addition operation takes place.

3.7.3 Statistical Properties of the Product of the Multiplication Operation

Let z, x and y represent the product, the multiplicand and the multiplier. As mentioned in Section 3.5, the sequence obtained at the output of the EXCLUSIVE-OR gate is an SRPS representing the product, with the mean value given by

    μ_Z = z = x + y - 2x·y                                           (3.45)

All the results about the statistical properties of an SRPS derived in 3.7.1 are therefore applicable to the product. In particular, the fluctuation of the samples is given by

    σ_(Z_n)_K = √( (x + y - 2x·y)(1 - x - y + 2x·y) / K )            (3.46)

The relations between α, K, and ε for the product are identical to the ones given in Equation 3.27 and Equation 3.34.

3.7.4 Statistical Properties of the Quotient in Division Operations

It was shown in Section 3.6 that the quotient Z of the division operation is a random variable. Its statistical properties are now investigated. The distribution function of Z corresponds to the probabilities of getting a fixed number of pulses in different numbers of clock periods. Let T_i be the number of clock periods between the i-th pulse and the following pulse in a sample function of an SRPS. The total number of clock periods T_N for an SRPS D_n, representing the denominator in Equation 3.16 in the case of the division operation, to produce N pulses is therefore given by

    T_N = Σ_{i=1}^{N} T_i ,    N ≤ T_N < ∞                           (3.47)

Theoretically, T_N ranges from N to infinity. For a real computer, integers cannot be larger than a limit determined by the particular structure of the computer; for the APE machine this limit is K = 2^10. Therefore T_N actually ranges from N to K. The normalized value of T_N, given by t_N = T_N / K, corresponds to the final result of the division operation in the form of a binary fraction. To examine the statistical properties of t_N, it is best to start from the distribution function of T_i.
Since T_i is the number of clock periods between two consecutive pulses in the SRPS D_n representing the number d, the probability that T_i = n is given by the probability of having no pulse in (n - 1) clock periods times the probability of having a pulse in the n-th period. One can therefore write

    P(T_i = n) = d(1 - d)^(n-1)                                      (3.48)

Let b denote (1 - d) and let D be the differential operator d/db. The expectation value of T_i can then be expressed as

    Exp{T_i} = Σ_{n=1}^{∞} n·d·b^(n-1)
             = d Σ_{n=1}^{∞} D·b^n
             = d·D( Σ_{n=1}^{∞} b^n )
             = d·D( b/(1 - b) )
             = 1/d                                                   (3.49)

The mean square value of T_i is given by

    Exp{T_i²} = Σ_{n=1}^{∞} n²·d·b^(n-1)
              = d Σ_{n=1}^{∞} (n+1)n·b^(n-1) - d Σ_{n=1}^{∞} n·b^(n-1)

Upon substituting Equation 3.49 into the second term, one gets

    Exp{T_i²} = d Σ_{n=1}^{∞} D²·b^(n+1) - 1/d = (2 - d)/d²          (3.50)

The standard deviation σ_T is given by

    σ_T = √( Var{T_i} ) = √( Exp{T_i²} - Exp²{T_i} )                 (3.50a)

By substituting Equations 3.49 and 3.50 into Equation 3.50a, one gets

    σ_T = √( 1/d² - 1/d )                                            (3.51)

The mean and standard deviation of t_N can now be obtained as follows:

    Exp{t_N} = (1/K) Σ_{T_1=1}^{∞} Σ_{T_2=1}^{∞} ... Σ_{T_N=1}^{∞} ( Σ_{i=1}^{N} T_i ) P(T_1, T_2, ..., T_N)      (3.52)

Since T_i and T_j, i ≠ j, are independent random variables,

    P(T_1, T_2, ..., T_N) = P(T_1)·P(T_2) ... P(T_N)                 (3.53)

Together with the fact that

    Σ_{T_i=1}^{∞} P(T_i) = 1  for all i                              (3.54)

Equation 3.52 becomes

    Exp{t_N} = (1/K) Σ_{i=1}^{N} Exp{T_i} = N/(K·d)                  (3.55)

Now the variance of t_N can be expressed as

    Var{t_N} = Exp{ (t_N - Exp{t_N})² }                              (3.56)

Combining Equations 3.55 and 3.56 one obtains

    Var{t_N} = (1/K²) Exp{ ( Σ_{i=1}^{N} T_i - Σ_{i=1}^{N} Exp{T_i} )² }
             = (1/K²) Exp{ [ Σ_{i=1}^{N} (T_i - Exp{T_i}) ]² }
             = (1/K²) Exp{ Σ_{i=1}^{N} Σ_{j=1}^{N} [T_i - Exp{T_i}][T_j - Exp{T_j}] }        (3.57)

Because T_i and T_j, for i ≠ j, are statistically independent and Σ_{T_i=1}^{∞} P(T_i) = 1, the expectation operator and the summation operators in Equation 3.57 can be shown to be commutative, i.e.,

    Var{t_N} = (1/K²) Σ_{i=1}^{N} Σ_{j=1}^{N} Exp{ [T_i - Exp{T_i}][T_j - Exp{T_j}] }        (3.58)

Again because T_i and T_j, for i ≠ j, are independent, all terms under the summation sign with i ≠ j in Equation 3.58 are zero. Therefore Equation 3.58 becomes

    Var{t_N} = (1/K²) Σ_{i=1}^{N} Exp{ (T_i - Exp{T_i})² } = (1/K²) Σ_{i=1}^{N} Var{T_i}     (3.59)

Because all the T_i have identical distributions, they have identical variances. Consequently Equation 3.59 becomes

    Var{t_N} = (N/K²) Var{T_i}                                       (3.60)

It follows that the standard deviation is given by

    σ_{t_N} = (√N / K) σ_T                                           (3.61)

Combining Equations 3.51 and 3.61, one gets

    σ_{t_N} = (√N / K) √( 1/d² - 1/d )                               (3.62)

Recall that the division operation is performed by counting the pulses of an SRPS D_n representing the divisor |1 - 2x| of Equation 3.16 over a period such that the final reading of the counter, as a binary fraction, equals the dividend |x - y|. The term N/K in Equation 3.62 is the normalized value of the number of pulses being counted; therefore N/K actually corresponds to the dividend |x - y| in the division operation. The parameter d is the divisor in machine-number form; it has the value |1 - 2x|. After replacing d and N/K with these expressions in terms of x and y, Equation 3.62 becomes

    σ_{t_N} = √( |x - y| / K ) · √( 1/|1 - 2x|² - 1/|1 - 2x| )       (3.63)

The relation between α, ε and K according to Tchebycheff's inequality can be obtained by combining Equation 3.63 with Equations 3.21 through 3.24. The result is

    K ≥ ( |x - y| / ((1 - α)ε²) ) ( 1/|1 - 2x|² - 1/|1 - 2x| )       (3.64)

Based on the central limit theorem, the relation between α, ε and K can be approximated by

    α = ERF( ε √( K / ( 2|x - y|( 1/|1 - 2x|² - 1/|1 - 2x| ) ) ) )   (3.65)
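A small Monte Carlo run (a numerical check of Equations 3.55, 3.62 and 3.63, not part of the thesis; sample values are illustrative) draws the quotient time t_N as a sum of geometric waiting times and compares its sample mean and standard deviation with the formulas above:

```python
import random
from math import sqrt

rng = random.Random(5)
K, x, y = 2**10, 0.9, 0.6
d = abs(1 - 2 * x)                       # value of the divisor SRPS
N = round(abs(x - y) * K)                # number of pulses to be counted

def one_division():
    t = 0
    for _ in range(N):                   # N geometric waiting times between pulses
        while True:
            t += 1
            if rng.random() < d:
                break
    return t / K

runs = [one_division() for _ in range(2000)]
mean = sum(runs) / len(runs)
std = sqrt(sum((r - mean) ** 2 for r in runs) / len(runs))
print(mean, N / (K * d))                                  # Equation 3.55
print(std, (sqrt(N) / K) * sqrt(1 / d**2 - 1 / d))        # Equations 3.62 and 3.63
```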
3.7.5 Statistical Properties of the Result of Cascading Two Operations

The statistical properties of the basic arithmetic operations have now been examined. They were investigated under the condition that the inputs to the stochastic processor are deterministic. Now the problem of probabilistic inputs is investigated. This is the case when two levels of stochastic processors are operated in cascade, as in the example shown in Figure 2.2: the stochastic processors in the second level are driven by the outputs of the first-level ones. These APE outputs are the sampling means, rather than the means, of the output SRPSs of the first-level APEs and are therefore random variables. Let the outputs of two first-level APEs be denoted by X and Y. They are random variables with normal distributions, according to the central limit theorem, as discussed above. When X and Y are fed into another APE for further processing, the result can be represented in general by

    Z = f(X, Y)                                                      (3.66)

Theoretically, the statistical properties of Z can be derived from those of X and Y. Unfortunately, even for some rather simple functions f(X, Y), obtaining the result in explicit closed form becomes very involved. A direct approach to this problem is to find the distribution of Z from those of X and Y and from f(X, Y). However, the following approach avoids the evaluation of the distribution of Z and simplifies the problem somewhat. To compute the expectation value of Z, one writes

    Z̄ = ∫₀¹ ∫₀¹ f(X, Y) P(X, Y) dX dY                                (3.67)

where P(X, Y) is the joint distribution function of the input variables. For the standard deviation σ_Z, one can first find Z̄², the simplified notation for Exp[Z²], by

    Z̄² = ∫₀¹ ∫₀¹ f²(X, Y) P(X, Y) dX dY                              (3.68)

and make use of the expression

    σ_Z² = Z̄² - (Z̄)²                                                 (3.69)

The standard deviation σ_Z is then given by

    σ_Z = √{ ∫₀¹∫₀¹ f²(X, Y) P(X, Y) dX dY - [ ∫₀¹∫₀¹ f(X, Y) P(X, Y) dX dY ]² }      (3.70)

This equation indicates the fluctuation of the result due to that of the inputs. For example, if the operation is multiplication, with the multiplicand and the multiplier given by the sampling means of two SRPSs and with the product in the form of another SRPS, then the mean value of the product SRPS, instead of being a deterministic number representing the exact value of the product, is a random variable which fluctuates with the standard deviation given by Equation 3.70. The relation between confidence level, maximum allowable error and standard deviation given in Equation 3.33 is clearly applicable here. Let the confidence level, error and standard deviation associated with the fluctuation of the mean value of the output random variable due to the input fluctuation be α_Z, ε_Z and σ_Z respectively. Then the relation between them is given by

    α_Z = ERF( ε_Z / (√2 σ_Z) )                                      (3.71)

Let the confidence level, error and standard deviation associated with the estimation, with a finite sample size of K samples, of an output variable having a deterministic mean value be α_K, ε_K and σ_K. Their relation has been derived before and is given by

    α_K = ERF( ε_K / (√2 σ_K) )                                      (3.72)

Now let the confidence level and error due to the combination of the two effects be α_T and ε_T. It is not difficult to see that

    ε_T = ε_Z + ε_K                                                  (3.73)
    α_T = ERF( ε_Z / (√2 σ_Z) ) · ERF( ε_K / (√2 σ_K) )              (3.74)

As an example, consider an addition operation with the addend and the augend coming from the outputs of other APEs. In the previous notation, the addend and augend would be denoted by (X_n)_K and (Y_n)_K to indicate that they are obtained from SRPSs X_n and Y_n by averaging over K clock periods. In this section the subscripts are dropped to simplify the notation, so that (X_n)_K is denoted by X and (Y_n)_K by Y. The addition operation is therefore given by

    f(X, Y) = Z = X + Y - 1/2                                        (3.75)

To find σ_Z according to Equation 3.70, one must first find P(X, Y). Since X and Y are independent random variables, one can write

    P(X, Y) = P(X)·P(Y)                                              (3.76)

By substituting Equation 3.28 into 3.76, one gets

    P(X, Y) = (1 / (2π σ_X σ_Y)) exp( -[ (X - μ_X)²/(2σ_X²) + (Y - μ_Y)²/(2σ_Y²) ] )      (3.77)

where μ_X and μ_Y are the mean values and σ_X and σ_Y the standard deviations of X and Y respectively. Upon substituting Equations 3.75 and 3.77 into Equation 3.67, one gets

    Z̄ = ∫∫ (X + Y - 1/2) P(X, Y) dX dY = μ_X + μ_Y - 1/2             (3.78)

Similarly, one can compute the expectation value of Z²:

    Z̄² = ∫∫ (X + Y - 1/2)² P(X, Y) dX dY
       = σ_X² + μ_X² + σ_Y² + μ_Y² + 1/4 + 2μ_X μ_Y - μ_X - μ_Y      (3.79)

The standard deviation σ_Z is therefore

    σ_Z = √( Z̄² - (Z̄)² )                                             (3.79A)

Upon substituting Equations 3.78 and 3.79 into Equation 3.79A, one finds that many terms on the right-hand side cancel out, so that

    σ_Z = √( σ_X² + σ_Y² )                                           (3.80)

Combining Equation 3.80 with Equation 3.74, one arrives at

    α_T = ERF( ε_Z / (√2 √(σ_X² + σ_Y²)) ) · ERF( ε_K / (√2 σ_K) )   (3.81)

Recall that σ_X represents the fluctuation of the estimate of the mean value x of the random variable X obtained by averaging over K samples. According to Equation 3.20,

    σ_X = √( x(1 - x) / K )                                          (3.82)

However, x(1 - x) is always less than or equal to 1/4. Therefore one can write

    σ_X² ≤ 1/(4K)                                                    (3.83)

Similarly,

    σ_Y² ≤ 1/(4K)                                                    (3.84)
    σ_K² ≤ 1/(4K)                                                    (3.85)

Upon substituting these expressions into Equation 3.81, it becomes

    α_T ≥ ERF( ε_Z √K ) · ERF( ε_K √(2K) )                           (3.86)

As an example, let K = 13 × 10^4 and ε_Z = ε_K = 0.5%, so that ε_T = 1%. One gets

    α_T ≥ ERF{1.80}·ERF{2.55} ≥ 0.988                                (3.87)

In other words, if the integration is taken over K = 13 × 10^4 clock periods, the output reading of this cascaded operation will be within 1% of its correct result with 98.8% confidence.
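The numerical example of Equation 3.87 can be reproduced directly with the standard error function (a check only, not part of the thesis):

```python
from math import erf, sqrt

K, eps_z, eps_k = 13e4, 0.005, 0.005
alpha_t = erf(eps_z * sqrt(K)) * erf(eps_k * sqrt(2 * K))   # Equation 3.86
print(alpha_t)   # about 0.989, consistent with the 98.8% quoted in Equation 3.87
```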
3.7.6 Statistical Properties of the Results of Differential and Integral Operations

The differential operation is actually a subtraction operation; therefore all the statistical properties derived for the subtraction operation are applicable. The statistical properties of the result of the integral operation are similar to those of the addition operation with nondeterministic inputs; therefore the results derived in Section 3.7.5 must be applied to compute the relation between confidence level, accuracy and sampling time.

4. THE GENERATION OF THE SYNCHRONOUS RANDOM PULSE SEQUENCES

4.1 General Requirements on the SRPSs Used in the APEs

There are four basic requirements on the SRPSs used in the APEs:

a) It has been shown in the previous chapters that numbers represented by SRPSs can be operated upon with great ease. However, the significance of this advantage depends on the ease with which a number can be converted to and from an SRPS. Therefore, the conversion must be accomplishable with simple circuits.

b) Most of the operations of the APEs involve two operands. It is therefore necessary to have two SRPSs representing two different operands, and these two SRPSs must be statistically independent, as required by the multiplication operation.

c) The APEs obtain their power remotely; the maximum power available to an APE is less than 100 mW. Therefore, the converter must consume very little power. It is this fact that precludes the well-tested method of converting a number into an SRPS by means of noise diodes and a thresholding technique.

d) Because the number to be converted might be a function of time, the resulting SRPS must be able to follow this change and reach a new steady state within a small fraction of a computing cycle.
4.2 Conversion of a Number into a SRPS

The basic idea of converting a number into its SRPS representation is to compare the number with the output of a noise source. There are many approaches to the conversion, based on different ways of making the comparison and different principles of generating the noise (12)(13)(14)(15). The one most suitable for the APE machine is shown in Figure 4.1.

[Figure 4.1: Digital to SRPS converter — the input number X_m is held in a binary register and compared, once per period of the central clock, with a pseudorandom number X_r; the comparator output is Z = 0 if X_r > X_m and Z = 1 if X_r ≤ X_m, so that Z is an SRPS representing X_m.]

4.3 Generation of a 10-Bit Pseudorandom Binary Number with Uniform Distribution

The pseudorandom numbers are generated with a linear feedback shift register; the general arrangement is shown in Figure 4.2, and the specific arrangement used for the APE machine, which generates a 20-bit maximum length sequence, is shown in Figure 4.3. The operation of an n-stage linear feedback shift register is described by

    x_1' = c_1·x_1 + c_2·x_2 + ... + c_n·x_n   (modulo 2)
    x_i' = x_{i-1} ,   i = 2, 3, ..., n                              (4.1)

where x_i' denotes the current value of the i-th stage and x_i denotes the value of the i-th stage during the last clock period. This means that a new n-bit binary number, related to the last one by Equation 4.1, is generated by the register every clock period. It is obvious that a different set of coefficients c_i corresponds to a different sequence of n-bit binary numbers produced by the register. With an appropriate set of coefficients, the length of the sequence before it repeats itself can be made to be 2^n - 1, which is only one less than the maximum number of states an n-stage register can possibly be in. For a 20-stage feedback shift register, this corresponds to a sequence of more than one million binary numbers before any periodicity occurs. It can be shown that this maximal-length sequence with a period of 2^n - 1 is obtained if the set of coefficients c_i equals the coefficients of a primitive n-th degree polynomial (16). For a 20-stage feedback register, the set of coefficients has been worked out in Reference (17): all coefficients equal zero except c_17 = c_20 = 1, as depicted in Figure 4.3. The first ten bits of the output of the register are used to form one 10-bit binary number, while the remaining ten bits are used to form another 10-bit binary number. If these two binary numbers are taken once after every shift of the register to form two binary number sequences, they can be shown, by an argument similar to that presented in Reference (14), to be two statistically independent pseudorandom binary number sequences with uniform distribution. They are therefore used as the noise sources for the APE machine.
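The register of Figure 4.3 and the comparator of Figure 4.1 can be modelled together in a few lines (a software sketch, not the hardware; the bit-numbering convention, the seed and the sample values are illustrative, while the feedback taps c_17 = c_20 = 1 are taken from the text):

```python
def lfsr20_shift(state):
    """One shift of the 20-stage register: the new first-stage bit is the
    modulo-2 sum of stages 17 and 20 (c17 = c20 = 1); all other stages shift."""
    new_bit = ((state >> 16) ^ (state >> 19)) & 1       # stages 17 and 20, 1-indexed
    return ((state << 1) | new_bit) & 0xFFFFF           # keep 20 bits

def srps_pair(x_a, x_b, K=10_000, seed=1):
    """Comparator conversion of Figure 4.1 applied to both 10-bit halves of the
    register, producing two SRPSs; returns their time averages over K periods."""
    xa, xb = round(x_a * 1023), round(x_b * 1023)       # inputs as 10-bit numbers
    state, za, zb = seed, 0, 0
    for _ in range(K):
        state = lfsr20_shift(state)
        ra, rb = state & 0x3FF, (state >> 10) & 0x3FF   # the two pseudorandom halves
        za += ra <= xa                                  # pulse whenever X_r <= X_m
        zb += rb <= xb
    return za / K, zb / K

print(srps_pair(0.25, 0.8))     # close to (0.25, 0.8)
```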
4.4 The Singular State of a Linear Feedback Shift Register

An n-stage feedback shift register has 2^n different states. A maximal-length sequence will go through every state except the all-zero state. A close look at Figure 4.3 reveals that once a linear feedback shift register is in this all-zero state it remains there indefinitely. If this is the case, the feedback shift register ceases to generate pseudorandom numbers. Therefore, protective circuitry must be used to provide continuous monitoring of the feedback shift register. If the register ever gets into this all-zero singular state for some reason, the protective circuit must be able to pull it out. A simple circuit to do this is outlined in Figure 4.4.

It employs a 5-stage counter driven by the same clock that drives the shift register. The reset of the counter is connected to the output of any one of the stages, while the carry output of the counter is added, modulo 2, to the output of an arbitrary stage X_n. The sum is then fed into the subsequent stage. If the stage values are not all zero, the feedback shift register will not be in the all-zero state under proper operating conditions. In this case, the reset line is at the logical '1' level at least once every twenty clock periods. As a consequence, the 5-bit counter is reset before the carry bit becomes '1'. Hence, the carry output is constantly '0', and the output of the modulo-2 adder is identical to the output of the stage denoted by X_n. In other words, the protective circuit has no effect on the operation in this case. However, if the feedback shift register slips into the all-zero state, the carry-out goes to the logical '1' level after 31 shifts because of the lack of the reset signal. A logical '1' on the carry-out line when the feedback shift register is in the all-zero state results in a '1' being inserted into the X_{n+1} stage during the next clock period. Therefore, the feedback shift register is put back into a non-singular state.

Figure 4.4 The Protective Circuit to Prevent the Feedback Shift Register from Staying in the All-Zero State
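The behaviour of this watchdog can be illustrated with the short sketch below. For brevity the carry bit is folded into the stage-1 feedback rather than being inserted between two interior stages as in Figure 4.4, and the choice of which stage drives the counter reset is arbitrary; both are simplifying assumptions, so this is a demonstration of the escape mechanism, not a model of the exact circuit.

```python
def protected_lfsr_step(state, counter, taps=(17, 20), n=20, watch_stage=1):
    """One clock period of the shift register plus the 5-bit watchdog counter."""
    watched = (state >> (watch_stage - 1)) & 1
    counter = 0 if watched else counter + 1     # a '1' on the watched stage resets the counter
    carry = 1 if counter >= 31 else 0           # carry appears only if no reset for ~31 periods
    fb = carry
    for t in taps:
        fb ^= (state >> (t - 1)) & 1            # normal modulo-2 feedback
    return ((state << 1) | fb) & ((1 << n) - 1), counter

state, counter = 0, 0            # deliberately start in the singular all-zero state
for _ in range(40):
    state, counter = protected_lfsr_step(state, counter)
print(hex(state))                # non-zero: the register has been pulled out of the trap
```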

5. CIRCUIT DESCRIPTION OF THE APEs

5.1 Special Design Considerations for the APEs

Because of the limited power available to the APE, the overriding factor in its design is low power consumption. An original estimate of the capability of remote powering of diverse systems puts the ceiling on the power consumption of an APE at 100 mW. This severe limitation on power consumption brings about many challenges in the implementation of the APEs and affects every facet of the circuit design. It immediately rules out the possibility of taking advantage of many commercially available communication circuits. In fact, all communication circuits used in the APEs were developed in the Digital Computer Laboratory of the University of Illinois. For the logic circuits, COS/MOS integrated circuits are used exclusively because of their extremely low power consumption. In the design of the communication subsystem of the APE, multiplexing techniques are employed whenever possible to reduce the number of receivers and transmitters required.

5.2 Block Diagram of the APEs

A functional diagram of an APE has been shown in Figure 2.3. It is presented in a way suitable for explaining how the APEs function. However, in the implementation of the APE, some of the functional blocks are merged together while others are realized with separate physical hardware. From the hardware implementation point of view, the block diagram of an APE can be depicted somewhat differently, as shown in Figure 5.1. There are two tunable receivers for data inputs. However, one of these tunable receivers, namely receiver B in the figure, is operated on a time-sharing basis for both input data reception and program instruction reception. Furthermore, this receiver is operated at different priority levels, with the reception of a program instruction having higher priority than the reception of input data. Whenever a program instruction is to be sent out from the program control unit, the latter is first switched to the programming mode. This causes the clock transmitter to stop the transmission of the clock signals. The absence of the clock signals, in turn, shuts off all output transmitters of the APEs. As a result, the frequency channel for the program instruction transmission is clear of RF interference. The control of the tunable receiver B is then taken over for the reception of program instructions, and the receiver automatically tunes to the fixed frequency channel assigned to that particular type of APE for program instruction transmission, denoted by ν in Figure 5.1. The same channel frequency ν is later used to transmit output data when the APE machine is in the execution mode. Because the APE transmitter is already shut off and no data output is sent during the programming mode, the channel is used exclusively for the transmission of the program instruction. The output signal of the time-shared receiver B is used to feed the decoder for function during the programming mode and to feed the data decoder during the execution mode. The program instructions are sent, one at a time, to all types of APEs involved in a specific program.
Upon the completion of the programming process, the program control unit is switched to the execution mode. Note that a program instruction contains the information of what channels the two tunable receivers are to be tuned to as well as what specific operation the stochastic processor is to perform. In the execution mode, the control of the tunable receiver B is returned to the APE for data input reception. Like the tunable receiver A, tunable receiver B is now tuned according to the program instruction. A multiplexing technique is also employed to operate the transmitter, which is used to transmit the output data as well as the reply to the test signal from the program control. During the execution mode, the transmitter is used for output data transmission. It is otherwise used for replying to the test signal during the test mode. The multiplexer control signal is sent to the APE through the clock channel. A continuous presence of an RF signal in the clock channel tells the APE to operate in the test mode. For programming, as mentioned before, no RF signal is sent through the clock channel. In the execution mode, a sequence of RF pulses carrying the timing information is sent through the clock channel.

5.3 The Timing Circuit of the APEs

All operations of the APE are performed in synchronism with a common time reference. A composite clock carrying the clock signal and the synchronizing signal is sent from the APE control unit through the clock channel at 42.5 MHz. The clock signal has a frequency of 165.450 kHz and the synchronization signal repeats at 1 Hz. The APEs begin to communicate with each other at the occurrence of the synchronization signal. The communication period lasts one ninth of a second. Immediately afterwards, the computation begins and lasts for the remaining eight ninths of a second. Then comes the next synchronizing signal and a new computing cycle begins. The waveform of a composite clock signal is illustrated in Figure 5.2. The clock signal is carried by the regular pulses while the synchronizing signal is carried by two consecutive wider pulses.

Figure 5.2 The Composite Clock Waveform

Three timing signals are required by the APE. They are the synchronizing signal, the delayed reset signal and the preset signal. The synchronizing signal, also called the reset signal, appears at the beginning of a computing cycle, whereas the delayed reset appears one ninth of a second later and marks the end of data transmission and the beginning of computation. The preset signal occurs about 3 msec before the next synchronizing signal does, and is used to set up the transmitter just before it is used to transmit the output data. The detailed discussion of the generation of these timing signals is given below.

5.3.1 Digital Synchronizing Signal Separator

This part of the timing circuit separates the synchronizing signal from the composite clock. This is done by triggering a one-shot circuit with the composite clock. The output of the one-shot is a sequence of pulses with a fixed width which is wider than that of the clock pulse but narrower than that of the synchronizing pulse, as illustrated in Figure 5.3. The composite clock pulse sequence and the output pulse sequence of the one-shot are compared by feeding them into the data input and the clock input of a D-type flipflop respectively.
The output of the D-type flipflop for the next clock period is equal to the value of the D input at the instant the clock input C changes from a '0' to a '1'. By comparing the waveform of the composite clock and the output of the one-shot as shown in Figure 5.3, it is easy to see that the Q output of the flipflop is a '1' when the synchronizing signal occurs and is a '0' otherwise. Therefore the Q output of the flipflop contains only the synchronizing signal and is used for resetting the input storage at the beginning of every computing cycle.

Figure 5.3 Synchronizing Signal Separator and its Waveform

5.3.2 The Delayed Reset Signal and the Preset Signal Generator

A delayed reset signal is required to signal the end of data transmission and the beginning of data processing. The delayed reset occurs 16,384 clock periods after the reset, or approximately one ninth of a second after the beginning of a computing cycle. The preset signal is used to turn on the output transmitter about 3 msec before the beginning of data transmission. The circuit used to generate these signals is shown in Figure 5.4. It employs an 18-bit counter with the reset input driven by the synchronizing signal. At the beginning of a computing cycle, the 18-bit counter is reset to the all-zero state, and the delayed reset output is also set to '0' by the synchronizing signal. Then the counter starts to count up. Once it reaches 16,384 counts, the output of the 15th bit changes from '0' to '1', causing the delayed reset to change from '0' to '1' and to stay there afterwards (due to the latch action of the S-R flipflop) until the next reset signal comes along. The preset signal is set to '0' at the beginning of a computing cycle. It changes to '1' when the counter counts up to 146,944, i.e. 511 clock periods or approximately 3 msec before the next reset pulse appears.

5.4 Mode Control Signal Detector

As mentioned earlier, the APE can be operated in three different modes, namely the programming mode, the execution mode, and the test mode. During the programming mode, the output transmitter of every APE is shut off and a data input receiver is converted into the instruction receiver. For the execution mode, the output transmitter is active and the instruction receiver is converted back into an input data receiver. The signal controlling the mode of operation is transmitted to the APEs through the clock channel. For the execution mode, a composite clock is transmitted.

5.5 The Function-Decoder and Channel Multiplexer

The width of each pulse in the instruction signal is compared with that of a standard pulse in a sequence from the one-shot circuit. The comparison is accomplished by feeding the pulses into the data and clock inputs of a static shift register as shown in the diagram. When a narrow pulse in the instruction signal comes along, the data input of the register is already at a logical '0' level by the time the clock input changes from a logical '0' level to a logical '1' level; the register then receives a logical '0' input. When a wide pulse in the instruction signal comes along, a '1' is received. In this way, a sequence of pulse-width-modulated pulses is decoded and stored in the form of a sequence of binary numbers.
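The pulse-width decoding idea can be summarized with the small sketch below: the instruction line is sampled at the instant the fixed-width standard (one-shot) pulse ends; a narrow instruction pulse has already returned to '0' by then, while a wide one is still at '1'. The pulse widths used here are arbitrary time units chosen only for illustration, not the actual circuit values.

```python
STANDARD_WIDTH = 4          # width of the one-shot (standard) pulse
NARROW, WIDE = 2, 7         # assumed instruction pulse widths for '0' and '1'

def decode_bit(instr_width, standard_width=STANDARD_WIDTH):
    """Return the bit carried by one pulse-width-modulated instruction pulse."""
    # level still present on the instruction line when the clock edge arrives
    return 1 if instr_width > standard_width else 0

word = [WIDE, NARROW, NARROW, WIDE, WIDE]
print([decode_bit(w) for w in word])   # -> [1, 0, 0, 1, 1]
```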
The parts of the instruction carrying the tuning information are then sent to two non-linear D/A converters. This results in two analog tuning voltages custom-matched to the tuning characteristics of the individual tuning diodes. An analog gate is employed to switch one of these analog tuning voltages onto receiver B in the execution mode, and to switch in a fixed voltage for tuning receiver B for instruction reception in the programming mode.

5.6 Input Duty-Cycle Decoder

Each APE has two input duty-cycle decoders, one each for input X and input Y. The function of these decoders is to convert the machine number in duty-cycle modulation into a 10-bit binary number and to store the result. The decoder consists of a 10-bit up-counter with a gated input. A clock signal is gated to the input of the 10-bit up-counter by the data input signal in duty-cycle modulation, as shown in Figure 5.7. Therefore, the data input is measured against the clock, and the width of the data input signal is registered in the counter in terms of the number of periods of the clock.

Figure 5.7 The Input Data Decoder
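A brief behavioural sketch of this decoder follows: a 10-bit counter counts clock periods only while the duty-cycle-modulated data input is high, so the stored count equals the input pulse width measured in clock periods. The sample waveform is made up purely for illustration.

```python
def decode_duty_cycle(samples):
    """samples: the data-input level (0 or 1) seen on each clock period."""
    count = 0
    for level in samples:
        if level:
            count = (count + 1) & 0x3FF    # 10-bit up-counter wraps at 1024
    return count

waveform = [1] * 300 + [0] * 723           # a pulse 300 clock periods wide
print(decode_duty_cycle(waveform))          # -> 300
```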
5.7 The Stochastic Processor

After the inputs X and Y are decoded and stored as binary numbers, they are processed by the stochastic processor according to the operation code from the instruction decoder. The stochastic processor consists of two binary-to-SRPS converters and the processing network, which have already been discussed in Chapter 3 and Chapter 4; they are not described again here.

5.8 The SRPS Integrator and the Output Encoder

The function of the SRPS integrator is to convert the result of the stochastic processor from SRPS representation into binary representation for encoding into a pulse-width-modulated signal. The integrator and the encoder are implemented with an up-down counter as shown in Figure 5.8. The delayed reset changes from a logical '0' level to a logical '1' level at the end of the first one ninth of a computing cycle, triggering a one-shot circuit to send out a pulse. This pulse presets the counter according to the preset input: in the case of addition and subtraction, the preset input is a binary fraction of one half; it is all zero for multiplication and division; for storage, the preset input equals the number to be stored. For any operation other than storage, the counter is set to count up (right after it is preset by the delayed reset pulse) and continues to do so until the end of the computing cycle. During the up-count period, it is driven by the output of the stochastic processor. At the end of the up-count period (which is also the end of the computing cycle), the content of the up-down counter represents the result of the stochastic processor in binary representation; it is ready to be encoded and transmitted at the beginning of the next computing cycle.

Figure 5.8 SRPS Integrator/Output Encoder

For the storage operation, the up-down counter is preset by the delayed reset pulse to the value of the number to be stored. This number remains in the counter until the end of the computing cycle. At the beginning of the next computing cycle, the counter is set to count down. At the same time, the output clock, replacing the output of the stochastic processor, is used to drive the up-down counter. The count-down operation continues until the counter is empty. It is obvious that the period over which the up-down counter counts down corresponds to the initial number in the counter, which is the number to be stored in the case of the storage operation and the estimate of the result of the stochastic processor in the case of the other operations. Furthermore, the output represented by the length of the count-down period is already in the form of duty-cycle modulation. It can be used immediately to modulate the output transmitter.
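The integrate-then-encode behaviour can be modelled in a few lines, under the simplifying assumption that only the up-count and the length of the subsequent count-down are of interest: during computation the 10-bit counter accumulates the '1's of the processor's SRPS (preset to one half for addition and subtraction, to zero for multiplication and division), and the resulting count is exactly the number of output-clock periods the later count-down will last, i.e. the duty-cycle-modulated output.

```python
import random

def integrate_and_encode(srps, preset=512):
    """Accumulate an SRPS in a 10-bit up-down counter; the returned count
    is the width, in output clock periods, of the duty-cycle output."""
    count = preset
    for pulse in srps:                 # up-count period, driven by the SRPS
        count = (count + pulse) & 0x3FF
    return count                       # count-down lasts this many output clocks

# example: an SRPS with a duty cycle of about 0.3, integrated for 1023 periods
srps = [1 if random.random() < 0.3 else 0 for _ in range(1023)]
width = integrate_and_encode(srps, preset=0)   # zero preset, as for multiplication
print(width)                                   # close to 0.3 * 1023, i.e. about 307
```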
5.9 The Communication Subsystem of the APE

The APE transmits and receives all types of information through radio frequency channels only. Each APE is equipped with two tunable receivers, a clock receiver and a transmitter for communication with the other units of the APE machine. All these communication subsystems must be of the micropower type. This precludes the use of many commercially available communication circuits; they are in fact built entirely with discrete components. In the following sections, they are discussed in detail separately.

5.9.1 The Remotely Tunable Data Receiver

Although receiver design is thoroughly understood, it remains a very time-consuming process to design and develop a good receiver for a specific application. Of the numerous considerations given to the design of this receiver, only the important ones are discussed below. The basic requirements of the receiver are as follows: 1) micropower consumption, with a maximum drain of 2.5 mA or less from a 6-volt source; 2) a remote tuning feature for receiving pulse-width-modulated RF from any of the eleven channels covering a frequency band of 5.300 MHz; 3) a bandwidth of about 10 kHz to resolve a 100 msec input pulse; 4) a sensitivity of 500 microvolts; 5) a selectivity of at least 60 db down for the nearest channel.

Various receiving principles and techniques have been investigated for the implementation of this receiver. It was found that the heterodyne receiver is the most suitable because of its high sensitivity, high selectivity and simplicity of tuning. However, it is also well known that image frequency response and spurious responses are the inherent shortcomings of this type of receiver. Although more sophisticated types of receiver design, such as superheterodyne and double-conversion heterodyne types, could overcome these shortcomings, they would easily exceed the limit on the power consumption. Happily enough, proper reception with a heterodyne receiver is still possible if the channel frequencies are assigned carefully, so that no channel frequency lies near a major spurious response frequency.

Originally, it was planned to use a powerful microwave source at 1296 MHz to transmit the electric power to the APEs. To avoid interference, the frequencies of the data channels are chosen to be as far below 1296 MHz as possible. On the other hand, the range of tuning must be wide enough to cover all channels. For a given tuning range, an increase in the frequencies of the band decreases the tuning ratio and leads to simpler tuning circuits; therefore it is preferable to use higher channel frequencies from the tuning point of view. Other factors, such as the spurious frequencies and the availability of components, also affect the frequency assignment. A good compromise is to assign the frequencies between 15 MHz and 20.3 MHz, with the IF frequency at 10.811 MHz and with a channel spacing of 530 kHz. Within the tuning range of the readily available 10.7 MHz IF transformer for commercial FM receivers, 10.811 MHz is chosen as the IF because it minimizes the interference of the spurious responses. This can best be shown by examining the spurious responses in more detail.

Spurious responses are generated when harmonics of the input signal beat with the local-oscillator signal (or its harmonics) to produce a signal close to the intermediate frequency (or its subharmonics). To see how these higher harmonics mix, one can expand the transfer characteristic of the mixer transistor about the operating point by means of a Taylor series

I_C = I_{C0} + \frac{dI_C}{dV_{BE}} v_{be} + \frac{1}{2!}\frac{d^2 I_C}{dV_{BE}^2} v_{be}^2 + \cdots    (5.1)

where I_C denotes the total collector current, I_{C0} denotes the collector bias current, V_{BE} denotes the base-to-emitter voltage, and v_{be} denotes the AC component of the base-to-emitter voltage. Equation 5.1 can also be written as

I_C = I_{C0} + a_1 v_{be} + a_2 v_{be}^2 + a_3 v_{be}^3 + \cdots    (5.2)

where a_n is the n-th order coefficient in Equation 5.1. In the ideal case of quadratic mixing, i.e. with a_i = 0 for i > 2, involving two pure sinusoids, no spurious response can occur. For a practical mixer, however, the coefficients of higher degree are not zero, resulting in cross-modulation, modulation deepening and spurious responses. For a pulse-width modulation scheme, as in the case of the APE machine, the first two phenomena have less effect on the reception than the spurious responses. With the above-mentioned frequency assignment, the image response occurs in the band between 36.62 MHz and 41.92 MHz. These image frequencies are far away from any signal frequencies; they therefore cause no significant interference. The same is true for the first order subharmonic response. In this case, the response is caused by the higher even-degree terms, such as the 4th, 6th, etc., as these terms cause the harmonics of the beat frequencies to appear. Let the wanted input signal be f_w; the spurious response, f_s; the IF frequency, f_i; and the local frequency, f_e, with

f_e = f_w + f_i    (5.3)

The first order subharmonic spurious response is given by

f_s = f_w + \frac{f_i}{2}    (5.4)

When this f_s beats with f_e, the result is f_i/2, the first order subharmonic of f_i. According to Equation 5.4, this type of spurious response for f_w = 15.00 MHz is at 20.405 MHz. The frequencies of this type of spurious response for the higher-frequency channels are of course higher still. Therefore, they all lie outside the active band of frequencies and cause no significant interference.
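The response frequencies discussed above follow from simple mixing relations, and the short calculation below reproduces them from the channel plan stated in the text (channels from 15.0 to 20.3 MHz, 530 kHz spacing, IF of 10.811 MHz). The relations used are f_local = f_wanted + f_IF, f_image = f_wanted + 2 f_IF, and, for the first-order subharmonic response, f_s = f_wanted + f_IF/2.

```python
F_IF = 10.811                                        # MHz
channels = [15.0 + 0.530 * k for k in range(11)]     # the eleven data channels

for f_w in channels:
    f_local = f_w + F_IF
    f_image = f_w + 2 * F_IF
    f_sub = f_w + F_IF / 2
    print(f"{f_w:6.3f}  local {f_local:6.3f}  image {f_image:6.3f}  subharmonic {f_sub:6.3f}")

# the image responses fall between about 36.6 and 41.9 MHz and the subharmonic
# responses above about 20.4 MHz, both outside the 15.0-20.3 MHz data band
```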
For the higher-order but less significant spurious responses, it is not possible to eliminate all unwanted signals completely from the active band. However, a properly chosen IF frequency places the next most significant spurious response as far away from the data channels as possible. The next most significant spurious response is the second order image response. It is caused by the second harmonic of an unwanted signal beating with the local frequency to produce f_IF. For example, let f_w = 15.00 MHz and f_i = 10.811 MHz; the local frequency is therefore f_e = 25.811 MHz. Now if an unwanted signal exists at the frequency f = 18.311 MHz, its second harmonic beating with the local frequency will produce exactly the intermediate frequency. If f = 18.311 MHz happens to be another data channel transmitting data at a much higher power level, harmful interference could occur. The intermediate frequency obviously affects the location of these spurious responses; the optimum IF frequency can be shown to be 10.811 MHz. Figure 5.9 shows the different types of spurious response.

The electronic tuning of the receiver is done by means of a reverse-biased diode. It is well known that the junction capacitance of a reverse-biased diode is given by

C_j = \frac{C}{(1 + V/\phi)^{\xi}}

where C is the capacitance at zero bias, V is the reverse-bias voltage, \phi is the contact potential, and \xi is a constant determined by the impurity gradient of the diode (for an abrupt junction \xi = 1/2). The tuning voltage, ranging from 1 volt to 6 volts, is supplied by the decoder for function as described before. The tuning circuit and the other parts of the receiver are shown in Figure 5.10.

Figure 5.9 Spurious Responses of the Receiver (locations of the wanted signal, image, first order subharmonic and second order image responses relative to the channel frequencies)

Figure 5.10 The Remotely Tunable Data Receiver and its Tuning Circuit
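The tuning-diode relation quoted above can be illustrated numerically: the junction capacitance falls as the reverse bias is raised, which is what retunes the receiver. The zero-bias capacitance and contact potential below are made-up example values, not component data from the thesis; only the form of the law and the abrupt-junction exponent of 1/2 are taken from the text.

```python
def junction_capacitance(v_reverse, c0=30e-12, phi=0.7, xi=0.5):
    """Capacitance of a reverse-biased diode: C = C0 / (1 + V/phi)^xi."""
    return c0 / (1.0 + v_reverse / phi) ** xi

for v in (1, 2, 3, 4, 5, 6):                 # the stated 1 V to 6 V tuning range
    c = junction_capacitance(v)
    print(f"{v} V  ->  {c * 1e12:5.1f} pF")
# roughly a 2:1 capacitance swing over the tuning-voltage range for these values
```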
5.9.2 The 42.5 MHz Clock Receiver

This receiver consists of a 2-stage tuned RF amplifier followed by an AM detecting stage. The total current drain is 3 mA from a 6-volt source. The sensitivity is 200 microvolts, and the bandwidth is about 1 MHz. The circuit diagram of this receiver is given in Figure 5.11.

5.9.3 The Switching Transmitter

The transmitter is shown in Figure 5.12. It is a crystal-controlled low-power switching transmitter developed in the Computer Laboratory to satisfy the particular requirements of the APE machine. It is a tuned-gate, tuned-drain oscillator. The second gate of the dual-gate FET is used to switch the gain in order to turn the oscillator on or off. The transistor 2N2475 is used to reduce the standby power consumption, while the 2N2894A is used to cut down the Q value of the tank circuit when the transmitter is turned off, in order to produce an RF pulse with a sharp decay. The total power consumption at a 15% duty cycle is about 6 mW.

Figure 5.11 The 42.5 MHz Clock Receiver

Figure 5.12 The Switching Transmitter

6. THE APE CONTROL UNIT

6.1 Block Diagram of the APE Control Unit

The received signal is the output data from the APE of that channel. This data is in machine numbers and is converted into signed BCD codes for decimal display. The 1997/1023 scaler scales a 10-bit binary machine number into three decimal digits plus sign. In the following sections, detailed discussion is given to the non-trivial functional blocks.

6.2 The Instruction Code Generator

The instruction code generator consists of a pulse sequence generator, a counter, a multiplexer, and a monostable multivibrator, as shown in Figure 6.2. The pulse sequence generator produces a sequence of 32 pulses when a starting signal is received. The counter and the multiplexer are connected to form a parallel-to-serial converter. The instruction word is fed to the data inputs of the multiplexer. As the counter starts to count, the inputs to the multiplexer are gated sequentially to the output. The output of the multiplexer is used to modulate the timing network of the monostable multivibrator by switching an auxiliary timing resistor in and out of the timing network. The output from the monostable multivibrator is therefore a binary pulse-width-modulated signal. It should be pointed out that the instruction code generator is designed to handle instructions up to 32 bits long. (For the APE machine, the instruction is only 28 bits long.)

Figure 6.2 A Block Diagram of the Instruction Code Generator

Figure 6.3 shows the actual circuit of the instruction code generator. Two SN74150N's are employed to form the 32-bit multiplexer. The diode D is used to switch the 4.7K auxiliary timing resistor R in and out of the network. The ready indicator circuit turns on an indicator light when the instruction code generator has completed sending out a full instruction word and is ready to be loaded with another word for another APE channel. Figure 6.4 shows an instruction word and the corresponding pulse-width-modulated signal.
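The parallel-to-serial, pulse-width-modulating behaviour of the instruction code generator can be sketched as follows: each bit of the instruction word sets the on-time of the monostable, a narrow pulse standing for '0' and a wide pulse for '1', and the generator always scans 32 bit positions. The two pulse widths are illustrative numbers rather than the actual timing-network values, and the example instruction is arbitrary.

```python
NARROW, WIDE = 2, 6          # assumed monostable on-times, in arbitrary units

def encode_instruction(word_bits, length=32):
    """Serialize up to 32 instruction bits into a list of pulse widths."""
    padded = (word_bits + [0] * length)[:length]    # the generator handles 32 bits
    return [WIDE if b else NARROW for b in padded]

instruction = [1, 0, 1, 1, 0, 0, 1, 0] * 3 + [1, 1, 0, 1]   # a 28-bit example word
print(encode_instruction(instruction))
```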
6.3 The 9-Channel Instruction Transmitter

This transmitter consists of nine separate crystal-controlled RF oscillators and modulators. Whenever an instruction is to be transmitted over a specific channel, the power to the corresponding RF oscillator and modulator is switched on; otherwise, their power supplies are switched off. This greatly simplifies the RF shielding problem, because the RF signals of the instruction transmitter are generated only when they are needed during the programming mode. The schematic of the transmitter for each channel is given in Figure 6.5; it is essentially a gated RF source. The crystal-controlled oscillator generates the carrier, which is sent through a tuned amplifier to remove the harmonics and is amplified to about one volt in amplitude. It is then fed to the first gate of a dual-gate FET. The second gate of the FET is switched between +8.5 volts and -5 volts. When it is held at +8.5 volts, the dual-gate FET operates in the active region and the RF signal is further amplified before it is transmitted. When the second gate is at -5 volts, the FET is switched off, allowing only a very small amount of RF to leak through. The reason for using the dual-gate FET is its extremely small gate-to-drain capacitance, which leads to a better on-off ratio in switching the RF signals than would be possible with its bipolar equivalent. The switching voltage is provided by the switching circuit built around the transistors 2N1308 and 2N1309. The diode in the emitter of the 2N1308 is used to improve the noise margin of the input.

6.4 The Clock Signal Transmitter

The 42.5 MHz clock transmitter has the same schematic as the instruction transmitter but operates at a higher frequency. The clock transmitter, however, operates during the execution mode. In order to avoid interference with other circuits, it is physically built in a shielded box. The operating principles of the clock signal transmitter are identical to those of the instruction transmitter discussed in Section 6.3 and are not repeated here.

6.5 The BCD/Machine Number Converter

The inputs to this converter are three decades of BCD code together with a sign bit, all from the thumbwheel switches on the front panel. The output is a 10-bit machine number. The conversion is accomplished in two steps. The first step is to convert the signed BCD number into a signed binary number. The second step is to transform, according to Equation 3.8, the signed binary number into a machine number. The first step is implemented with six 4-bit BCD/binary converters SN74184, connected as shown in Figure 6.6. The transformation in the second step corresponds to inverting the binary number after shifting it one bit position to the right. The inversion is done by nine EXCLUSIVE-OR gates acting as TRUE/COMPLEMENT gates. Shifting is done simply by wiring the output binary bits into the shifted positions. Note that in the machine number representation, a negative number can be obtained by inverting a positive number of the same magnitude. Therefore, if the sign bit is used to control the TRUE/COMPLEMENT gates, as shown in the figure, the result is the machine number equivalent of the 3-digit signed decimal input.

6.6 The All-Channel Receiver

The receiver is used to receive the output data signal as well as the test reply signal from the APEs. It is a heterodyne-type AM receiver, as shown in Figure 6.7.
Its operation is similar to that of the receiver used for receiving input data in the APE, except that this all-channel receiver has an extra IF stage; its sensitivity is then about 30 microvolts. It is housed in a shielded box to reduce the interference from the noisy environment of the TTL logic networks. The tuning is done with twelve pairs of potentiometers. Each pair of potentiometers is adjusted to produce a pair of voltages for tuning the RF input resonant circuit and the local oscillator resonant circuit for the reception of a particular channel. A particular pair of voltages is switched into the tuning circuits whenever that particular channel is to be tuned in.

6.7 The Machine Number to BCD Converter and the 1997/1023 Scaler

After the input data to the APEs are processed, the results are sent out by the APEs in machine number representation. For easy read-out of the results, it is necessary to have them converted back to signed decimal representation. This is done in the control unit by the 1997/1023 scaler together with the machine-number-to-BCD converter. The output signals from the APE are in pulse-width modulation. The pulse width ranges from 0 to a maximum of 1023 clock periods. The 1997/1023 scaler linearly maps a number in this range into a number ranging from 0 to 1997 (to the limit of the truncated least significant digit).

Figure 6.7 The All-Channel Receiver

Figure 6.8 A Block Diagram of the 1997/1023 Scaler

Figure 6.8 shows the block diagram of the scaler. What it does is simply to delete one pulse for every forty input pulses. A fast clock having a frequency twice as high as the system clock is used at the input of the scaler for comparison with the pulse-width-modulated signal from the output of the APEs. With such an arrangement, the number of output pulses is that of the input scaled by a factor of 1997/1023. After the scaling, an inverse transform of Equation 3.8 is performed by the machine-number-to-BCD converter shown in Figure 6.9. The decade up-down counters are preset to 999 at the beginning of a new computing cycle, just before the counter starts to count down. When the counter reaches the all-zero state, it is switched to count up. How long the counting process lasts depends on the output signal from the APE channel to be displayed. If the counting process ends during the count-down period, the sign of the result is positive; otherwise it is negative. In both cases, the magnitude of the result is given by the value of the final count. The full range of the input to this converter is 1997 pulses. This corresponds to an output range from -999 to +998. Therefore, the combined function of the scaler and the converter transforms a 10-bit number into a 3-digit signed decimal number to within the accuracy of the 10-bit number.

Figure 6.9 A Block Diagram of the Machine-Number-to-BCD Converter

7. THE APE SENSORS AND THE REMOTE POWER SUPPLY OF THE APEs

7.1 The APE Sensor

For input data acquisition, two APE sensors have been implemented for the APE machine. They convert a light intensity into a pulse-width-modulated signal with the leading edge of the pulse synchronized to a reference clock. A sensor consists of a clock receiver, a switching transmitter, a synchronizing signal separator, a transmitter timing control, and a light-intensity-to-pulse-width converter, as shown in Figure 7.1. As in the case of the APEs, the output of the clock receiver is the composite clock signal. The synchronizing signal is separated from the composite clock signal by the separator. This synchronizing signal is used to trigger a one-shot circuit in the converter, whose timing network consists of a photocell and a capacitor. The width of the output pulse from the one-shot therefore depends on the light intensity which the photocell senses. The leading edge of the output pulse from the one-shot is in synchronism with the triggering signal, which is the synchronizing signal. The output of the one-shot, together with the clock signal and the synchronizing signal, is sent to the transmitter timing control circuit, whose function is to turn the transmitter on and off at the appropriate instants of time. All functional blocks in Figure 7.1 except the light intensity converter are identical to the corresponding parts in the APEs; their details have been given in Chapter 5.

Figure 7.1 A Block Diagram of the APE Sensor

The light intensity converter is shown in Figure 7.2. A COS/MOS AND gate and an inverter are connected to form the one-shot as shown. The output pulse width depends on the product of the photoresistance and the capacitance. The former is dependent on the input light intensity; consequently, the output pulse width is determined by the input light intensity.

Figure 7.2 The Light-Intensity-to-Pulse-Width Converter

7.2 The Remote Power Supply for the APE

The APE machine is equipped with a remote power supply for the APEs. The power is sent to the APE remotely to free it completely from any physical connection with other parts of the APE machine, thereby further enhancing the structural flexibility of the machine. The general requirements of the remote power source are: 1) the power available to APEs placed in the active region of the power source should be about 100 mW; 2) the operation of the remote power supply must not interfere with the operations of the APE. This part of the APE machine was originally investigated by another graduate student in the Computer Hardware and Systems Research Group. The possibility of sending power to the APEs over a microwave frequency channel was considered first. A detailed report on this study is given in reference (21). An experimental set-up having a 6-watt transmitter at 1296 MHz with a helical transmitting antenna was built. It was found that almost sufficient power for operating an APE could be delivered to an APE equipped with a quarter-wave slotted-line antenna for the reception of power and placed a few feet from the transmitting antenna. However, it was also found that the transmission of the microwave power interferes with the operation of the APEs. As described in Chapter 5, each APE is equipped with three receivers operating at frequencies between 15.000 MHz and 42.5 MHz. The transistors employed in these receivers necessarily have a high gain-bandwidth product. Unfortunately, these transistors also respond to RF signals at 1296 MHz. Without elaborate shielding and feed-through filtering, the leakage of the 1296 MHz RF signal into the box housing the APE circuits is sufficient to upset the operation of the receivers. Some experiments were conducted in which the APE circuitry was housed inside an electromagnetically shielded box, with 1296 MHz resonant traps guarding all feed-through terminals feeding data signals into the APE circuits. With careful tuning of the 1296 MHz resonant traps, the interference could be reduced to an acceptable level. However, such tuning is quite critical and can easily be upset by coupling with nearby objects. Hence, this approach was not employed for remotely powering the APEs.

The more suitable solution to the remote powering of the APE is by means of solar cells. This approach eliminates the harmful RF interference. These solar cells are constructed on an n-type silicon base material. A very thin layer of p-type material is formed on the n-type base through diffusion to produce a p-n junction with a large area. When this p-n junction is short-circuited in darkness, no steady current flows in the external circuit in spite of the existence of the contact potential of the p-n junction, as expected on thermodynamic grounds. However, if light in a suitable range of wavelengths is allowed to fall on the p-n junction, a voltage develops across the external circuit and current starts to flow, with a terminal characteristic as shown in Figure 7.3. More details about this photovoltaic effect can be found in many good references (22)(23). Figure 7.4 shows the spectral response of the solar cells. For powering the APE, four solar modules, designated 5SM1020GE10PL by International Rectifier, are packed together on the top surface of the APE to absorb power from an array of incandescent lamps.

Figure 7.3 Typical Terminal Characteristics of a Photocell (output characteristic of a typical 1 x 2 cm, 10% efficiency silicon photovoltaic cell at an illumination of 100 mW/cm²; output current up to about 50 mA versus output voltage up to about 0.6 V)

Figure 7.4 Spectral Response of Photocells (relative response versus wavelength from about 0.4 to 1.1 microns, spanning the near-ultraviolet, visible and near-infrared ranges)

8. CONCLUSION AND OUTLOOK

With the help of the latest developments in COS/MOS integrated circuits and high gain-bandwidth-product transistors, the APE machine has been successfully implemented. The maximum power consumption of an APE is actually about 70 mW. The APE machine, as described in this thesis, has been operating satisfactorily. It is the world's first computer of its kind with variable topology. It is hoped that it will open up a new dimension in computer system and circuit design for highly flexible and reliable computers, with such advanced features as variable topology, incrementable computing power, a readily mass-producible structure, and self-checking and self-repairing capabilities.

In regard to the fault-tolerant features of the APE machine, only static checking procedures, such as the alive test, have been implemented. Dynamic testing could easily be implemented by employing a highly reliable APE as a checker inside the control unit and comparing the output of the channel to be checked with that of the checker under identical inputs and operations. Furthermore, these checking procedures can be carried out automatically rather easily with the type of structure found in the APE machine: whenever a failure is discovered in a specific channel, the element can be replaced automatically. If there are no more spare channels to replace the defective ones, the control unit can readily be programmed either to notify the operator or, in some inaccessible circumstances, to reconfigure the topology of the set of APEs so that the required number of APE channels is reduced to the available number, with some lower-priority functions of the program being cut out. To match the reliability of the other parts of the computer, information transmission in the APE machine should, of course, employ error checking and error correcting codes.

The APE machine described in this thesis makes use of RF linkages to transmit data. There is obviously a limit to the number of APE channels that can be placed in a given frequency band. This limitation could be overcome with some sacrifice of the flexibility of the APE machine. For example, if information transmission is carried out over a confined space, such as a common bus, instead of free space, then the number of APE channels could be increased by using several different confined spaces. In this case, special consideration must of course be given to the communication between the units associated with the different confined spaces.

As an alternative approach to the design of the communication subsystem for the APE machine, a time-multiplexing scheme can be used instead of the frequency-division scheme employed in the APE machine. In the time-multiplexing approach, the data receivers of all APEs are operated over the same fixed frequency channel having a much wider channel width. Each type of APE is then given a specific portion of the communication period in which to transmit its output data. To receive data from a specific type of APE, the data input register is programmed to take in data only during the portion of the communication period reserved for that type of APE. Such an approach eliminates the need to tune the data input receivers and requires only one data receiver instead of two. On the other hand, additional logic circuits are required in this approach. Furthermore, it reduces the degree of homogeneity of circuits between different types of APE.

As described in Chapter 3, the fluctuation of the result obtained by stochastic processing increases with the number of cascaded stages. This would limit the number of stages, and ultimately the number of APEs, that could be used. This difficulty cannot be overcome with stochastic computation using truly random SRPSs. Nevertheless, the fluctuation problem could be completely eliminated with a special pseudorandom SRPS whose period equals precisely the sampling integration period for the estimation of the mean value: because the integration is taken over the entire period, it follows from the periodicity of the SRPS that the same result is obtained for every integration period. More importantly, the result thus obtained can be shown to be the mean value of the SRPS. In other words, the correct result is obtained every time, without fluctuation.

LIST OF REFERENCES

1. Poppelbaum, W. J., "ONR Proposal N00014-67-A-0305-0007", May 1969.
2. Ramamoorthy, C. V., "Fault-Tolerant Computing", IEEE Transactions on Computers, Vol. C-20, No. 11, November 1971.
3. Carter, W. C., and Bouricius, W. G., "A Survey of Fault-Tolerant Computer Architecture and its Evaluation", Computer, January 1971.
4. Poppelbaum, W. J., et al., "Stochastic Computing Elements and Systems", Fall Joint Computer Conference, Anaheim, California, 1967.
5. Ribeiro, S. T., "Random-Pulse Machines", IEEE Transactions on Electronic Computers, Vol. EC-16, June 1967.
6. Gaines, B. R., et al., "Stochastic Computing", AFIPS Proceedings, SJCC, Vol. 30, April 1967.
7. Gaines, B. R., "Foundations of Stochastic Computing Systems", IEEE International Convention Digest, New York, March 1968.
8. Afuso, C., "Analog Computation with Random Pulse Sequences", Report No. 255, Department of Computer Science, University of Illinois, Urbana, Illinois, February 1968.
9. Poppelbaum, W. J., et al., "Transformatrix - the World's Most Parallel Computer", to be published.
10. Esch, J. W., "Rascal - A Programmable Analog Computer Based on a Regular Array of Stochastic Computing Element Logic", Report No. 332, Department of Computer Science, University of Illinois, Urbana, Illinois, June 1969.
11. Esch, J. W., "A Display for Demonstrating Analog Computation with Random Pulse Sequences", Report No. 312, Department of Computer Science, University of Illinois, March 1969.
12. Korn, G. A., Random-Process Simulation and Measurements, McGraw-Hill, New York, 1966.
13. Poppelbaum, W. J., Computer Hardware Theory, The Macmillan Company, New York, 1972.
14. Tausworthe, R. C., "Random Numbers Generated by Linear Recurrence Modulo Two", Mathematics of Computation, April 1965.
15. Sobolewski, J. S., et al., "Pseudonoise with Arbitrary Amplitude Distribution", IEEE Transactions on Computers, Vol. C-21, No. 4, April 1972.
16. Golomb, S. W., Shift Register Sequences, Holden-Day, 1967.
17. Peterson, W. W., Error-Correcting Codes, MIT Press, Cambridge, and Wiley, New York, 1961.
18. Cobbold, R. S. C., Theory and Applications of Field-Effect Transistors, Wiley-Interscience, New York, 1970.
19. Hoel, P. G., Introduction to Mathematical Statistics, John Wiley and Sons, Inc., 1971.
20. Breiman, L., Probability and Stochastic Processes, Houghton Mifflin Company, Boston, 1969.
21. Olson, D. L., "Remote Power Supply for the APE System", Report No. 410, Department of Computer Science, University of Illinois, August 1970.
22. McKelvey, J. P., Solid-State and Semiconductor Physics, Harper and Row, New York, 1966.
23. Van Der Ziel, A., Solid State Physical Electronics, Prentice-Hall, Inc., 1968.

VITA

Yiu Kwan Wo was born on January 23, 1942 in Saigon, Viet Nam. He did his undergraduate work in Honors Electrical Engineering at McGill University, Montreal, Canada from 1963 to 1967. He was awarded the British Association Prize in 1964, and University Scholarships as well as the title of University Scholar for the academic years 1964-65, 65-66 and 66-67. In June 1967 he received the B.Eng. degree in Honors Electrical Engineering and was awarded the British Association Medal for high distinction in overall performance at McGill University. From May to September 1965 he was a summer research assistant at the Pulp and Paper Research Institute of Canada in Montreal, working on the problem of automatic control of paper mills. The following summer, he spent four months as a summer research assistant at the Whiteshell Nuclear Establishment in Manitoba, Canada, helping to design and implement electronic instruments associated with a particle accelerator. From May to September 1967 he worked at the Canadian Marconi Company in Montreal as an engineer and took part in the development of an airborne Doppler radar system. In September 1967 he began his graduate studies under Professor Poppelbaum in the Computer Hardware and Systems Research Group, Department of Computer Science at the University of Illinois. He was awarded a University Fellowship from the University of Illinois for the academic years 1967-68 and 1968-69. He has been employed as a Research Assistant by the Department of Computer Science since June 1969. He received his Master of Science degree in Electrical Engineering in February 1970. He is a member of Phi Epsilon Alpha and Phi Kappa Phi.

Form AEC-427, University-Type Contractor's Recommendation for AEC Disposition of Scientific and Technical Document. AEC Report No. COO-1469-0216 (Report No. 556). Title: APE MACHINE: A NOVEL STOCHASTIC COMPUTER BASED ON A SET OF AUTONOMOUS PROCESSING ELEMENTS. Type of document: scientific and technical report. Recommended announcement and distribution: AEC's normal announcement and distribution procedures may be followed. Submitted by Yiu Kwan Wo, Research Assistant, Digital Computer Laboratory, University of Illinois, Urbana, Illinois 61801, February 1973.

BIBLIOGRAPHIC DATA SHEET. Report No.: UIUCDCS-R-73-556. Title and Subtitle: APE MACHINE: A NOVEL STOCHASTIC COMPUTER BASED ON A SET OF AUTONOMOUS PROCESSING ELEMENTS. Author: Yiu Kwan Wo. Report Date: February 1973. Performing Organization: Department of Electrical Engineering, University of Illinois, Urbana, Illinois 61801. Contract/Grant No.: 46-26-15-301. Sponsoring Organization: US AEC Chicago Operations Office, 9800 South Cass Avenue, Argonne, Illinois 60439. Type of Report: Thesis Research.

Abstract: The APE Machine is a real-time computer with features such as a reconfigurable structure, incrementable computing power and certain fault-tolerant capabilities. This highly flexible computer is dubbed the APE Machine because its basic building blocks are a set of Autonomous Processing Elements known as APEs. Each APE is a small processor in its own right; however, many APEs can be grouped together to form a more powerful processing unit. The APE Machine contains a set of sensors to perform input data acquisition and also a program control unit.

Key Words: APE, Autonomous Processing Elements. Security Class (this report and this page): UNCLASSIFIED.