Digitized by the Internet Archive in 2013
http://archive.org/details/communicationsun880kuja

UIUCDCS-R-77-880
UILU-ENG 77 1735

A COMMUNICATIONS UNIT FOR A MULTI-MICROPROCESSOR NETWORK

by

Gary Kujawinski

June, 1977

DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
URBANA, ILLINOIS

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering in the Graduate College of the University of Illinois at Urbana-Champaign, 1977

ACKNOWLEDGEMENTS

The author would like to thank Professor Michael Faiman for his generous support and guidance, G. Gostin and G. Chesson for their helpful suggestions, and the Information Engineering Laboratory of the Department of Computer Science for its support.

TABLE OF CONTENTS

I. INTRODUCTION
II. SURVEY OF EXISTING SYSTEMS
III. GOALS AND RATIONALE
IV. SYSTEM OVERVIEW
V. SERIAL PROTOCOL
VI. TIE MODULE
   1. General Description
   2. Hardware Description
VII. COM-UNIT
   1. General Description
   2. Hardware Description
      A. Request Scanner
      B. Multiplexed Transceiver
      C. Crosspoint Matrix
VIII. SOFTWARE
   1. TIE Software
   2. COM-UNIT Software
IX. SUMMARY AND CONCLUSIONS

I. INTRODUCTION

In education and research, it is desirable to have available hardware modules general and powerful enough to accommodate a wide variety of projects. Microprocessor systems for these activities must be reconfigurable, independent of a specific processor, and able to grow gracefully as technology develops. MUMS (Modular Unified Microprocessor System) is an attempt to meet this need, as evidenced by its four basic goals: generality, modularity, extendability, and low cost [1,2].

The MUMS concept centers around a generic bus protocol with signals chosen to perform the basic operations of memory access, I/O access, interrupt, and DMA.
The term generic means that the bus control signals and their protocol are chosen to be logically meaningful to the tasks they must perform, and do not necessarily reflect the conventions of a particular microprocessor. The aim is to simplify interfacing, which usually requires no more hardware than would be necessary in a processor-specific system. The protocol is essentially asynchronous, so as to be relatively independent of the varying speeds of different microprocessors. The end result is that modules, such as a processor card, look identical from the exterior. A system redesign entails nothing more than exchanging a few cards.

The structure chosen incorporates one processor per bus for the following reasons. First, a simple protocol that is easy to learn and understand was desired. Second, most microprocessors have been designed to control their environment, with the exception of a DMA capability and interrupt. A multiple-processor bus could be developed by utilizing the hold and hold acknowledge signals. For example, one processor would act as the bus master and operate in the normal fashion. The other processors on the bus would issue hold requests and receive hold acknowledge before accessing the bus. Not only would an arbiter be needed, but for more than a few processors, performance would be seriously degraded.

There are, however, many applications where multiple processors are needed or desired. The COM (Communications) UNIT will provide multiprocessing in a manner consistent with the MUMS ideals. The basic aim of this design is to provide a flexible yet powerful interconnection scheme which is cost compatible with a microprocessor-based subsystem.

The situation at present is that I/O devices and memory comprise much of the cost of a computer system. In microprocessor systems this is even more evident. The COM-UNIT will allow the sharing of expensive peripherals such as a floppy disk among many subsystems.
In the area of research, multiprocessing and computer networks are a major source of interest. A low cost multiple microprocessor system can be used to directly model large systems in development. Various architectural solutions can be implemented in studies of cooperating tasks, networking, and distributed intelligence.

An inexpensive and expendable microcomputer is a useful tool in education. Laboratory courses utilizing a number of microcomputer systems interconnected to share a mainframe link and mass storage give students hands-on experience with a bare computer. Not even with a relatively inexpensive minicomputer could one afford to let students do hardware interfacing, or to allocate the machine to a single student to run system software. In addition, designing a hardware module and the corresponding software to drive it illustrates the hardware-software interaction in a very clear manner.

II. SURVEY OF EXISTING SYSTEMS

Multiple microprocessor systems have evolved from systems utilizing minicomputers or specially designed processors. The systems described here illustrate a variety of methods to implement interprocessor communication and resource sharing.

The Carnegie Mellon C.mmp system consists of k PDP11 processors connected to m memory modules and j shared I/O devices via two large crosspoint switches, set either by manual or program control [3,4]. Each bus, which includes a processor and local memory, is connected to the crosspoint through a mapping device, D.map. The D.map transforms local addresses at the processor into physical memory addresses at the large shared memory. The intent of the I/O switch is for long term assignments such that a set of processors completely control an I/O resource.

As opposed to C.mmp, the Carnegie Mellon Computer Modules system has no central, shared memory. This system consists of processor-memory pairs operating on an intra-CM bus [3,5]. These buses are in turn connected together in clusters via inter-CM buses.
Inter-cluster communication is at the next level in the network. A mapping module is again employed to recognize, map, and forward an external address. Mapping can be performed in this manner across several inter-CM buses.

A similar scheme is employed by the BBN Pluribus system utilizing Lockheed SUE computers [4]. Seven processor buses, each including a bus arbiter, two processors and two 4K memory modules, are connected to two shared memory and two I/O buses through bus couplers. The bus couplers are again mapping modules similar to D.map.

The MINERVA multiprocessor consists of a set of compatible, asynchronous devices organized around a single demand multiplexed bus (IDBUS) [6]. The devices include 8080 processors, 3000 processors, I/O devices, shared memory, an arbiter, and a 256 flag mutual exclusion module. Since a single 3000 could almost fully load the IDBUS, a cache is incorporated for each. The bus is reserved via a request to the arbiter followed by a grant. The arbiter consists of FIFO ports used by processors, and priority ports used by I/O devices with predictable bandwidth requirements.

A single shared bus scheme such as MINERVA is economical and expandable to a limit, but lacks concurrency, reliability, and performance due to its single data path. In addition, the interconnection costs to cable the entire set of bus signals to each module, separated even by short distances, would be substantial. Increasing the performance by adding multiple data paths will only accentuate this problem.

The idea of bus links or mapping devices, as in C.mmp, is modular and versatile but can become expensive when the interconnection scheme becomes complex. The delays imposed by mapping across several buses would be undesirable for long data transfers. This problem is alleviated in C.mmp by requiring only one D.map on each bus and structuring the interconnections through the crosspoint.
However, since the connections to I/O devices are long term and accommodate only a portion of the buses, the need for an I/O switch could be eliminated by dedicating these devices to a single processor and utilizing the memory crosspoint to share these devices among buses.

The COM-UNIT differs fundamentally from all of these schemes by aiming toward data block transfers as opposed to single accesses from shared memory, thus resulting in a loosely, rather than tightly, coupled system. Many of the ideas mentioned here are compatible with COM-UNIT goals, however, and have influenced their specification and implementation.

III. GOALS AND RATIONALE

The COM-UNIT must perform three basic functions. They are data transfer, direct interprocessor communication, and the ability to interrupt another processor. In accomplishing these functions, flexibility, cost and performance are the main criteria to be applied. The following outlines the design goals specifically with both these ideas and prospective applications in mind.

1) Although message passing can be performed as a data transfer of a few bytes, as in a mailbox scheme, the setup overhead argues against it. The COM-UNIT will provide direct processor conversations which can be interpreted as either a data transfer setup or a sequence of requests and answers. On the other hand, this implies that a data transfer could be performed as a conversation, although this would not usually be the case. In the same sense, interrupting another processor is basically a signal accompanied by a few bytes of data to identify the cause of the interrupt. If the COM-UNIT interface on the MUMS bus raises an interrupt when an external access arrives, interrupting a processor is accomplished merely by sending a request to it. Thus, by not actually separating data and control, flexibility in communication is achieved.
At some point, however, the data transfer should proceed with no processor intervention as in the classical sense of an I/O channel.

2) A communications module such as this must necessarily be a rather expensive device. Intelligence should then be allocated to keep the most numerous modules (interfaces) as simple as possible. In accordance with this, a serial communication link was chosen to eliminate a proliferation of costly cables among MUMS busses and the COM-UNIT, which may be separated by short distances of approximately twenty feet.

3) For this unit to be applicable to a wide variety of projects with different traffic requirements, a high degree of concurrency should be provided so that performance will not be severely limited by the number of processors in the system. Since communication is generally two way, although this will not be explicitly restricted, the maximum number of concurrent conversations, which is one half the number of processors, will be supported.

4) To allow a system to grow gracefully, the COM-UNIT should be expandable to a certain degree. In addition, for systems with a small number of processors, a rather expensive COM-UNIT would be a major cost factor. The COM-UNIT will be designed to build on a basic skeleton in a modular fashion such that the amount of extraneous hardware will be minimized.

5) Due to the nature of a MUMS system, which may support a wide variety of processors and devices, the communications protocol must be able to handle the variety of speeds at which individual systems operate. With different amounts of bus contention and devices, data will be made available to the interface modules at different rates. To overcome this variability, the COM-UNIT protocol will employ a handshake control for the serial link.

IV. SYSTEM OVERVIEW

The basic system architecture is shown in Figure 1a. The COM-UNIT is a modified serial crosspoint with a microprocessor controller which will handle up to 16 MUMS busses.
The TIE module interfaces to a MUMS bus and appears as a DMA device to the local processor. On the COM-UNIT side, the TIE module communicates via a synchronous serial link over just five lines at a rate of 500 KBAUD, which provides adequate bandwidth for a device such as a floppy disk.

Basically, a processor wishing to communicate with another processor sends a request via the TIE module to the COM-UNIT. If the COM-UNIT determines that the requested module is available, it establishes a connection between the two by firing the appropriate crosspoint. The end result is a direct TIE to TIE path over which the two processors may interact.

For generality, the TIE is designed such that it may be directly connected to another TIE as shown in Figure 1b. This feature is attractive for systems with a number of processors not large enough to warrant a rather expensive COM-UNIT. These connections can even be employed in the presence of the COM-UNIT, if so desired, as dedicated, high traffic data paths. It is even possible for a MUMS processor to be involved in more than one simultaneous interaction via multiple TIE modules on its bus.

The basic configuration can be extended one step further by introducing more COM-UNITs. Each COM-UNIT with its associated local processors could then appear as a node in a network of such systems. Thus, with the flexibility of interconnection schemes allowed by the TIE to TIE paths and multiple COM-UNITs, practically any network configuration can be realized.

[Figure 1. MUMS COM-UNIT Architecture: a) Basic Configuration, b) Dedicated Link]

The realization of these goals requires a careful inspection of possible hardware-software tradeoffs to maximize performance while minimizing cost. On one hand, a completely hardwired COM-UNIT and TIE module would enhance performance and require local processors to merely initiate a transfer with a single command.
DMA registers would still have to be loaded by software, and the data and request format would necessarily be fixed by hardware. More importantly, a hardwired controller which would perform the functions of request decoding, status checking and updating, and matrix control would be an expensive and complex device. On the other hand, an entirely software driven controller is extremely flexible and inexpensive, but has all the disadvantages of programmed I/O. The slow speeds of microprocessors would limit the bandwidth of the channel and require buffering for faster storage devices.

As a compromise between these extremes, the COM-UNIT has been designed to be software controlled for data transfer setup and interprocessor communication, but to handle data transfers in DMA mode with no software intervention. The flexibility of software control allows the user to specify the exact nature of the COM-UNIT operation and tailor it to a specific application.

An even more important tradeoff occurs at the MUMS side. Since a TIE module exists on each bus, it should be as cheap and simple as possible: ideally, it should fit on one card. Consistent with the idea of software controlled communication, the TIE is a passive device during conversations and data transfer setup, merely acting as a transceiver for the serial link. During data transfer, the TIE is a DMA device which generates an interrupt when the transfer is complete.

In effect, then, communication involves software at a MUMS bus conversing with software at the COM-UNIT and eventually at another MUMS processor. The executive routine sets the format and content of control commands to be interpreted on the other end of the link.

One disadvantage of this scheme is that a processor must exist on each MUMS system connected to the COM-UNIT. This will usually be the case, however, since shared devices such as a floppy disk will need a processor anyway for file handling.
Only in the case of shared memory will a processor be extraneously needed. In this case, since each processor has its own local memory, all memory is in effect shared and can be distributed as needed among the subsystems. The processor local to the shared memory can then allow access to data blocks by symbolic names and implement protection schemes.

Since the COM-UNIT processor need only set up communication paths, and each processor has its own local memory, thereby easing the load on bandwidth requirements, the COM-UNIT processor may be used in other roles. Being the focal point in the system, it could inherently be the master processor in a master slave organization. The master would act as a dispatcher and suballocate tasks to be performed in parallel to the appropriate subsystem.

Another application is as a resource allocation processor. Resources, even local ones, would be allocated by the COM-UNIT to implement semaphore primitives with a resource request queue. If memory is considered a resource, the COM-UNIT could also implement protection by keeping protection codes for memory blocks and access codes for each module. A memory request would be granted by the COM-UNIT provided a protection error did not occur. One example of a resource could be an executive routine for each submodule. Instead of a ROM copy at each MUMS bus, a small bootstrap routine which communicates with the COM-UNIT could be used to load the executive from another module.

In studies of multiprocessor systems, the COM-UNIT would be valuable as a system evaluation tool. It could keep records of resource usage and performance characteristics of the system. The COM-UNIT would also periodically interrogate subsystems to ensure reliability through interactive diagnostic routines. The status of the entire system could then be kept continuously updated.

V. SERIAL PROTOCOL

The hardware support for interprocessor communication is centered around a serial link composed of the following five lines.

1) REQUEST in - An active high signal which indicates that an incoming transmission is pending.
2) REQUEST out - An active high signal to indicate that a request for service is being made.
3) SYNCH - An open collector, active high line used to synchronize data transmission and implement a handshaking scheme.
4) DATA - The DATA line is synchronous and half duplex.
5) CLOCK - The clock is furnished by the COM-UNIT and is the global system clock for data transmission.

One example of a serial protocol is the SIMSER (SIMple SERial) standard proposed by Nicoud [7]. In that scheme, which is intended for asynchronous lines, the receiver provides the clock to the transmitter. If the receiver cannot receive any more information, it simply stops the clock to inhibit the transfer. In this way a kind of handshaking is implemented. Since the COM-UNIT link is synchronous and half duplex, this scheme was inapplicable for this situation. The COM-UNIT protocol was developed as follows.

No assumptions can be made as to the speed at which a TIE can send or receive data, since by definition of MUMS various processor speeds and amounts of DMA contention can be employed. To overcome this variability, a handshake control was needed to indicate ready or not ready status. In addition, since the serial line is synchronous, there must be some means of indicating when a data word is beginning. Two possibilities are internal synch (detection of a flag header on the data line) and external synch (a separate flag line).

The SYNCH line will accomplish both functions. Since both modules must be ready before a transmission is to begin, the AND function is performed on the SYNCH line, which is open collector, active high. The transmitter will release the line when it is ready to send; the receiver will release it when it can receive a data word.
If either module is not ready, the line will be held low, disabling the transmission. The SYNCH line will go high when enabled on the leading edge of the clock to indicate that valid data will be shifted out on the next leading edge. SYNCH remains high for the length of the data word and can therefore be used as a shift enable for a shift register or a FIFO buffer. SYNCH can also be used directly as the SYNDET (SYNch DETect) input for a device such as Intel's 8251 Programmable Communication Interface.

In addition to synchronization and handshaking, the protocol must arbitrate between two intelligent modules, each of which may simultaneously attempt to initiate a conversation. This is done by the REQUEST in and REQUEST out lines.

The basic arbitration strategy is first come, first served. A module may make a request by raising its REQUEST out line on the leading edge of the clock if an incoming request is not pending. The TIE will yield to the COM-UNIT in the event that concurrent requests are made; that is, on the same edge of the clock. If two TIEs are directly connected, both will back off to avoid a possible collision. The local processors will detect from status that both request lines are inactive and make another attempt. Since the two processors are not synchronized, the request should be granted to one of them on the next try.

VI. TIE MODULE

1. General Description

To avoid cabling the entire set of MUMS bus signals to the COM-UNIT, an interface is required. The TIE interfaces directly to the MUMS protocol and transforms data and control to the serial link. Since the structure requires a TIE on each bus, it is desirable to keep this module as simple as possible. For this reason, much of the intelligence first designed into this circuit has been passed back to the local processor. The TIE simply acts as an I/O device to transfer and accept request words. This is consistent with the flexibility allowed by the COM-UNIT controller.
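As a minimal sketch of the serial protocol of Section V, the following Python model treats SYNCH as a wired AND of the two ready conditions and applies the first come, first served arbitration rule for requests raised on the same clock edge. The function names, the string results, and the `remote_is_com_unit` flag are illustrative assumptions; in the design itself these functions are realized in hardware.

```python
def synch_line(transmitter_ready: bool, receiver_ready: bool) -> bool:
    """Model the open collector SYNCH line as a wired AND.

    The line is high only when both modules have released it;
    if either module holds it low, the transmission is disabled.
    """
    return transmitter_ready and receiver_ready


def arbitrate(local_request: bool, remote_request: bool,
              remote_is_com_unit: bool = True) -> str:
    """Resolve requests seen on the same leading clock edge.

    Hypothetical return values:
      "local"   - the local TIE may proceed with its request
      "remote"  - the local TIE must receive the incoming request
      "backoff" - two directly connected TIEs collided; both retry later
      "idle"    - no request is pending
    """
    if local_request and remote_request:
        # A TIE yields to the COM-UNIT; two TIEs both back off.
        return "remote" if remote_is_com_unit else "backoff"
    if local_request:
        return "local"
    if remote_request:
        return "remote"
    return "idle"


if __name__ == "__main__":
    print(synch_line(True, False))
    print(arbitrate(True, True))
    print(arbitrate(True, True, remote_is_com_unit=False))
```

In the direct TIE to TIE case, the "backoff" result corresponds to both local processors reading inactive request lines from status and making another attempt.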
The basic TIE hardware is shown in Figure 2. It consists of a DMA section, control section, and a serial transceiver. These devices are accessed through the following I/O ports by the local processor.

[Figure 2. TIE Module Block Diagram]

The high order six bits of the I/O device number are DIP switch programmable to allow the TIE to be inserted anywhere in the I/O address space. The low order two bits are decoded as follows. Output port 0 is the transmitter buffer register. Port 1 is the word count register. Port 2 is the command register, which has the following format.

   BIT 7   RECEIVE REQUEST
   BIT 6   SEND REQUEST
   BIT 5   START DMA WRITE
   BIT 4   START DMA READ
   BIT 3   INTERRUPT ENABLE
   BIT 2   SET REQUEST OUT

The master control decodes the command register and activates the appropriate module in the TIE corresponding to the active bit. Output port 3 accesses the DMA starting address register. Since a 16 bit address must be specified, two writes are required, with the low byte entered first.

There are two input ports. Port 0 is the receiver buffer register and port 1 is the status register, which appears as follows.

   BIT 7      SERIAL READY
   BIT 6      WORD COUNT ZERO
   BIT 5      REQUEST IN
   BIT 4      REQUEST OUT
   BITS 3-0   TOP 4 BITS OF DEVICE NUMBER

An outgoing request is initiated when the SET REQUEST OUT command is received. The REQUEST OUT line is set if possible, as discussed previously. The processor reads the status of both request lines to determine if it may proceed, as indicated by an active REQUEST OUT and an inactive REQUEST IN. If this is not the case, the host will instruct the TIE to receive the request or make another attempt. If the request has been granted, the request word is sent to the TIE serial section to be transmitted.
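The register formats above can be sketched in Python as bit masks together with a decoder for the status register and the "may proceed" test applied by the local processor. The constant names, the `TieStatus` type, and the helper functions are hypothetical and chosen only for illustration; the bit positions are those listed above.

```python
from typing import NamedTuple

# Command register bits (output port 2); names are illustrative.
CMD_RECEIVE_REQUEST = 0x80   # bit 7
CMD_SEND_REQUEST = 0x40      # bit 6
CMD_START_DMA_WRITE = 0x20   # bit 5
CMD_START_DMA_READ = 0x10    # bit 4
CMD_INTERRUPT_ENABLE = 0x08  # bit 3
CMD_SET_REQUEST_OUT = 0x04   # bit 2


class TieStatus(NamedTuple):
    serial_ready: bool      # bit 7
    word_count_zero: bool   # bit 6
    request_in: bool        # bit 5
    request_out: bool       # bit 4
    device_number_top: int  # bits 3-0


def decode_status(byte: int) -> TieStatus:
    """Split a status register byte (input port 1) into its fields."""
    return TieStatus(
        serial_ready=bool(byte & 0x80),
        word_count_zero=bool(byte & 0x40),
        request_in=bool(byte & 0x20),
        request_out=bool(byte & 0x10),
        device_number_top=byte & 0x0F,
    )


def request_granted(status: TieStatus) -> bool:
    """An outgoing request may proceed only if REQUEST OUT is active
    and REQUEST IN is inactive."""
    return status.request_out and not status.request_in


if __name__ == "__main__":
    status = decode_status(0x9A)
    print(status, request_granted(status))
```

A host routine would write `CMD_SET_REQUEST_OUT` to the command register, read the status register, and transmit the request word only when `request_granted` holds.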
The master control will load the word into the transmitter and activate the serial controller. The serial controller releases the SYNCH line and transfers the word when both modules enable SYNCH. Since the request has already been acknowledged, the next command sent to the TIE should have the SET REQUEST OUT bit reset to remove REQUEST OUT. Succeeding events are software dependent and may comprise a series of requests and answers.

When an incoming request is received, an interrupt is issued to the local processor. A simple request can therefore be used to interrupt another processor, as for a wake up signal.

When a data transfer is to begin, the conversing processors will load their DMA registers and write a start DMA read or write into the command register. Upon completion of the data transfer, an interrupt is again issued by each TIE to its local processor. A new interaction may then be initiated, or the two may simply quit. If a COM-UNIT is employed, the processor which initiated the operation must make another request to inform the COM-UNIT to dissolve the connection.

As stated previously, two TIE modules may be directly connected to form a dedicated path. The operation is basically the same, except that the intermediate conversation with the COM-UNIT is absent. The other difference is that a common clock must be furnished to both TIEs on their CLOCK lines. The connection is made by directly coupling the CLOCK, SYNCH, and DATA lines and cross connecting REQUEST IN and REQUEST OUT.

2. Hardware Description

The TIE hardware is broken up into five sections with a central, master control. Each section will be described below as to its internal operation and relationship with the other sections. The following discussion refers to the TIE hardware logic diagram in Figure 3.

The I/O port logic compares the top six bits of the I/O address and indicates a match when RAV (Read Address Valid) or WAV (Write Address Valid) and WLD (Write Low Data) is valid.
In addition, the I/O line must also be active to indicate that the address is an I/O device number. On a read cycle, a match enables the 2 to 4 decoder, which decodes the low two bits of the address and pulses ACK (Acknowledge) to indicate that valid data is on the data lines. On a write cycle, a signal is sent to the master control to generate a synchronized enable pulse to the address decoder. This is necessary since the shift register is a synchronously loaded device. The decoder outputs provide a gating signal to the proper register to latch in data from the data lines. The enable pulse is also used to generate ACK.

The serial module is loaded passively through the I/O port logic just described, by setting the load function for the shift register from the load enable pulse. The serial control is activated by a send or receive serial signal from the master control. When enabled, the serial control will set the direction of transfer and enable SYNCH. When SYNCH actually goes high, the controller will count until an entire character has been [...]

[Figure 5. COM-UNIT Block Diagram]

[Figure 6. Switching Matrix Organization]

[...] both. The actual operation of the COM-UNIT is dependent upon the software service routines defined for a specific application. The hardware allows the user to specify the data and request formats, conversation protocol, etc. The following discussion describes in general the hardware support and a basic operation sequence.

The request scanner sequentially inspects the REQUEST IN lines until an active one is detected. The scanner updates its status and issues an interrupt to the controller.
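The scanning step just described can be sketched as a round robin search over the request lines. This is only an illustrative software model of what the scanner hardware does; the function name, the `start` parameter (the line at which scanning resumes), and the use of `None` for "no request pending" are assumptions.

```python
from typing import Optional, Sequence

NUM_LINES = 16  # the COM-UNIT handles up to 16 MUMS busses


def scan_requests(request_in: Sequence[bool], start: int = 0) -> Optional[int]:
    """Return the number of the first active REQUEST IN line,
    inspecting lines sequentially from `start` and wrapping around.

    Returns None when no line is active.
    """
    for offset in range(NUM_LINES):
        line = (start + offset) % NUM_LINES
        if request_in[line]:
            return line  # the scanner would now interrupt the controller
    return None


if __name__ == "__main__":
    lines = [False] * NUM_LINES
    lines[3] = True
    lines[9] = True
    print(scan_requests(lines, start=4))
```

Resuming the scan just past the line last serviced gives every line an equal chance of service, which is the round robin behavior; a different priority scheme would choose `start` differently.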
Since the MUMS interrupt scheme is three leveled, two levels can be set aside for the two interrupting devices, i.e. the scanner and the multiplexer. In this way the interrupting device is identified to the processor immediately. The interrupt vector contains status information and the line number of the requesting module.

Although the processor has the option to re-enable scanning to implement a priority scheme other than round robin, generally the processor will instruct the multiplexer to receive the request by sending it the line number and a read command. The requesting module resets its request line after the request has been transferred. The processor will decode the request word and perform the software dictated function. Besides requesting a connection to another MUMS bus, functions such as master slave signalling, message passing as in a mailbox scheme, test and set operations, and system status updates can be defined.

If a request for another processor is received, the target is checked for availability by inspecting its status and the current state of its request line. If that module is also active, the COM-UNIT must receive that request due to the first come, first served arbitration scheme. COM-UNIT software may then send the original request to the target or inspect both requests before deciding upon the proper action by some user defined strategy.

If the target is available, the COM-UNIT will forward the request to it. The processor establishes the connection over an available trunk by writing the coordinates into matrix control. The two MUMS systems now converse on a one to one basis with no COM-UNIT intervention for the remainder of their interaction.

Connections are released in the same manner. The original requester again makes a request as before to inform the COM-UNIT that the connection is no longer needed. Both modules must be informed that the connection is no longer valid.
Since the two modules are still coupled through the matrix, the request sent to break the connection can be interpreted as the path invalid signal for the target module. Alternatively, both modules could wait for an explicit message from the COM-UNIT.

Maximum concurrency is achieved when the number of trunks is one half the number of lines. The system can be expanded, up to a limit, by tying in more lines without increasing the number of trunks and without significantly degrading performance. Requests could then be denied when all trunks are busy. At the point where degradation is significant, more trunks would have to be added. The COM-UNIT could keep a log of queue length for system evaluation purposes.

2. Hardware Logic

A. Request Scanner

The request scanner hardware is shown in Figure 7. This module is accessed by the COM-UNIT processor through an I/O port. The 6 bit address comparator indicates a match in the same manner as discussed for the TIE module. The high order 6 bits of the I/O address are jumper selectable. The low order bit is decoded as follows. Output port 0 is the command register, which has the following format.

   BIT 7      SET REQUEST OUT
   BIT 6      SCAN ENABLE
   BIT 4      LOAD LINE NUMBER
   BITS 3-0   4 BIT LINE NUMBER

Port 1 is used to set the value of the interrupt enable flipflop from bit 0 of the data bus. Input port 0 is the device status register, which also acts as the interrupt vector. It has the following format.

   BIT 7      REQUEST OUT
   BIT 6      REQUEST IN
   BIT 5      INTERRUPT ENABLE
   BITS 3-0   4 BIT LINE NUMBER

[...]

The following functions can be performed through a combination of the command register bits.

1) Send or receive and disable SYNCH afterwards. This is the normal mode of operation.
2) Send or receive but leave SYNCH enabled.
This function is used to transmit a connection complete signal to both modules after a path has been established.
3) Load a line number to inspect a data line.

The control state counter has the state diagram shown in Figure 9. The direction of transfer is set by the control bits in the command register. The bit counter, which counts the number of bits transferred, indicates that it is done when the transfer is complete. The line number written into the command register sets the control for both the SYNCH and DATA multiplexers and demultiplexers.

The SYNCH control is made up of an enable demultiplexer and a disable demultiplexer. Each SYNCH line has associated with it a flipflop which is set and reset by the respective outputs of the demultiplexers. When idle, the SYNCH is held low by the COM-UNIT to disable transmissions. When two modules are conversing, SYNCH is released by setting the appropriate control flipflop.

When a conversation is complete, the original requester must make another request to break the path. It does this by setting its REQUEST OUT line and enabling SYNCH for the transfer of the request. For this reason, the rising edge of REQUEST OUT clocks and resets the flipflop, clamping SYNCH. This must be done in hardware since the COM-UNIT may not be able to respond fast enough to receive the request. Since the other module needs to know when the connection is no longer valid, it must wait for a connection invalid signal, which may be merely the request to break the path.

[Figure 9. Transceiver Control State Diagram (states: IDLE 000, WRITE 001, READ 110, DONE 100, END 011)]

C. Crosspoint Matrix

The matrix card appears in Figure 10 and consists of the following four sections.

a) I/O port and control logic.
b) Line number decoder.
c) Trunk multiplexer and storage.
d) Crosspoint switch matrix.

When a connection is to be established, the two lines must be connected to an inactive trunk. The processor checks the status of the matrix and chooses a free trunk. The connection is established through the following output port.

    BIT 7      MAKE/BREAK
    BIT 6-4    3 BIT TRUNK NUMBER
    BIT 3-0    4 BIT LINE NUMBER

To establish a connection to a trunk, the processor applies the trunk and line numbers, and asserts the MAKE bit to the output port. The logic flow begins when the I/O address on the bus is recognized, as discussed previously. The I/O port logic is identical to that of the request scanner. The data strobe pulse loads the command register from the data lines in the format shown above.

[Figure 10: matrix card schematic.]

The line number selects the proper addressable latch through the 4 to 16 decoder. The addressable latch operates in three modes. On reset, all latches are cleared, disabling all switches. Following this, all decoder outputs are high and the latches are in the memory state. Prior to the data strobe, the select inputs of the latches are set to the proper trunk number. The strobe clocks the data on the MAKE/BREAK line into the addressed latch. The state of the other latches is unaffected. When the strobe is removed, the memory state is again entered. The eight outputs of the addressable latch are always valid and control the bilateral switches. The result is a connection of DATAn and SYNCHn to trunk i. A second such operation with line m as input connects modules m and n. A break operation is exactly the same with the exception of the data written on the MAKE/BREAK line into the latch.

One could attempt to optimize this strategy by removing only one line from the connection.
In that way, the controller would remember that a module is connected to a certain trunk and route its subsequent connections to that trunk. On the average, only one write cycle would then be required.

It is desired to run the serial link at a rate of approximately 500 KBAUD. Twisted pair line with a characteristic impedance of approximately 150 ohms will be used. One would normally terminate the line with a 150 ohm load to eliminate reflections. This strategy is complicated by the characteristics of the 4066 CMOS transmission gates. To minimize the on resistance of the device, 12 volts is used as the supply voltage. This requires that the control voltage also be 12 volts. To accomplish this, the control inputs of the transmission gates are driven by open collector inverters with high voltage outputs pulled up to 12 volts. The fact remains that the on resistance of the device is still about 200 ohms. A load of 1 K is then necessary to ensure a logic zero (0.8 volts) at the receiving end.

The only solution is to do a D.C. analysis to determine the proper resistance values and experimentally verify the speed at which the line can reliably operate. Two conditions must be satisfied: a logic zero is assured, and the line normally sits at a logic one (3.5 volts). The resistance values thus obtained were 2.2 K to +5.0 volts and 5.1 K to ground. At clock rates of about 5 MHz, the up time of a square wave was reduced due to the capacitance of the line. A second iteration with the line voltage normally at 4.0 volts yielded a more symmetric square wave. The final values are 2.2 K to +5.0 volts and 12 K to ground. This results in a load of 1.85 K. At high frequencies the square wave is naturally rounded due to the RC time constant of the line. An input buffer is used to square up the signal. Experimentally, the line switched at frequencies up to 12 MHz. At the desired operating frequency of 500 KBAUD, distortion was negligible.

VIII.
SOFTWARE

As stated previously, the operation of the COM-UNIT depends on software to make requests, establish connections, and set up data transfers. The succeeding routines, written in 8080 assembly language, illustrate one possible mode of operation of the COM-UNIT.

1. TIE Software

The following routine is used to set up a data transfer through a TIE module. The conversation with the COM-UNIT consists of one request and one response. When this routine is entered, the register pair D,E contains the starting address of the buffer in which the incoming data is to be stored. Register B holds the word count and register C contains the request word. After the path through the COM-UNIT is established, the symbolic name of the data block, together with control information indicating that a read operation is to be performed, is sent to the other MUMS processor.

The L register is used as a return code to the calling program and is coded as follows. A return code of two indicates that an incoming request is pending. One indicates that the requested MUMS system was not available. A zero indicates the transfer was completed.

The interrupt service routine will decode the interrupt vector and pass control to routines which handle request sent, data transfer complete, and request received. For this example, the service routines which handle data transfer complete and request sent simply return. The routine which handles request received returns the request in the accumulator.
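As an aid to reading the listing that follows, the return code convention can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the constant and function names are hypothetical, and only the values 2, 1 and 0 and the order of the two decrements are taken from the text and the listing.

```python
# Hypothetical model of the return code held in register L by the READ routine.
# Only the values 2, 1 and 0 come from the text; all names are illustrative.
INCOMING_REQUEST_PENDING = 2
TARGET_NOT_AVAILABLE = 1
TRANSFER_COMPLETE = 0


def read_return_code(request_in_set: bool, response_is_busy: bool) -> int:
    """Follow the order in which READ decrements its return code."""
    code = INCOMING_REQUEST_PENDING   # the flag is initialized to 2
    if request_in_set:                # an incoming request is pending: return early
        return code
    code -= 1                         # first DCR L, before the request is sent
    if response_is_busy:              # the COM-UNIT answered with the busy code
        return code
    code -= 1                         # second DCR L, the transfer goes ahead
    return code
```

A caller would compare the result with the three constants to decide whether to service the pending request, retry later, or use the received data.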
    READ:     MVI   L,02        ; INITIALIZE FLAG
              MVI   A,04        ; SET REQUEST OUT COMMAND
              OUT   CR          ; WITH INTERRUPTS OFF
              IN    STATUS      ; TEST REQUEST IN AND OUT
              ANI   30          ; TO DETERMINE IF YOU
              JZ    READ        ; MAY PROCEED
              CPI   10          ; IF REQUEST IN SET, RETURN
              RNZ               ; WITH RETURN CODE IN L
              DCR   L           ; UPDATE RETURN CODE
              MOV   A,C         ; SEND REQUEST WORD
              OUT   SR          ; TO TRANSMITTER
              MVI   A,48        ; LOAD SEND REQUEST COMMAND
              OUT   CR          ; WITH INTERRUPTS ON
              HLT               ; WAIT UNTIL SENT
              MVI   A,86        ; LOAD RECEIVE REQUEST COMMAND
              OUT   CR          ; WITH INTERRUPTS ON
              HLT               ; WAIT FOR RESPONSE
              CPI   BUSY        ; EXIT IF REQUEST
              RZ                ; IS BLOCKED
              DCR   L           ; UPDATE RETURN CODE
              LDA   OPCODE      ; SEND REQUEST TO MUMS PROCESSOR
              OUT   SR          ; CONTAINING DATA NAME
              MVI   A,48        ; AND READ CODE
              OUT   CR
              HLT               ; WAIT UNTIL SENT
              MOV   A,B         ; LOAD WORD COUNT
              OUT   WC          ; REGISTER
              MOV   A,E         ; SEND LOW BYTE OF
              OUT   AR          ; STARTING ADDRESS
              MOV   A,D         ; SEND HIGH BYTE OF
              OUT   AR          ; STARTING ADDRESS
              MVI   A,18        ; LOAD READ DMA COMMAND
              OUT   CR          ; WITH INTERRUPTS ON
              HLT               ; WAIT UNTIL TRANSFER COMPLETE
              MVI   A,04        ; SET REQUEST OUT FLAG
              OUT   CR          ; TO INFORM COM-UNIT TO BREAK
              MVI   A,BREAK     ; THE CONNECTION
              OUT   SR
              MVI   A,48        ; LOAD SEND REQUEST COMMAND
              OUT   CR          ; WITH INTERRUPTS ON
              HLT               ; WHEN SENT, CONNECTION
              RET               ; IS BROKEN

2. COM-UNIT Software

The following is a portion of the COM-UNIT executive to handle interprocessor communication. The basic strategy when an incoming request is detected is to receive the request and index into a table of service routine addresses to handle that particular request. The status of each line is stored in the table STATAB, which contains either a free code or, if the line is active, the connected line and trunk number. The status of the trunks consists of either a busy or not busy flag and is contained in the table TRKTAB. The interrupt vector from the scanner is located in the B register when this routine is entered. For this example, the request consists of a request code in bits 4-7. The contents of bits 0-3 are defined by the request code.
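Before the listing, the decoding described above can be sketched in Python. This is a model under assumptions, not the executive itself: the function names and the handler dictionary are hypothetical, while the bit positions (REQUEST IN in bit 6 and the line number in bits 3-0 of the vector, the request code in bits 7-4 of the request) follow the formats given in the text.

```python
REQUEST_IN = 0x40   # bit 6 of the scanner status register / interrupt vector
LINE_MASK = 0x0F    # bits 3-0 carry the 4 bit line number


def decode_vector(vector: int):
    """Return the requester's line number, or None for a false interrupt."""
    if (vector & REQUEST_IN) == 0:
        return None
    return vector & LINE_MASK


def split_request(request: int):
    """Split a request byte into (request code, code-dependent argument)."""
    return (request >> 4) & 0x0F, request & 0x0F


def dispatch(request: int, handlers: dict):
    """Select a service routine by request code (handlers is a hypothetical table)."""
    code, argument = split_request(request)
    return handlers[code](argument)
```

For instance, a vector of 45 (hex) identifies line 5 with REQUEST IN set, and a request byte of 3A (hex) selects service routine 3 with argument 10.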
              MOV   A,B         ; VECTOR TO ACCUMULATOR
              ANI   40          ; ISOLATE REQUEST IN BIT
              JZ    RESTORE     ; EXIT IF FALSE INTERRUPT
              MOV   A,B         ; EXTRACT REQUESTER'S LINE
              ANI   0F          ; NUMBER FROM VECTOR
              STA   REQLINE     ; SAVE LINE NUMBER
              CALL  RECEIVE     ; REQUEST CODE RETURNED IN ACC
              LXI   H,SERTAB    ; INDEX INTO SERVICE ROUTINE
              ADD   L           ; ADDRESS TABLE, ASSUMING THAT
              MOV   L,A         ; NO PAGE BOUNDARY IS CROSSED
              PCHL

The RECEIVE subroutine reads a request from the line whose number is contained in REQLINE. The request is returned in REQIN.

    RECEIVE:  LDA   REQLINE     ; FETCH LINE NUMBER
              ORI   10          ; ADD IN RECEIVE SERIAL BIT
              OUT   TCR         ; LOAD TRANSCEIVER COMMAND
    BUSY:     IN    TSR         ; READ STATUS AND LOOP UNTIL
              JP    BUSY        ; REQUEST RECEIVED
              IN    RBR         ; READ REQUEST AND
              STA   REQIN       ; STORE IT
              RAR               ; SHIFT REQUEST CODE INTO
              RAR               ; LOWER 4 BITS TO BE USED
              RAR               ; AS AN INDEX
              RAR
              ANI   0F          ; MASK OFF OTHER BITS
              RET               ; EXIT

This service routine establishes a connection, if possible, between the requester, whose line number is stored in REQLINE, and the target, whose line number is contained in bits 0-3 of REQIN. After the path is established, the target is simply marked as busy. The requester's status word contains the trunk number being employed in bits 4-6, and the target number in bits 0-3.
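The layout of the requester's status word can be illustrated with a short Python sketch ahead of the listing. The helper names are hypothetical; only the field positions (trunk number in bits 4-6, target line number in bits 0-3) are taken from the text.

```python
def pack_status(trunk: int, target: int) -> int:
    """Build a requester status word: trunk in bits 6-4, target line in bits 3-0."""
    if not 0 <= trunk <= 7:
        raise ValueError("trunk number must fit in 3 bits")
    if not 0 <= target <= 15:
        raise ValueError("line number must fit in 4 bits")
    return (trunk << 4) | target


def unpack_status(word: int):
    """Recover (trunk, target) from a requester status word."""
    return (word >> 4) & 0x07, word & 0x0F
```

Because bit 7 of such a word is always clear, the word has the same layout as a matrix command with the MAKE/BREAK bit at zero.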
              LXI   H,STATAB    ; SET POINTER TO STATUS TABLE
              LDA   REQIN       ; FETCH REQUEST
              ANI   0F          ; ISOLATE TARGET NUMBER
              MOV   B,A         ; AND SAVE IT
              ADD   L           ; INDEX TO READ STATUS
              MOV   L,A         ; OF TARGET
              MOV   A,M         ; DETERMINE IF THE TARGET
              CPI   FREE        ; IS AVAILABLE AND PROCEED
              JZ    PROCEED     ; IF SO
              MVI   A,BUSYCODE  ; SEND BUSY RESPONSE
              OUT   TBR         ; TO REQUESTER IF NOT
              LDA   REQLINE     ; LOAD LINE NUMBER AND
              ORI   20          ; SEND SERIAL BIT WITH
              OUT   TCR         ; INTERRUPTS OFF
              JMP   RESTORE     ; EXIT
    PROCEED:  MOV   A,B         ; REGAIN TARGET LINE NUMBER
              ORI   88          ; SET SEND AND LOAD LINE BITS
              OUT   SCR         ; SEND COMMAND TO SCANNER
              IN    SSR         ; IF TARGET IS REQUESTING
              JM    NEXT        ; THE REQUEST MUST BE
              CALL  RECEIVE     ; READ AND DISCARDED
    NEXT:     CALL  CONNECT     ; ESTABLISH PATH
              LDA   REQLINE     ; BUILD CONNECTION COMPLETE
              ORI   COMPLETE    ; CODE AND SEND TO THE
              OUT   TBR         ; CONNECTED MODULES
              LDA   REQLINE     ; INSTRUCT TRANSCEIVER TO
              ORI   60          ; SEND AND ENABLE SYNCH
              OUT   TCR         ; COM-UNIT ACTION IS COMPLETE
              LDA   TRUNKNO     ; FETCH TRUNK NUMBER IN BITS 4-6
              ORA   B           ; ADD IN TARGET NUMBER
              MOV   C,A         ; AND SAVE IT
              LXI   H,STATAB    ; UPDATE STATUS OF REQUESTER
              LDA   REQLINE     ; INDEXING INTO STATUS TABLE
              ADD   L           ; AS BEFORE ASSUMING NO
              MOV   L,A         ; PAGE BOUNDARY IS BEING
              MOV   A,C         ; CROSSED
              MOV   M,A         ; STORE STATUS WORD IN TABLE
              LXI   H,STATAB    ; SET POINTER TO STATUS TABLE
              MOV   A,B         ; FETCH TARGET NUMBER
              ADD   L           ; COMPUTE INDEX OF
              MOV   L,A         ; TARGET ENTRY
              MVI   A,BUSYCODE  ; MARK TARGET SIMPLY
              MOV   M,A         ; AS BUSY
    RESTORE:  MVI   A,40        ; RE-ENABLE SCANNER BY
              OUT   SCR         ; SETTING SCAN ENABLE BIT
              EI                ; ENABLE INTERRUPTS
              RET               ; RETURN

Since the number of trunks is one half the number of lines, a free trunk will always be found. This routine finds a free trunk from the trunk status table TRTAB, and establishes a path through the matrix between the line in REQLINE and its target in REQIN. The number of the employed trunk is returned in TRUNKNO.
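The trunk search and the two matrix writes can be modeled in Python as a reading aid for the listing that follows. This is a sketch under assumptions: the value chosen for TRFREE and all function names are hypothetical, and the explicit error for a full table is an addition of the sketch, since the listing relies on a free trunk always existing. The command word layout (MAKE/BREAK in bit 7, trunk in bits 6-4, line in bits 3-0) is the one given for the matrix output port.

```python
MAKE = 0x80      # bit 7 of the matrix command word selects MAKE
TRFREE = 0xFF    # hypothetical free marker; the source does not give its value


def find_free_trunk(trunk_table: list) -> int:
    """Scan the trunk status table in order and return the first free trunk."""
    for trunk, status in enumerate(trunk_table):
        if status == TRFREE:
            return trunk
    raise LookupError("no free trunk")  # not handled in the original listing


def matrix_command(make: bool, trunk: int, line: int) -> int:
    """Build a matrix port word: MAKE/BREAK, 3 bit trunk, 4 bit line."""
    return (MAKE if make else 0) | ((trunk & 0x07) << 4) | (line & 0x0F)


def connect(trunk_table: list, requester: int, target: int):
    """Return the trunk used and the two command words, target line first."""
    trunk = find_free_trunk(trunk_table)
    trunk_table[trunk] = requester  # the listing stores the requester's line number
    commands = [matrix_command(True, trunk, target),
                matrix_command(True, trunk, requester)]
    return trunk, commands
```

With trunk 0 busy and trunk 1 free, connecting requester line 2 to target line 9 selects trunk 1 and produces the command words 99 and 92 (hex).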
    CONNECT:  MVI   C,-1        ; INITIALIZE TRUNK COUNTER
              LXI   H,TRTAB-1   ; INITIALIZE TRUNK POINTER
    TSEARCH:  INX   H           ; INCREMENT POINTER
              MOV   A,M         ; READ TRUNK STATUS
              INR   C           ; UPDATE TRUNK COUNTER
              CPI   TRFREE      ; LOOP UNTIL A FREE
              JNZ   TSEARCH     ; TRUNK IS FOUND
              MOV   A,C         ; MOVE TRUNK NUMBER TO A REGISTER
              RLC               ; SHIFT TRUNK NUMBER TO
              RLC               ; BIT POSITIONS 4-6 TO BE
              RLC               ; USED WHEN ESTABLISHING
              RLC               ; CONNECTIONS
              STA   TRUNKNO     ; STORE TRUNK NUMBER AND
              MOV   C,A         ; SAVE TEMPORARILY
              LDA   REQIN       ; FETCH TARGET LINE NUMBER
              ANI   0F          ; AND ISOLATE IT
              ORA   C           ; ADD IN TRUNK NUMBER
              ORI   80          ; SET MAKE CONNECTION BIT
              OUT   MCR         ; WRITE COORDINATE INTO MATRIX
              LDA   REQLINE     ; FETCH REQUESTER'S LINE NUMBER
              MOV   M,A         ; UPDATE TRUNK STATUS
              ORA   C           ; ADD IN TRUNK NUMBER
              ORI   80          ; SET MAKE CONNECTION BIT
              OUT   MCR         ; ESTABLISH PATH
              RET               ; EXIT

At the end of an interaction between two MUMS processors, the original requester sends a request to the COM-UNIT to break the connection, as discussed previously. This routine uses the requester's line number in REQLINE to fetch the target line and trunk numbers from STATAB. For this example, no explicit response is made by the COM-UNIT following the request. The target interprets the transfer of the request to break the path as a connection invalid signal.
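The disconnect sequence can also be followed in a small Python model placed ahead of the listing. It is a sketch under assumptions: the values of FREE and TRFREE and the function name are hypothetical, while the order of operations and the reuse of the stored status word as a BREAK command (bit 7 clear) follow the listing.

```python
FREE = 0xFF      # hypothetical free code for a line entry in the status table
TRFREE = 0xFF    # hypothetical free code for a trunk entry


def disconnect(status_table: list, trunk_table: list, requester: int) -> list:
    """Break the requester's path; return the two matrix command words issued."""
    word = status_table[requester]    # trunk in bits 6-4, target in bits 3-0
    trunk_bits = word & 0x70          # trunk number, still in bits 6-4
    target = word & 0x0F
    # Bit 7 is clear in both words, so the matrix treats them as BREAK commands.
    commands = [word, trunk_bits | requester]
    status_table[requester] = FREE
    status_table[target] = FREE
    trunk_table[trunk_bits >> 4] = TRFREE
    return commands
```

If line 2 holds the status word 19 (hex), the model issues 19 and 12 (hex), detaching lines 9 and 2 from trunk 1, and marks both lines and the trunk as free.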
    DISCONCT: LDA   REQLINE     ; FETCH TRUNK AND TARGET
              LXI   H,STATAB    ; NUMBERS FROM STATUS TABLE
              ADD   L           ; AT ENTRY OF REQUESTER
              MOV   L,A         ; THE ENTRY IS IN THE CORRECT
              MOV   A,M         ; FORMAT FOR MATRIX CONTROL
              MVI   M,FREE      ; REQUESTER STATUS UPDATED
              OUT   MCR         ; BREAK CONNECTION
              MOV   C,A         ; SAVE TARGET NUMBER
              ANI   70          ; ISOLATE TRUNK NUMBER AND
              MOV   B,A         ; SAVE IT TEMPORARILY
              LDA   REQLINE     ; REGAIN REQUESTER NUMBER
              ORA   B           ; ADD IN TRUNK NUMBER
              OUT   MCR         ; REMOVE REQUESTER FROM PATH
              LXI   H,STATAB    ; MARK TARGET MODULE AS FREE
              MOV   A,C         ; REGAIN TARGET NUMBER
              ANI   0F          ; ISOLATE IT
              ADD   L           ; INDEX INTO ENTRY
              MOV   L,A         ; FOR TARGET
              MVI   M,FREE      ; MARK TARGET AS FREE
                                ; STATAB IS NOW UPDATED
              LXI   H,TRKTAB    ; SET POINTER TO TRUNK STATUS
              MOV   A,B         ; REGAIN TRUNK NUMBER
              RRC               ; SHIFT TRUNK NUMBER INTO
              RRC               ; BIT POSITIONS 0-3 TO BE
              RRC               ; USED AS AN INDEX INTO
              RRC               ; TRUNK STATUS
              ADD   L           ; COMPUTE INDEXED ADDRESS
              MOV   L,A
              MVI   M,TRFREE    ; MARK TRUNK AS BEING FREE
              JMP   RESTORE     ; JUMP TO RESTORE TO RE-ENABLE
                                ; SCANNING AND RETURN

IX. SUMMARY AND CONCLUSIONS

Since software plays a large part in the operation of the COM-UNIT, further work should begin in that area. The software presented here is just a basic skeleton to operate the COM-UNIT with a minimum of features. An extensive operating system which provides the features suggested in the course of this paper would make the COM-UNIT a valuable tool. The present aim of this work is a microprocessor laboratory course utilizing a number of MUMS systems, incorporated with a COM-UNIT, and supplemented by a mass storage device.

In terms of hardware, there are a number of features which would be desirable but are not included at present due to space and cost constraints. LSI components such as a DMA controller and a Universal Synchronous Receiver Transmitter (USRT) would reduce the complexity of the TIE module considerably. Overlap of serial and DMA operations, error checking and correction, and buffering could then be included with the use of the above mentioned LSI components. At present, the COM-UNIT is being built to handle 16 MUMS systems.
Actual operation will determine the capability and response time of the COM-UNIT processor. If needed, a higher speed processor could then be employed to enhance the operating characteristics.

In summary, the COM-UNIT achieves its primary goal of supporting multiple processors. This capability is attained at a level and cost which is acceptable for a configuration such as this. The additional flexibility of the hardware should provide an easy means of adapting the COM-UNIT to a variety of applications.

X. REFERENCES

1. M. Faiman, A. Weaver, and R. Catlin, "MUMS-A Reconfigurable Microprocessor Architecture," COMPUTER, Vol. 10, No. 1, January, 1977, pp. 13-17.
2. R. Catlin, "MUMS: A Modular, Unified Microprocessor System," MS Thesis, Report No. UIUCDCS-R-76-809, Department of Computer Science, University of Illinois, 1976.
3. E. J. McCluskey, "Micros, Minis, and Networks," Technical Note No. 58, Digital Systems Lab, Stanford University, June, 1975.
4. Bell, Broadley, Wulf, and Newell, "C.mmp: The CM Multiprocessing Computer," Carnegie Mellon Technical Report, August, 1971.
5. Fuller, Siewiorek, and Swan, "Computer Modules," ACM Proceedings, October, 1975.
6. L. C. Widdoes, Jr., "The Minerva Multi-Microprocessor," Technical Note No. 62, Digital Systems Lab, Stanford University, July, 1975.
7. J. D. Nicoud, "Peripheral Interface Standards for Microprocessors," Proceedings of the IEEE, Vol. 64, No. 6, June, 1976, pp. 896-904.
8. T. R. Blakeslee, "Digital Design with Standard MSI and LSI," Wiley, New York, 1975.

BIBLIOGRAPHIC DATA SHEET

1. Report No.: UIUCDCS-R-77-880
4. Title and Subtitle: A COMMUNICATIONS UNIT FOR A MULTI-MICROPROCESSOR NETWORK
5. Report Date: June, 1977
7. Author(s): Gary Kujawinski
8. Performing Organization Rept. No.: UIUCDCS-R-77-880
9. Performing Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
12. Sponsoring Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
16. Abstracts: The Communications Unit is an interconnection scheme for microprocessor systems, structured around a MUMS bus, to provide inter-processor interaction and resource sharing. Data transmission between busses and the COM-UNIT is synchronous serial over 5 lines. The COM-UNIT, which is basically a microprocessor controlled serial crossbar, communicates with each bus through a TIE module, which appears as a DMA device to the local processor. For small systems, direct TIE to TIE connections can be made as dedicated datapaths.
17. Key Words and Document Analysis.
17a. Descriptors: Microprocessor Network Crossbar Matrix
17b. Identifiers/Open-Ended Terms: COM-Unit, MUMS, TIE
18. Availability Statement: Release Unlimited
19. Security Class (This Report): UNCLASSIFIED
20. Security Class (This Page): UNCLASSIFIED