Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/decisionstructur126amon 
 
 DIGITAL COMPUTER LABORATORY 
 
 UNIVERSITY OF ILLINOIS 
 
 URBANA, ILLINOIS 
 
 Report No. 126 
 
 DECISION STRUCTURES FOR RECOGNITION 
 
 by 
 
 Albert H. Amon 
 
 October 1, 1962 
 
 This work was supported in part by Atomic 
 Energy Commission Contract AT(11-1)-1018 
 
 TABLE OF CONTENTS 

 I. INTRODUCTION 

 II. SYSTEMS AND FUNCTIONS 

 III. PROCESSES AND PROCEDURES 

 IV. REPRESENTATION OF STRUCTURE 
     A. Nodes and Channels 
     B. Operations: Tests and Transformations 

 V. PROCEDURE ORGANIZATION IN RECOGNITION STRUCTURES 
     A. Parallel Matching 
     B. Progressive Tracing Structures 
     C. Fore-testing, Echo-techniques, and After-Processing 
     D. Other Structures 

 VI. USES OF REACTIVE MODIFICATION IN DETECTION 
     A. Sensing the Environment 
     B. The Nature of Errors 
     C. Detection and Correction of Errors 
     D. Error Avoidance 
     E. Undermining and Urgency 

 VII. DETERMINANTS OF SHORT RANGE 
     A. Order of Tests 
     B. Segmentation by Function and by Purpose 
     C. Link-Forming 
 
I. INTRODUCTION 
 
 The present report describes concepts underlying complex decision- 
 making systems and how these systems handle certain information processing 
 problems now easy for human beings but difficult for computers. The 
 dominant issue discussed is the recognition of qualities of sensory input. 
 It is likely that other inductive functions of the brain demand a similar 
 analysis. In particular, the concept of a link-forming recognition structure 
 is introduced. Further, the discussion of error detection, correction and 
 avoidance suggests new decision organizations of enhanced reliability. 
 
 The problems of coordinating vast numbers of decisions made 
 simultaneously have not been adequately faced either by psychologists or 
 by logical designers. As a psychologist the writer claims no special 
 competence in computer design. However, understanding these matters may 
 allow the construction of computers of unexpected kinds. It will be 
 obvious that the inspiration for much of this presentation comes from 
 psychology, here presented in a contrived translation into another language. 
 No originality for the concepts presented other than the labor of assembly 
 is claimed. This is an interim report that shows disturbing gaps; in 
 particular, the problems of development and discontinuous change in complex 
 decision-making systems present challenges unanswered here. 
 
 The work on the Illinois Pattern Recognition Computer is a promising beginning. 
 Refinement of the present work may be of use in the later stages of control 
 system design for this machine, and in the formulation of reliable pattern 
 recognition programs. 
 
II. SYSTEMS AND FUNCTIONS 
 
 A complex decision-making system (S) may be characterized by its 
 observable external functioning only, or in addition by its real or hypothetical 
 internal processing. This report will speculate about possible models for 
 internal processing available to complex decision-making systems. A functional 
 description of a system, however, is provided by way of introduction. 
 
 A system is an assemblage of interdependent processors embedded 
 in an environment to perform tasks. 
 
 It is supposed that S receives information from its environment, 
 the input function of S. Input information may either be imposed in a 
 fixed format upon the S by the environment, or the S may be allowed some 
 selectivity in the rate it takes up input or in the order and kind of the input 
 information itself. 
 
 The responses of S may either be outputs to the environment or 
 internally induced changes. Input selection, if any, and response selection 
 are to serve one or both of two major responsibilities of the S: short range 
 reaction to environmental input appropriate to the tactical task; and longer 
 range adaptation, producing changes in the S itself that make its future 
 reactions more effective. Tasks may be externally imposed or set by the S 
 to itself; in either case, when they become firmly enough established to 
 direct the behavior of S they become purposes. 
 
 The reactive function of particular interest here is the recognition 
 of environmental objects. Here reactions may be inconclusive trials, through 
 which additional environmental information is sought, or conclusions: the 
 recognitions called for by the task. 
 
 Conclusions may be evaluated by the environment, or by an internal 
 evaluative process. When evaluations have a metric which allows comparison 
 among concluded tasks (i.e. recognitions) they assign values. If this metric 
 extends to evaluation of the resources expended by the S in obtaining 
 particular values, then one can speak of the cost of obtaining them. Tasks 
 for which it is possible to obtain certain values only at an increase in the 
 cost of obtaining others conflict with the others. 

 2 Underlining is used here not for emphasis but to mark the first occurrence 
 of terms that take on special meanings in this report. 
 
 Values may also be assignable to the achievement of adaptation, as 
 may costs. Particularly difficult conflicts to resolve are those which occur 
 between reactive and adaptive values, the measures of which may be incommensurate. 
 
 Recognition tasks may differ: in the degree to which S is to 
 recognize anything present, identification; in the degree to which S is 
 to recognize only certain searched-for things or events, detection; in the 
 fineness of the necessary discriminations, critical resolution; in the cost 
 of error or required accuracy; in the rate at which recognitions are to be 
 produced, urgency; and in the degree to which this rate is determined by the 
 environment or chosen by the S. S's which can adopt different reactive modes 
 fitted to these or other task differences are flexible. Flexibility is not 
 necessarily the same thing as adaptability; there may be conflict between the 
 two. An example of this conflict would be adaptation cost, which will occur 
 when achieving certain strategic values prevents others from being realized. 
 
 For recognition tasks, objects to be responded to (named) may be: 
 discrete or intergrading; from a finite or infinite set; from a variable or 
 fixed set; associated with spatially contiguous or dispersed aspects of the 
 environment; and themselves variable or not. The correspondence between 
 object and name may or may not be a one-to-one mapping. The correspondence may 
 be such that certain relationships obtaining between the objects hold also 
 between the associated names, in which case the naming is more or less regular. 
 Alternatively, naming may allow no predictability of the name for a novel object 
 from a similar, previously named one. Naming in this case is arbitrary. 
 Objects may or may not be characterizable by separable attributes (in ways that 
 potentially facilitate recognition). In some cases objects to be identified 
 will increase in span, either in the amount of the environment involved or 
 in the time span of relevance, as S itself develops. Context information, 
 either from the domain or from previously stored information, may be utilized 
 by the S to achieve recognition. Changes in the environment, its objects, 
 or the names or other responses called for, may be predictable or unpredictable. 
S may or may not have access to the information necessary for prediction. 
 Further definitions of and distinctions between aspects of task and 
 environment will be made as they contribute to comparisons between various 
 S's. 
 
 III. PROCESSES AND PROCEDURES 
 
 A recognition task calls upon S to produce a conclusion. Trial 
 responses may intervene which ask of the environment supplemental information: 
 either additional samples from the domain, or guidance from another S, or 
 both. Such elicitation of environmental feedback may be continuous or 
 discrete. If the latter, it may be possible to divide performance of the task 
 into problems, each problem representing a unit of processing which may be 
 temporarily closed, pending further input from the environment. If such 
 division is impossible, there is only one problem per task. A problem is 
 reopened after receipt of appropriate information. Points of closing and 
 reopening are transition points and, for a particular problem, make up 
 transition pairs. If a task has only one problem, this may be reopened after 
 evaluation indicates the response given was in error. 
 
 Should S find itself occupied with other tasks, the reopening of 
 a problem may be delayed. If intervening activity does not make reopening of 
 a problem any more costly, the closure is stable relative to that intervening 
 activity. Stability of a closure may depend on the point chosen for reopening, 
 and in this case is a property of the transition pair rather than of the point 
 of closure alone. It will generally be to the advantage of the S if closures 
 are so chosen as to be stable when possible. If unstable closure is for some 
 reason desirable, S may utilize caretaker processes, or residua, to minimize 
 reopening costs. 
 
 Residua serve to maintain the readiness of S to reopen a problem. 
 They may have additional monitoring functions such as maintaining a selective 
 sensitivity to input information, cues, which could significantly affect the 
 chances of success for the reopened problem. A residuum may monitor competing 
 current tasks (committed processes) or even competing residua, for its own 
 success-promising cues and the failure-promising cues of the competing 
 processes. 
 
The processing relevant to any problem is its procedure. Processes 
 involved in a procedure may be variously grouped into subprocedures. If 
 procedures of several problems make common use of certain subprocedures, 
 these subprocedures are shared. The sharing of common subprocedures is one way in which an 
 S can realize values simultaneously which might require duplicated or serial 
 effort in a less effective S. 
 
 S may simultaneously perform several subprocedures which may, but 
 need not, be parts of the same procedure. This is parallelism. A 
 subprocedure where no incremental progress is possible without corresponding 
 incremental progress in other subprocedures is continuously dependent on 
 these other processes. Subprocedures continuously interdependent on each 
 other are essentially parallel. Whole procedures, if essentially parallel, 
 carry out inseparable problems for which there is no distinction between 
 sharing and parallel processing. Subprocedures, not essentially parallel, 
 may be effectively carried out in parallel by S. S's differ in the number 
 and kind of subprocedures that can be processed in parallel. 
 
 Parallel processing may present difficulties of coordination, the 
 scheduling of subprocedures so that economy of simultaneous operation is not 
 lost in confusion or error. Coordination difficulties may sometimes be 
 reduced: by standardization and simplification of the operations involved; 
 by distribution of control to separated units each with independent access 
 to information necessary for appropriate decision; or conversely, by 
 centralization of control when this is more effective. Better coordination 
 sometimes results from a combination of these methods, for instance by 
 appropriate selection of which decisions should be centralized, which 
 distributed, and to what extent. Commensurable value measures for 
 subprocedures may also aid coordination. 
 
 It is sometimes of value to perform in series subprocedures 
 previously performed in parallel, either by completion of one before 
 beginning another, or by alternation between the two. Such alternation may 
 be of value when there are environmental or internal conditions that make 
 the performance of one momentarily more efficient than the other, and when 
 these conditions change over intervals too brief to allow selection of a 
 particular most efficient order of performance. Alternation may also be 
 suggested when the relative difficulty to be encountered in performance 
 of a subprocedure is not predictable. Here information resulting from 
 alternation may facilitate planning. Alternation is a fine grained 
 example of the use of closure and reopening, the effectiveness of which 
 will often depend on the existence and detection of stable closure points 
 or transition pairs in the subprocedures alternated. Development of further 
 considerations of stability and residua formation will be deferred until 
 later. 
 
 IV. REPRESENTATION OF STRUCTURE 
 
 A. Nodes and Channels 
 
 The structure of S at a given time is its internal constitution: 
 its available facilities and the ways in which these can be organized. 
 Structure includes immediate reactive modifications produced in S, in 
 service of current purposes, but does not include longer range adaptive 
 changes. The structure of an S may be unknown to the observer, but at 
 any one time it has only one structure. 
 
 The functioning of S may be studied by the use of descriptive 
 representations, of which there may be many seemingly consistent with what 
 is known of S's structure. 
 
 For comfort in visualization, representations discussed here will 
 employ facilities distributed in space. The unit of organization will be the 
 node, regarded as having a complex of operations which process information 
 and control informational access to, and exit from, the node. Information 
 is seen as conveyed between nodes either by transfer along discrete 
 channels or by diffuse transmission to all nodes sensitized to its reception. 
 Because it is convenient to conceive of communication between nodes as 
 confined to discrete messages, processes essentially parallel to each other will 
 be assigned to the same node, as their coordination requires continuous 
 intercommunication. Representation in terms of spatially localized nodes, 
 intermittently communicating with each other, is generally applicable to 
 S's. However, an S, extremely tightly coordinated, with every process 
 essentially parallel to every other, might be describable as only one node, 
 an analysis of limited promise. The problem is one of degree; by ignoring 
 weak interrelationships it may be possible to describe predictable performance 
 of an S for interesting spans of time, and with adequate accuracy. 
 
 The choice of which processes are to be grouped into a node and 
 which are to be separated into different nodes, is constrained only by the 
 above convention concerning essential parallelism. Internally complex nodes 
 may be describable in terms of subnodes and their connecting channels, 
 as complicated as the entire representation of other S's. At times it 
 may be of value to consider S's as nodes of more comprehensive S's. Later 
 discussion will argue that some kinds of grouping are more useful than others, 
 and more likely to correspond to what naturally occurs in the structure of 
 the S's studied. 
 
 Messages may involve much or little information. Substantive 
 messages carry information about environmental objects or recollections of 
 such objects from memory. Control messages carry information, usually less 
 complex, which may affect operations at various nodes. This distinction 
 will not always be meaningful, as the contents of a substantive message may 
 be progressively disassembled into abstract attributes of an object, and 
 these fragments transformed into control messages directing the processing 
 of subsequent substantive messages. As a message is transferred from node 
 to node it may have associated with it tags, or control messages, prepared in 
 one node for use by another. 
 
 B. Operations: Tests and Transformations 
 
 Operations at a node may be roughly distinguished as either tests 
 or transformations. Tests select the information they operate upon in terms 
 of its origin or content, and control the destination to which this 
 information is to be conveyed. Tests control the path followed by information 
 within S. In this path control, then, tests are the basic decision-making 
 elements of the S. A pure test directs, but does not alter, the information 
 upon which it operates. A transformation, on the other hand, alters, but 
 does not direct, the information on which it operates. The distinction is 
 ultimately an arbitrary one, as path choice itself contains information, 
 and what is done by a transformation in one representation of an S might be 
 accomplished by a test in another. 
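
 The distinction between tests and transformations, and its ultimate 
 arbitrariness, can be illustrated with a minimal sketch. Everything named 
 here (the message format, the "size" attribute, the cut-off of 10) is a 
 hypothetical choice made for illustration and is not taken from the report. 

```python
from typing import Dict, Tuple

Message = Dict[str, float]  # assumed message format: attribute name -> value


def size_test(message: Message) -> str:
    """Pure test: picks an output channel and leaves the message unaltered."""
    return "large" if message["size"] >= 10.0 else "small"


def rescale(message: Message) -> Message:
    """Pure transformation: returns an altered copy and makes no path choice."""
    altered = dict(message)
    altered["size"] = message["size"] / 10.0
    return altered


def node_with_test(message: Message) -> Tuple[str, Message]:
    """Representation 1: the decision is carried by the channel chosen."""
    return size_test(message), rescale(message)


def node_with_transformation_only(message: Message) -> Message:
    """Representation 2: the same decision is written into the message itself."""
    altered = rescale(message)
    altered["is_large"] = 1.0 if message["size"] >= 10.0 else 0.0
    return altered
```

 In the first representation the outcome is expressed by the path taken; in 
 the second the same information travels as message content, which is the 
 sense in which the distinction depends on the representation chosen. 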
 
For many purposes the distinction between input selecting and 
 output selecting tests is an unnecessary one. Thus, it will often be 
 possible to replace the input selection function of one test by the output 
 selection functions of preceding tests. Cases where the distinction is 
 important will be met. But for present purposes the term "test" will mean 
 an output selective test, unless otherwise specified. 
 
 Tests range in resolution from unequivocal selection of one 
 alternative, through sequence selection enlightened by information concerning 
 the probability distribution over outcomes, to arbitrary sequence selection. 
 
 Tests may differ also in the amount of time necessary for their 
 completion. Sometimes, for progressive tests, the resolution of the outcome 
 may be optional depending on transfer thresholds, on the time allowed for 
 completion and on the amount of information utilized. In such cases the 
 resolution required may differ from task to task and be controlled by 
 information conveyed from other nodes. Changes in transfer threshold (in 
 particular, differential threshold changes) can make some outcomes easier to 
 test while making others more difficult, and therefore create a bias toward 
 particular outcomes. Again, the order in which the outcomes are to be tried 
 may be influenced from other nodes. 
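
 A progressive test with transfer thresholds might be sketched as follows. 
 The accumulation of evidence by simple addition, the function name and the 
 tie-breaking rule are all assumptions made for illustration; the report 
 does not specify a mechanism. 

```python
from typing import Dict, List, Optional


def progressive_test(evidence_steps: List[Dict[str, float]],
                     thresholds: Dict[str, float],
                     time_allowed: int) -> Optional[str]:
    """Accumulate evidence step by step; transfer once an outcome passes its threshold."""
    totals = {outcome: 0.0 for outcome in thresholds}
    for evidence in evidence_steps[:time_allowed]:
        for outcome, amount in evidence.items():
            totals[outcome] += amount
        passed = [o for o in totals if totals[o] >= thresholds[o]]
        if passed:
            # Assumed rule: transfer on the outcome that exceeds its threshold by most.
            return max(passed, key=lambda o: totals[o] - thresholds[o])
    return None  # unresolved within the time allowed


steps = [{"A": 1.0, "B": 0.8}] * 5
unbiased = progressive_test(steps, {"A": 3.0, "B": 3.0}, time_allowed=5)
biased = progressive_test(steps, {"A": 3.0, "B": 1.5}, time_allowed=5)
hurried = progressive_test(steps, {"A": 3.0, "B": 3.0}, time_allowed=2)
```

 With equal thresholds the better supported outcome is selected; lowering 
 the threshold for one outcome alone (a differential threshold change) 
 biases the test toward it, and shortening the time allowed leaves the test 
 unresolved. 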
 
 Transformations may variously affect messages operated upon. 
 Information may be reproduced and the message so duplicated sent simultaneously 
 over several paths for parallel processing. A transformation may affect 
 the message irreversibly, in which case reproduction preserves the original 
 message. Information may be substituted for message information in a regular 
 or arbitrary way (the same terms as used earlier for the naming transformation). 
 The substitution may be complete or partial. A transformation may be 
 fabricative in that it adds to a message information for some reason missing. 
 An example of fabrication is the filling in of the retinal blind spot in such 
 a way that a human S normally remains unaware of the addition. 
 
 Transformations may disassemble a message into submessages, each 
 containing only part of the original information; others reassemble such 
 a divided message after various vicissitudes of processing. 
 
Transformation processes are not limited to change of substantive 
 messages. Change of operations at a node may occur also, as a result of 
 transformations triggered by control messages. Control modifications 
 possible include alteration of transfer thresholds at nodal tests; 
 alteration of the arrangement of sequencing nodes; and opening and closing 
 of various channels, even to complete bypassing of certain tests and 
 transformations. 
 
 Of particular interest for contextual synthesis and for error 
 correction is the observation that certain transformations may be 
 incompatible, if environmental laws make it impossible for two particular 
 messages to be such that one could be transformed one way, the other 
 another way. An example of such incompatible transformations would be 
 making one of two objects larger, and the other unaltered or smaller, when 
 they are known to be at the same distance from the observing S and their 
 relative sizes known and fixed. 
 
 V. PROCEDURE ORGANIZATION IN RECOGNITION STRUCTURES 
 
 In this section some alternative models of recognition structures 
 will be described briefly. Additional detail will be supplied in later 
 sections, where there will be relaxation of simplifying assumptions 
 adopted here. Considerations of relative effectiveness of these structures 
 and of possible modification strategies will also be postponed until then. 
 
 A. Parallel Matching 
 
 For the moment it is assumed that each of the objects to be 
 recognized belongs to only one of a number of discrete types. An input 
 message regarding the object is a sample. The recognition procedure consists 
 then of pairing each sample with the name of a particular type. 
 
 A very simple recognition procedure involves simply comparing the 
 sample with stored information characterizing each type. This procedure 
 might be quite efficient if there were facilities for simultaneously 
 matching each sample against each possible type. Although some parallelism 
 in matching may be possible, provision of such facilities will generally be 
 prohibitive if the number of types to be recognized is large. More 
 efficient recognition procedures will select more likely types before 
 matching is attempted. Each type may correspond to objects and samples 
 widely differing in characteristics, many irrelevant to the discrimination. 
 Attempted matching of all possible type variations would greatly tax 
 matching facilities. It may be more efficient in these situations to 
 transform the sample into as invariant a form as possible before attempting 
 a match. As many types may require the same transformations, and the 
 selection among possible transformations may be made by identical tests, it may 
 be inefficient to assign independent facilities for transformations for 
 each type. The limitations of a purely parallel matching structure have 
 been discussed here as if matching were integral to the recognition. 
 Recognition structures, however, may not use matching at all. 
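
 The cost argument against pure parallel matching can be made concrete with 
 a small sketch. The attribute dictionaries, the letter types and the 
 counting of "facilities" as one comparison per stored attribute are 
 illustrative assumptions only. 

```python
from typing import Dict, List, Tuple

Attributes = Dict[str, object]


def parallel_match(sample: Attributes,
                   type_memory: Dict[str, Attributes]) -> Tuple[List[str], int]:
    """Match the sample against every stored type.

    Returns the matching type names and the number of attribute comparators
    that would have to be provided to do all the matching at once.
    """
    matches: List[str] = []
    facilities = 0
    for type_name, stored in type_memory.items():
        facilities += len(stored)  # one comparator per stored attribute
        if all(sample.get(attr) == value for attr, value in stored.items()):
            matches.append(type_name)
    return matches, facilities


# Hypothetical type memory for three letter shapes.
TYPES = {
    "A": {"closed_top": True, "crossbar": True},
    "H": {"closed_top": False, "crossbar": True},
    "O": {"closed_top": True, "crossbar": False},
}
result = parallel_match({"closed_top": True, "crossbar": True, "slant": 0.1}, TYPES)
```

 The facilities required grow in proportion to the number of types, which 
 is the sense in which provision for fully simultaneous matching becomes 
 prohibitive when many types are to be recognized. 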
 
 B. Progressive Tracing Structures 
 
 Structures of greater organizational depth will now be considered. 
 The organizational relationship established between nodes by interconnecting 
 channels is the connectivity of the structure. Transient organizations 
 which may occur as certain channels are activated or closed off are 
 linkages. First, divergent detection trees will be discussed. "Tree" here 
 is used as in the theory of abstract graphs, to designate structures 
 with no more than one path connecting any two nodes.³ 
 
 In progressive tracing structures samples are introduced only 
 at the root node, the node of rank zero. Procedures continue at successive 
 nodes, the rank of each of which is the rank of the immediately preceding 
 node, plus one. The termini of the tree are the nodes of highest rank on 
 their respective branches. 
 
 The simplest progressive tracing models use undistributed sample 
 structures. In these the sample is transferred from the critical node at 
 which the sample is being tested to only one of the nodes of next higher 
 order. Terminal nodes of the tree correspond to types to be distinguished. 
 Any information in the sample not required for the tests used at preterminal 
 nodes is supplementary. Termini which have associated with them information 
 about the types, other than that used in the preterminal tests, have 
 type memories. Terminal checking trees have tests assigned to terminal nodes 
 which compare the type memories with the supplementary information of the sample. 

 3 As used in this report, information flow in a tree need not be exclusively 
 from root to termini. For example, in later discussing echo techniques and 
 error retraces, signal flow from terminal toward the root is explicitly 
 introduced. 
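
 An undistributed sample structure with terminal checking might be sketched 
 as below. The class name, the yes/no tests and the small three-type tree 
 are hypothetical and serve only to illustrate rank, supplementary 
 information and type memory. 

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

Sample = Dict[str, object]


@dataclass
class TraceNode:
    rank: int
    test: Optional[Callable[[Sample], str]] = None  # None marks a terminus
    branches: Dict[str, "TraceNode"] = field(default_factory=dict)
    type_name: Optional[str] = None
    type_memory: Sample = field(default_factory=dict)


def trace(root: TraceNode, sample: Sample) -> Tuple[Optional[str], bool, int]:
    """Pass the sample to one node of the next rank at a time.

    Returns (type named, whether the terminal check passed, rank reached).
    """
    node = root
    while node.test is not None:
        node = node.branches[node.test(sample)]
    # Terminal check: compare type memory with the sample's supplementary information.
    confirmed = all(sample.get(k) == v for k, v in node.type_memory.items())
    return node.type_name, confirmed, node.rank


def yes_no(attribute: str) -> Callable[[Sample], str]:
    """Build a two-outcome test on one attribute of the sample."""
    return lambda sample: "yes" if sample[attribute] else "no"


ROOT = TraceNode(0, yes_no("closed_top"), {
    "yes": TraceNode(1, yes_no("crossbar"), {
        "yes": TraceNode(2, type_name="A", type_memory={"legs": 2}),
        "no": TraceNode(2, type_name="O", type_memory={"legs": 0}),
    }),
    "no": TraceNode(1, type_name="H", type_memory={"crossbar": True, "legs": 2}),
})
```

 Here "legs" is supplementary on every branch, and "crossbar" is 
 supplementary on the branch leading to "H", where no preterminal test 
 consults it; a sample reaching "H" without a crossbar fails the terminal 
 check. 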
 
 The next simplest progressive tracing models are distributed 
 sample structures. In these the sample is reproduced at all or some nodes 
 so that the sample may be distributed simultaneously to more than one node 
 of higher rank. It is still assumed that only one of the terminal types 
 corresponds to any object to be recognized. Distributed sample structures 
 have tests assigned to nodes to reject the sample, sooner or later, along 
 all branches other than the one leading to the correct terminus. If 
 several of the reproduced samples reach termini, the terminal tests should 
 eliminate all but one. If the sample is sent up all branches and rejection 
 occurs only at terminal nodes, this model reduces to the parallel matching 
 model. Distributed sample structures can also approach the undistributed 
 sample model, if only a few nodes transfer the sample to more than one 
 node, and for all nodes, except on the correct branch, rejection is quick. 
 As a model approaches the undistributed sample extreme, it is said to 
 be more selective. 
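
 The range between parallel matching and the selective extreme can be 
 sketched by counting how many nodes must process a copy of the sample. The 
 tree, the attribute names and the `selective` switch are assumptions made 
 for illustration. 

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, object]


@dataclass
class DNode:
    name: str
    accepts: Callable[[Sample], bool]
    children: List["DNode"] = field(default_factory=list)


def distribute(node: DNode, sample: Sample) -> Tuple[List[str], int]:
    """Return (termini reached, number of nodes that processed a copy)."""
    if not node.accepts(sample):
        return [], 1                      # copy rejected at this node
    if not node.children:
        return [node.name], 1             # copy survived to a terminus
    reached: List[str] = []
    visited = 1
    for child in node.children:           # reproduce the sample for each branch
        sub_reached, sub_visited = distribute(child, dict(sample))
        reached += sub_reached
        visited += sub_visited
    return reached, visited


def build(selective: bool) -> DNode:
    """Same four termini; `selective` adds rejection tests at rank one."""
    always = lambda s: True
    curved = (lambda s: bool(s["curved"])) if selective else always
    straight = (lambda s: not s["curved"]) if selective else always
    return DNode("root", always, [
        DNode("curved", curved, [
            DNode("O", lambda s: s["curved"] and s["closed"]),
            DNode("C", lambda s: s["curved"] and not s["closed"]),
        ]),
        DNode("straight", straight, [
            DNode("L", lambda s: not s["curved"] and s["strokes"] == 2),
            DNode("I", lambda s: not s["curved"] and s["strokes"] == 1),
        ]),
    ])


SAMPLE = {"curved": True, "closed": False, "strokes": 1}
```

 Both structures name the same type, but the one that rejects only at its 
 termini occupies every node, as in parallel matching, while the more 
 selective one leaves part of the tree idle. 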
 
 Progressive tracing models need not be limited to models 
 completing one recognition procedure before beginning another. In 
 undistributed sample structures only one of the processing nodes will be 
 active at any time. By giving each sample an environmental location tag, 
 and by providing buffering facilities to prevent interruption of active 
 nodal processes by incoming samples, multiple sample structures are 
 possible. The location tag provides for the reassembly of the objects 
 recognized in their environmental order. Multiple sample structures 
 can either maintain a constant load, by admitting a new sample to the tree 
 for each sample achieving recognition, or take up samples in batches, 
 perhaps from an environmentally related object set. This technique can 
 facilitate recognition by use of contextual information, should some of 
 the types first recognized in the multiple sample structure supply 
 constraining information fed back to nodes still processing other traces. 
 (For some applications, see Sec. VI D.) 
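
 A constant-load multiple sample structure can be simulated in a few lines. 
 The sketch assumes, purely for illustration, that each sample needs a known 
 number of node steps and that all active samples advance one node per time 
 step; the function names are hypothetical. 

```python
from collections import deque
from typing import List, Tuple

Tagged = Tuple[int, str, int]  # (location tag, type to be named, node steps needed >= 1)


def run_constant_load(samples: List[Tagged], load: int) -> List[Tuple[int, str]]:
    """Keep at most `load` samples in the tree; return results in completion order."""
    waiting = deque(samples)          # buffer holding samples not yet admitted
    active: List[list] = []           # [tag, name, steps_left] for samples in the tree
    finished: List[Tuple[int, str]] = []
    while waiting or active:
        while waiting and len(active) < load:   # admit a sample for each one completed
            tag, name, steps = waiting.popleft()
            active.append([tag, name, steps])
        for item in active:                     # every active sample advances one node
            item[2] -= 1
        finished += [(tag, name) for tag, name, left in active if left == 0]
        active = [item for item in active if item[2] > 0]
    return finished


def reassemble(finished: List[Tuple[int, str]]) -> List[str]:
    """Use the location tags to restore environmental order."""
    return [name for _, name in sorted(finished)]


done = run_constant_load([(0, "C", 3), (1, "A", 1), (2, "T", 2)], load=2)
```

 Recognitions complete out of order because traces differ in length; the 
 location tags restore the environmental order afterwards. 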
 
 Progressive tracing structures may have certain nodes which 
 are always sequencing nodes to provide for an orderly testing of 
 alternatives. Sequencing nodes may be regarded as nodes for which an appropriate 
 test has yet to be found. A node may revert to a sequencing operation 
 if tests assigned it are powerless because a particular sample lacks 
 the attributes tested. Nodes always sequencing in function generally 
 are better placed near the termini of a structure to allow search among 
 possible alternatives to be more readily accomplished. Later discussion 
 of echo-processing, retracing and error detection will illustrate reasons 
 why additions to a structure are more readily made at high ranking nodes. 
 
 Where a sequencing node, or one that must frequently serve as 
 such due to sample insufficiencies, must be placed in a low ranking 
 position, it may be of value to provide for relative, preterminal closure. 
 This closure may be accomplished by the technique of admittance testing: 
 tests to determine whether or not the correct branch has been entered. 
 If such testing is distributed over several successive nodes, relative 
 closure is said to be completed at the highest ranking node of the 
 admittance testing series. Of course admittance tests could be performed 
 after tests at the sequencing node, but this might not allow efficient 
 simultaneous testing in parallel distributed S models. In any model, 
 testing at later nodes may be facilitated by lower-ranking admittance 
 testing, as in each branch there may afterwards be less to test than 
 previously at the original ambiguous node. Also, admittance testing may 
 be of value as a continuous check in structures, no one of the nodes of 
 which may be conspicuously uncertain. (See Sec. VI D, error location.) 
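
 A sequencing node followed by admittance tests might look like the sketch 
 below. The branch names and attributes are hypothetical, and the sketch 
 assumes a single admittance test per branch rather than a series 
 distributed over several nodes. 

```python
from typing import Callable, Dict, List, Optional, Tuple

Sample = Dict[str, object]
AdmittanceTest = Callable[[Sample], bool]


def sequencing_node(sample: Sample,
                    admittance_tests: Dict[str, AdmittanceTest],
                    order: List[str]) -> Tuple[Optional[str], int]:
    """Try branches in the given order; admittance tests confirm the branch entered.

    Returns (branch admitted or None, number of branches tried).
    """
    for tried, branch in enumerate(order, start=1):
        if admittance_tests[branch](sample):
            return branch, tried      # relative closure: the correct branch was entered
    return None, len(order)           # no branch admits the sample


ADMITTANCE = {
    "digits": lambda s: bool(s["has_loop"]),
    "letters": lambda s: bool(s["has_stem"]),
    "marks": lambda s: bool(s["is_small"]),
}
comma = {"has_loop": False, "has_stem": False, "is_small": True}
default_order = sequencing_node(comma, ADMITTANCE, ["digits", "letters", "marks"])
revised_order = sequencing_node(comma, ADMITTANCE, ["marks", "digits", "letters"])
```

 The node itself makes no discriminating test; the order of trial, which 
 could be set by control messages from other nodes, determines how many 
 branches are entered before admittance succeeds. 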
 
 C. Fore-testing, Echo-techniques, and After-Processing 
 
 Structure connectivities discussed have been simple trees with 
 unidirectional passage of information from root to termini. Sample 
 disassembly can occur at the root or a subsequent node, and partial samples 
 can be sent ahead in the tree to allow partial testing of these forerunners. On 
 the basis of the outcomes of forerunner testing, the processing of the 
 entire sample might be facilitated (i.e. by the deletion of tests 
 upstream). Advantages of forerunner tests include: trial anticipatory 
 tests by matching against more likely types, freeing the probably 
 irrelevant parts of the structure for multiple sample processing, and 
 enabling a general Gestalt to form, in terms of which the progress of 
 subsequent more detailed testing might be judged and possible errors 
 forestalled. If forerunner processing produces a simplified structure 
 by making some tests unnecessary through the closing of channels, it is 
 fore blocking. 
 
 
In some situations partial matching may be more desirable than 
 complete matching, either in speed or in permitting more completely 
 parallel processing. If the unidirectional constraint on the flow of 
 information in the tree is relaxed, echo techniques, whereby the outcome 
 of a partial matching attempt is conveyed back to the echo inducing node 
 (often but not necessarily the root), can be imagined by making use of 
 backwardly propagated, or reflected, echo messages. Back blocking is the 
 closing of all channels to higher ranking nodes other than those over 
 which the echo was received. Because of the partial nature of echo 
 matching, a number of termini will in general reflect echoes. In such 
 cases back blocking produces a reduced tree in which necessary critical 
 testing can be accomplished with fewer decisions. Echoes may be reflected 
 from termini only, or from intermediate nodes where relative closure is 
 possible (as by admittance testing). Echo structures can also differ in 
 the way sample information is conveyed to potentially reflecting nodes. 
 This information need be no more complete than required by the partial 
 matching processes which determine what will be reflected. If this 
 information is transferred through the tree by the normal sample message 
 channels, it becomes a special kind of forerunner processing, where no 
 decisions are made until reflecting nodes are reached. If intermediate 
 message forerunner testing is used to limit the nodes from which 
 reflections will be possible, then echo and forerunner processing are 
 combined. However, it is not necessary that the sample information be 
 so transferred to the reflecting nodes. Direct input to reflecting nodes 
 through special, simple input channels, or even transmission may be 
 employed. 
 
 The echo signal reflected back down the tree⁴ may be nothing 
 more than a pulse necessary for back blocking. Echo magnitude can be 
 used to encode changes in the order of sequencing nodes or to change 
 test thresholds. In structures other than trees, the echo might acquire 
 information as to path taken as it passed each node on the way back. 
 More complicated possibilities include the progressive echo technique. 
 Let the echo contain information (e.g. from terminal type memory) other 
 than that transmitted to it for partial matching. Matching this transformed 
 echo to the sample at the echo inducing node could confirm the echo, 
 discriminate the more promising from among candidate returned echoes, or 
 even start a new echo cycle involving more comprehensive, or in any case 
 different, partial matchings. 

 4 That is, toward the root of the tree. 
 
 Both forerunner and echo techniques involve partial testing on
 partial samples, disassembled and distributed in the structure. They
 produce reduced trees which are intended to allow a simplification of
 subsequent critical testing. Auxiliary testing processes may lag behind
 rather than lead the critical processes. An example is after-testing,
 in which the testing processes of a node continue to process the same
 sample after it has already been passed on to subsequent nodes. In this
 case there is a difference between transfer thresholds of a test and more
 rigorous closing thresholds. After-testing will be considered later
 (in Section VI) in connection with error avoiding techniques.
 
 Tree simplification, either by fore or by back blocking, has
 been described as though limited to omission of nodes and closing of
 channels. However, internal nodal test and transformation structure can
 also be modified to simplify the critical procedure. Thus that part of
 a nodal operation that serves only to distinguish between alternative
 blocked channels can be eliminated, as can any part which distinguishes
 only between blocked and unblocked channels. Operations distinguishing
 between still unblocked channels alone need be kept. However, a limit
 may soon be reached in the effectiveness of providing many alternative
 sets of operations for each node, one set for any combination of possibly
 unblocked channels. It should be noted that improvements in efficiency,
 possible by these structure-simplifying techniques, may be accompanied
 by an increased vulnerability of the S to error, should the correct type
 belong to one of the eliminated parts of the structure.
 
 D. Other Structures 
 
 Divergent trees so far considered produce multiple outcomes only
 as different objects are recognized. Different in purpose are
 attribute-isolating trees, where attributes of one object rather than
 types are distinguished. In such a structure nodal tests control the
 disassembly of a sample into sub-samples, passed on to different
 branches. Ultimately, each attribute is identified at a terminus. A
 subsequent convergent structure can reassemble the idealized sample into
 a form appropriate for a later recognition structure. The two structures
 together perform a complex transformation of the sample.
 
 Structure connectives discussed have been simple trees.
 Retaining this limitation on connectivity for channels conveying
 substantive messages, but relaxing it for control connectivity, more
 interesting organizations become possible. Thus information resulting
 from operations in other parts of the structure (i.e. not necessarily on
 the same branch) can be utilized in influencing operations at a node.
 Procedures having achieved, or nearly achieved, recognition could in this
 way feed back contextual information to supply helpful constraints for
 more slowly progressing procedures. Control information need not arise
 within the decoding tree. Control information from outside the decoding
 tree may modify the tree to facilitate special detection problems
 [for example, by affecting sensitivities]. Particular uses of this
 environmental control will be examined later, after some possible
 mechanisms have been suggested here.
 
 The possibility of utilizing control information derived from
 parts of the S outside the immediate detection structure has been
 mentioned. One method of some promise stems from the observation that
 whether types are to be regarded as different or, alternatively, as
 essentially the same may depend on the current, dominant purpose of the
 S. Thus classes of types may be purpose-equivalent. The echo techniques
 of back blocking may simplify the tree detection by bypassing all tests
 serving only to distinguish between purpose-equivalent types.
 
 VI. USES OF REACTIVE MODIFICATION IN DETECTION 
 
 The utilization of reactive modification in the service of
 detective purposes of the S will be considered here. For the sake of
 exposition relatively transient changes will be distinguished from
 adaptive changes which, though not necessarily realized immediately,
 yield an enduring, improved effectiveness of the S. It is assumed for
 this report that the S's studied are not subject to pathologically
 determined changes towards lesser effectiveness, or subject to
 degradational changes such as ageing. These assumptions confine the
 discussion to fictional S's. All reactive changes then are initiated
 to improve the S's effectiveness, however unfortunate their consequences
 may prove to be. Modifications to be considered in this section are
 sensing, correction and avoidance of error in detection problems.
 
 A. Sensing the environment 
 
 An S, the task of which is to detect occurrences of a limited
 class of objects, may treat all other objects as members of an inclusive
 equivalence class of the irrelevant. By eliminating operations and
 channels concerned with the recognition of irrelevant types, the 
 decoding structure can often be considerably simplified, for example by 
 back blocking. In general, however, irrelevant samples will enter S. 
 Therefore operations distinguishing between relevant and irrelevant 
 types can not be eliminated, as they can when fore testing or echo 
 testing a particular sample from the full environment. 
 
 In the detection mode S operates on many samples, most of 
 which are only partially processed. Samples may be discarded from 
 further processing whenever an operation attempts to direct them to 
 a blocked channel. These channels may be regarded as discard channels
 rather than as being physically blocked. The high rate of internal
 discard suggests that the input facilities at the root of the decoding
 tree may become overloaded and be a bottleneck to an otherwise adequate
 structure. For this reason it may be advantageous to supply the
 structure with multiple input facilities, so that a number of samples
 may be simultaneously carried through the operations of nodes
 occupying the first few ranks of the structure. In S's where most
 samples will be discarded before reaching recognition, multiple sample
 rather than single sample processing can be advantageous.
 
 The difference in cost between missing an example of the
 class to be detected, and identifying as an example objects in fact
 irrelevant, will often affect choice of detection strategy. If
 false alarms are very costly, S will benefit by a final, precise test,
 such as testing of supplementary information in a terminal checking
 structure. Usually few samples can be expected to reach terminal
 nodes. It may be relatively efficient to have the testing at lower
 ranks rather looser than might be desirable in the general identification
 mode. Looseness in preterminal processing increases the likelihood
 that not even atypical samples of the classes to be detected will be
 overlooked.
 
 A partner to this looseness in preterminal testing is direct
 anticipation of terminal matching. If the class of purpose-relevant
 types consists of a few common and many more uncommon types, detection
 processing may become more efficient by first conveying the sample
 to the common termini for attempted matching, and resorting to the
 more orderly testing sequences only if this direct anticipation fails.
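
 Direct anticipation can be sketched in a few lines. The sketch assumes
 that each common terminus offers a matching predicate and that the
 orderly testing sequence is available as a single function. The names
 detect, common_termini and orderly_search are hypothetical.

```python
def detect(sample, common_termini, orderly_search):
    """Try the few common termini first; fall back to orderly testing.

    common_termini -- list of (type_name, matches) pairs, most common first
    orderly_search -- function standing in for the full tree procedure
    """
    for type_name, matches in common_termini:
        if matches(sample):
            return type_name, "anticipation"
    # Direct anticipation failed: resort to the ordinary test sequence.
    return orderly_search(sample), "orderly testing"


common = [("the", lambda s: s == "the"), ("and", lambda s: s == "and")]
uncommon = {"zebra": "zebra"}
result = detect("and", common, uncommon.get)
```

 The saving depends on the common types really being common; if they are
 not, the anticipatory matches are wasted effort added to every sample.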
 
 Detection problems differ in the degree to which the S 
 may increase its detection rate by selectivity in its exposure to the 
 environment. This selective exposure occurs, for example, when a
 person looking for a particular object moves either his person, his
 receptor orientation, or his attention from one part of the environment
 to another. When a detection procedure makes use of such responses
 to increase the probability of its exposure to task relevant objects, it
 is using search responses, a special variety of trial response. The
 simplest search is that pattern of environmental exposure insuring that
 eventually the environment will be covered with as little repetition
 as possible. Such a search is a scan. More interesting searches take
 advantage of regularities existing in the distribution of the objects 
 of interest in the environment; the term hunt would be appropriate for 
 these. 
 
 Searching procedures may make use of different reactive 
 modifications in S than those called for in the simple detection mode. 
 Thus in a hunt the class of relevant objects for detection may change
 as the hunt progresses. No matter how limited the class of objects for
 terminal detection, S may do well in early stages to recognize a much
 broader class of potential cues, many perhaps having little similarity to
 the objects sought. Examples of this progressive narrowing of the class
 of objects to be detected occur in information retrieval. Both search
 and detection, unlike the other processing modes hitherto considered,
 seem to require temporary memory of a rather special kind.
 
B. The Nature of Errors 
 
 This subsection concerns error in the detection procedures 
 of S, and those reactive modifications in S that reduce the severity 
 of the consequences of error. Before considering these modifications, 
 some attention will be given to characteristics of errors themselves. 
 
 First, an error of identification may have its source in 
 either S or in its environment. Thus an error in reading a printed
 page may stem from a misprint in the text or from error in the
 recognition process. Environmental errors may be treated as part of the
 variability naturally occurring in the types to be recognized. "Error"
 correcting transformations may be introduced as reactive modifications
 in the S. Such modifications might well, for instance, be based on
 experienced probability distributions which reflect the relationship
 between the environmental processes which produce the sample and the
 kinds of error to which these processes are vulnerable. For example,
 a typist may frequently hit a key adjacent to the correct one, or
 transpose neighboring letters. Such common kinds of error could be 
 incorporated in transformations applied to the sample before or after 
 an attempted terminal match. 
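
 The typist example can be made concrete with a small sketch of "error"
 correcting transformations applied after a failed terminal match. The
 adjacency table below is a hypothetical fragment of a keyboard layout,
 and the function names are illustrative only.

```python
# Hypothetical fragment of a key adjacency table (an assumption, not
# taken from the report); a real table would cover the whole keyboard.
ADJACENT_KEYS = {
    "a": "qsz", "s": "awdxz", "e": "wrd",
    "r": "etf", "t": "ryg", "h": "gjyn",
}


def transpositions(word):
    """All strings obtained by swapping one pair of neighboring letters."""
    return {word[:i] + word[i + 1] + word[i] + word[i + 2:]
            for i in range(len(word) - 1)}


def adjacent_key_variants(word):
    """All strings obtained by replacing one letter with an adjacent key."""
    out = set()
    for i, ch in enumerate(word):
        for near in ADJACENT_KEYS.get(ch, ""):
            out.add(word[:i] + near + word[i + 1:])
    return out


def recognize(sample, types):
    """Attempt a terminal match, then retry on transformed samples."""
    if sample in types:
        return sample
    candidates = (transpositions(sample) | adjacent_key_variants(sample)) & types
    # Accept a correction only when it is unambiguous.
    return candidates.pop() if len(candidates) == 1 else None


known_types = {"cat", "the"}
corrected = recognize("teh", known_types)
```

 A fuller version would weight each transformation by its experienced
 probability, as the text suggests, rather than treating all as equal.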
 
 At a deeper level the problem of distinguishing between
 environmental error and S processing error goes beyond questions of
 type variability. If a particular object has never before been
 recognized by S as an example of a type, S may not have formed an
 appropriate transformation to reduce samples of "environmental error"
 to a standard form appropriate for terminal matching. Faced with what
 seems to be an unusual object, S must decide whether the apparent
 peculiarity of the object stems from a failure on the part of S to
 recognize the familiar, or from a genuinely different quality of the
 object for which it has as yet no name. Most environments can be described
 in an infinity of ways, and the fact that there is something unusual in
 the grouping of environmental attributes constituting a sample is, in
 itself, no indication that the S need make provision for future recognition
 of such samples; they may never recur, or they may be of so little
 importance to the purposes of the S that it would do well never to attend
 to such samples should they recur. Seen this way, the internal evidence
 of peculiarity in an S can mean: internal error, environmental
 variability of a familiar type (or "error"), or novelty, important or
 not. There may be no way for the S to be able to determine at a given
 time which of these alternatives fits the sample, the processing of
 which has had unexpected results.
 
 What is to be considered error may depend on the momentary 
 purpose of the S. Some purposes may require identification so crude as 
 to be functionally erroneous for another purpose. Thus S may not be 
 able to establish any single error identification or correction policy 
 of value independent of the task to which the S is momentarily 
 committed. Costs implicit in various errors are variable and depend 
 on purpose. Depending on the S's purpose, the occurrence of an error 
 may not warrant correction; if it does not, sensitivity of S to 
 the fact that error has occurred might only distract and so hinder 
 the effectiveness of the S's performance. Thus a person in early
 stages of learning a foreign language may have trouble deciding
 whether to look up a misinterpreted word or to go ahead in the
 expectation that context will supply the missing significance. At
 a more advanced stage of learning a similar error might be symptomatic
 of an important gap in his understanding of the language and call for a
 careful diagnostic retrace to determine the source of misunderstanding.
 The leisure to correct specific errors may be available only to S's 
 sufficiently developed so as to make comparatively few such errors. 
 
 Errors differ also in how conspicuous they may be to the
 error-making S. Often the possibility of correcting the error will
 depend on the precision with which the particular faulty operation can
 be localized, which in turn may depend on the speed with which the S
 becomes aware that an error has been made. When errors are likely not
 to be sufficiently conspicuous to the S, a teacher may be helpful in
 pointing them out. The effectiveness of teaching machines depends in
 no small degree on the immediacy with which the fact of error is
 communicated to the student. Especially helpful are teaching
 processes which communicate not only the fact of error but which identify
 the correct response also. When this occurs in an S utilizing a
 decoding tree the correct and the incorrect responses can be backed
 up until they meet at a node where the error probably originated.
 Ways in which adaptive S's can learn without direct environmental
 teaching aid are of great interest in connection with S's capable of
 discoveries predictable by no teacher.
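
 Backing up the correct and the incorrect responses until they meet can
 be sketched as follows, assuming the decoding tree is stored as a
 mapping from each node to its lower ranking neighbor. The names
 path_to_root and node_of_probable_error are hypothetical.

```python
def path_to_root(parent, node):
    """List the nodes from the given node back to the root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def node_of_probable_error(parent, wrong_terminus, correct_terminus):
    """First node shared by both backed-up paths, nearest the termini."""
    on_correct_path = set(path_to_root(parent, correct_terminus))
    for node in path_to_root(parent, wrong_terminus):
        if node in on_correct_path:
            return node
    return None


# parent maps each node to the node from which it receives samples.
parent = {"a": "root", "b": "root", "t1": "a", "t2": "a", "t3": "b"}
suspect = node_of_probable_error(parent, "t1", "t2")
```

 The meeting node is the last decision the two paths had in common, so
 it is the place where the sample was sent down the wrong channel.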
 
 C. Detection and Correction of Error 
 
 Detection of error may be signalled by a failure of matching
 in a terminal checking S, or by failure of concurrent checking
 processes, such as admittance testing. When the correct response is
 not known at the same time that error is detected, location of error
 may require retracing processes. In retracing, S tests various nodes
 of possible error in a more or less orderly way. The sample is
 transferred to different nodes from each suspicious node in turn until a
 correct terminus is reached. If there is no information pointing to
 one node as more likely to be at fault than any other, a retrace
 strategy which tries out nodes from the highest ranking on down
 is suggested. This is the preferable direction because the higher
 ranking nodes are nearer to termini, which allows the elimination
 of a greater number of error possibilities for any given number of
 retrace steps. Also, should secondary errors arise in the retrace
 process, there are for high ranking nodes fewer dependent branch
 points, each perhaps calling for extensive retraces of its own.
 
 Retrace difficulties will be discussed in connection with
 the optimal ordering of tests. (See Section VII.) It may be
 immediately seen that retrace difficulties are greater, the lower
 the rank of the node at which the error occurred. Accordingly, low
 ranking nodes must optimally be of very high accuracy. Other ways of
 avoiding the pyramiding difficulties of retraces back to low ranking
 nodes are techniques to determine nodes of likely error without
 calling for a complete retrace. Attention will now be directed to
 these.
 
 
D. Error Avoidance 
 
 Error, whether occurring near the beginning or end of a path,
 will lead to terminal error. Where the error occurs, however, will have
 important influence on selection of error avoidance and error
 correction technique. Each sample may be tagged with a path uncertainty
 vector, the elements of which are estimates of the probable error at
 each preceding nodal decision. A path uncertainty vector can be used
 to determine optimal retrace strategies. The path uncertainty vector
 may simply acquire an estimated uncertainty element as each node is
 passed, the values of such elements remaining fixed subsequently.
 Or in more interesting structures the information in the vector may
 be updated by input to it from on-going processes throughout the
 structure. This updating can simplify error correction by anticipating
 how error might come about, before error is detected or indeed before
 a terminus is reached by the critical recognition procedure. Some
 auxiliary checking techniques have already been mentioned:
 fore-processing, echo-processing and after-processing.
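
 A path uncertainty vector of the simpler kind, with elements fixed as
 each node is passed, might be sketched as below. The class and method
 names are hypothetical, and the retrace rule shown (most doubtful
 decision first, higher rank first on ties) is one assumed strategy, not
 one prescribed by the report.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedSample:
    """A sample carrying its path and a path uncertainty vector."""
    data: object
    path: list = field(default_factory=list)         # nodes passed, root first
    uncertainty: list = field(default_factory=list)  # error estimate per decision

    def pass_node(self, node, error_estimate):
        """Record a nodal decision and its estimated probability of error."""
        self.path.append(node)
        self.uncertainty.append(error_estimate)

    def retrace_order(self):
        """Nodes to re-examine, most doubtful first; ties go to higher rank."""
        order = sorted(range(len(self.path)),
                       key=lambda i: (-self.uncertainty[i], -i))
        return [self.path[i] for i in order]


sample = TaggedSample("x")
sample.pass_node("root", 0.01)
sample.pass_node("n1", 0.20)
sample.pass_node("n2", 0.05)
order = sample.retrace_order()
```

 The updated variant described in the text would additionally let other
 processes overwrite entries of the uncertainty list after the fact.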
 
 After-processing, as a checking procedure, may allow
 determination of error at a node after the sample has been transferred to
 a higher ranking node. Low transfer thresholds can contribute
 sufficiently to the overall processing speed of a structure that time
 lost in error correction may be offset by this speedup of testing.
 However, when low threshold (i.e. loose) testing procedures are used,
 after-processing and other error locating techniques become relatively
 more important.
 
 Another error-avoidance technique is the use of
 counter-processing. This is essentially a distributed sample structure, in
 which at certain nodes of possible error the sample is transferred
 not only to the test-selected node but to other nodes as well, despite
 the test. Counter-processing could be said to function as a devil's
 advocate in trying possibilities, individually unlikely but dangerous
 if correct. The term is intended to designate something more selective
 than widespread sample distribution with rejection in most branches.
 By dangerous alternatives are meant those which, if correct, lead to
 responses incompatible in outcome values with those following an
 erroneous decision. Costs of error are normally a function of the
 erroneous node's rank. As has been shown, retrace will generally be
 easier, the higher the rank of the node of error. Because of this,
 counter-processing and after-processing will usually contribute more
 at low than at high ranking nodes.
 
 Another error avoiding method uses multiple structures for
 detection. Thus if S has two detection trees, operating simultaneously
 on the input samples, they could serve to check each other. The set
 of types still possible at any interim processing stage will be
 called the consequence set reached by that structure at that stage.
 As processing proceeds, the consequence set generally will become
 smaller and smaller until, at recognition, only one type is
 included. If S is processing the sample simultaneously in more
 than one structure, the logical product of the consequence sets
 of each of the structures can be formed. When this product is
 reduced so that only one type is left, recognition can be regarded
 as completed even if no one structure has carried processing far
 enough to have achieved identification by itself. This identification
 by first unique consequence set is a form of parallel operation
 available for multiple structures that can allow quick identification.
 Post-unique continuation of consequence set testing offers an error
 checking procedure. If error has occurred in any of the structures,
 the consequence set may become disconnected (i.e., the logical
 product vanishes); no type remains that could be the outcome of all
 procedures.
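
 Identification by first unique consequence set, followed by
 post-unique checking, can be sketched with ordinary set intersection
 standing in for the logical product. The sketch assumes two structures
 that advance in step, each reporting its consequence set at every
 stage. The function name check_structures is hypothetical.

```python
def check_structures(stages_a, stages_b):
    """Combine two structures' consequence sets stage by stage.

    Returns (identified type, stage of first unique product,
             stage at which the product vanished, or None).
    """
    found, found_at = None, None
    for stage, (a, b) in enumerate(zip(stages_a, stages_b)):
        product = a & b                       # logical product of the two sets
        if not product:                       # disconnected: error somewhere
            return found, found_at, stage
        if len(product) == 1 and found is None:
            found, found_at = next(iter(product)), stage
    return found, found_at, None


every = {"A", "B", "C", "D"}
tree_1 = [every, {"A", "B"}, {"A"}]
tree_2 = [every, {"A", "C"}, {"A"}]
result = check_structures(tree_1, tree_2)
```

 In this example neither structure alone is unique at stage 1, yet the
 product already is; continuing to stage 2 serves only as a check.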
 
 Checking by post-unique testing of the consequence set of
 multiple structures will be sensitive to the extent that each different
 structure tests different attributes, as uncorrelated as possible. If
 the decisions made at corresponding nodes are correlated, the structures
 become more and more replicas of each other. Here the consequence
 set does not converge to uniqueness appreciably before the termini
 of the structures are reached. Highly correlated structures offer
 little possibility for checking other than the obvious increase in 
 reliability to be obtained by replication of processing mechanisms. 
 
 Location of the source of error may also be accomplished in
 local link-forming structures. These structures allow a quite
 different way of path determination in which, instead of the sample
 being passed from node to node, from root to terminus, the sample,
 either as a whole or disassembled, is input to all or many nodes
 simultaneously. Links are then established between nodes according
 to the results of testing within the nodes in question, but not
 necessarily in order of increasing rank. These structures are of
 particular interest in non-tree connectivities, but may be useful
 in trees also.
 
 Local link forming processes can link nodes that could not 
 have been approached in a progressive tracing structure if the 
 lowest ranking node of the chain had been missed. Thus, there may 
 be formed by these processes unrooted node linkages, or chains,
 which may reach a terminal node and have higher overall probability
 of correctness than a rooted sequence produced by progressive
 tracing. Figure 1 shows how such a chain can be interpreted as
 "pointing" to the node at which the sequential tracing process most
 probably was in error.
 
 [Figure 1 legend; diagram not reproduced]
    x        node of error
    =        the progressive trace path
    -        unrooted node linkages
    (arrow)  indication from the unrooted linkage as to probable
             error made

 Figure 1: Unrooted Linkage Pointing to a Node of Probable Error
 
 Footnote: The "flash thru" feature of the Pattern Articulation Unit,
 originally suggested by the author on psychological grounds, is an
 example of a local link forming process. See Digital Computer
 Laboratory Reports No. 122, 125 by B. H. McCormick.

 E. Undermining and Urgency

 Before shifting attention from problems related to error, it
 may be of interest to think about reactive modification in a detection
 tree, when there is evidence that an error may have been committed.
 In some cases it will be best for the S to abandon the present procedure
 at once and to institute appropriate error retrace or other correction
 processes. The nearer the on-going process is to a relative closure,
 such that it can be resumed without loss if it later appears that no
 error occurred, the more desirable it will be for the process to be
 continued, at least until the point of relative closure is reached.
 The best point of relative closure is, of course, either a point of
 admittance test completion, or a terminus where the issue of
 correctness can be quickly resolved. When there is no proximity to a
 point of relative closure, continuation of the procedure will depend on:
 cost of the process so far, value realizable if it should
 prove correct, cost should it be incorrect, cost of resuming it
 after discontinuation should resumption be necessary, and, of
 course, probability that the process is in fact in error.
 
 The greater the probability that a procedure is in error,
 despite apparent success in recent local tests, the more it can be
 said to be undermined. Undermining may be of several kinds: error
 undermining is caused by the possibility of downstream error (that
 is, toward the root of the tree), although the purpose of the
 procedure remains unchanged; purpose undermining occurs when the doubt
 comes not from possible downstream error in achieving this purpose, but
 from the fact that, error or not, achievement of the purpose itself is
 losing its value as the S shifts to other purposes.

 A procedure, however undermined, may undergo a reactive
 modification designed to reach relative closure before being terminated.
 This relative closure, if achievable, throws away rich possibilities
 or greater precision realizable had the procedure not been undermined.
 This urgency mode may, in different circumstances of undermining and
 in different S's, have a variety of characteristics. Possible changes
 include: increased use of anticipation for the direct testing of
 high probability outcomes of relative closures; dropping more
 unlikely branches of sequencing nodes, so that the sequence
 exploration can be shortened; increased looseness in testing
 thresholds for the more probable alternatives; momentarily increased
 interruption threshold for the entire procedure, to compensate for
 more likely interruption as additional undermining information becomes
 available; preparation of residua, to allow return to be as cheap as
 possible; and closure-forcing selectivity of testing, directed to the
 immediate elimination of as many as possible unlikely alternatives of
 procedure (for example, to aid anticipation). The urgency mode
 intentionally resembles the psychological phenomena of anxiety; both
 exhibit a reactive narrowing of on-going processes in the face of
 threat or suspected undermining. The urgency mode is the first
 example in this report of a kind of inertia, or lag in changing
 procedures. Study of such decisional hysteresis that is optimal in
 different situations might prove rewarding.
 
 The kinds of error considered hitherto have been particular
 nodal faults. More general errors are systemic. Diagnostic procedure
 may be utilized by S to test its operating effectiveness aside from any
 particular problem. The possible value of using intentionally
 introduced error for diagnostic purposes may be mentioned. If such error
 is found to cause no difficulties, this may be evidence that the
 structure is encumbered by artifactual complexities no longer necessary
 for its function. Thus, important simplification may be possible.
 Introduced error (e.g., marginal checking) may also point to possible
 weaknesses of the structure which unless improved may be the source
 of errors at awkward future times. Further discussion of diagnosis
 of systemic error must await a proper treatment of the subject of
 developmental change.
 
 VII. DETERMINANTS OF SHORT RANGE EFFECTIVENESS 
 
 This section considers the relative reactive and adaptive
 effectiveness of possible S structures. The determinants of effectiveness
 are potentially many, few of which can be adequately specified in abstract
 discussion. Principles presented here will be too qualitative to allow
 prediction of absolute differences in effectiveness of alternate S designs.
 In real S's, full realization of any one of these principles of effective
 design will often be at the cost of other principles, the relative
 importance of which will depend on the specific S and task.
 
A. Order of Tests 
 
 The order in which tests are arranged in detection trees 
 may seem of little consequence. Procedures end with the same types, 
 each the logical product of all the consequence sets produced by decisions 
 between root and terminus. Logical products are commutative, and the
 total information derived is a function only of the number of types
 and of the relative frequency of occurrence of their corresponding
 objects in the relevant environment of the S. 
 
 If all samples called for the same tests, nothing would 
 be gained by fixing order of testing. There might be an advantage to 
 selecting the order for a sample by methods of queueing theory, on 
 the basis of the momentary load on each testing facility.
 
 However, when the range of types to be identified is 
 large, reducing the number of tests to be performed generally increases
 efficiency. Thus, once one has determined that an animal has four
 hooved feet, it is no longer important to ask about the number of
 spines in his dorsal fin, or to ask what language he speaks.
 
 The test economy principle can be stated: the more
 generally applicable a test, the lower the rank to which it should
 be assigned; or, the more decisions to which a given decision is
 relevant, the sooner that decision should be made. Such a test can
 be regarded as a shared subprocedure which, done once, makes possible
 the processing of other subprocedures, perhaps simultaneously (as in
 distributed sample structures). Or again, the completion of such a test
 may be viewed as a way of eliminating with one test as many incorrect
 alternatives as possible, where this decision, if unmade, would be a
 component of many subprocedures of higher rank. Because low rank 
 tests will be used on more samples than high ranked tests, it may be 
 desirable either that they take little time, or that multiple low 
 rank structures be provided by the S. 
 
 There may be circumstances in actual S's that prevent full 
 realization of the test economy principles. For example, the low 
 ranking nodal tests may be largely confined to aspects of the input data 
 first acquired. There may be statistical support for the assignment
 of low rank on the basis of seniority. Early discriminations may
 be of relatively universal relevance, but they are not always so.
 Even in S's capable of change, the lower the
 rank of the node to be changed, the greater the number of secondary
 changes that may be necessitated in higher order nodes. Similarly,
 node replication for parallel processing may increase the difficulty
 of change.
 
 Another limitation on the universal applicability of 
 test economy comes from the high cost of error retracing for low 
 ranked nodes. This leads to the rank of error principle: that
 the greater the frequency of error in a test, the higher the rank
 of the node to which it should be assigned. Statistically this may
 also tend to be correlated with range of applicability and seniority,
 as increased experience with a test may lead to its being more
 reliably performed. Also, inaccurate tests will often be those
 which need changing, and change is easier in higher ranks.
 
 This requirement of greater accuracy in lower rank nodes 
 suggests also that the error avoiding techniques mentioned in the 
 preceding section will be better employed in the lower than in the
 higher ranking parts of the tree. As more nodes are interjected
 between root and terminus in a detection tree, the structure shows
 greater vulnerability to error. If the sequential decisions required
 are statistically independent, the probability that the whole
 procedure has reached the correct terminus is the product of the
 probabilities for correctness of each test. If many decisions are
 involved, the overall probability of error may be high even though
 individual tests are of high accuracy. This observation puts a
 severe limitation on the depth (mean terminal rank) of decoding
 trees. Again this limitation may be in conflict with the economy
 principle.
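
 The effect of depth on overall correctness is easy to check
 numerically. The sketch below assumes statistically independent tests,
 as in the text; the function name path_correctness and the figures of
 twenty tests at 0.99 accuracy are illustrative assumptions.

```python
import math


def path_correctness(test_accuracies):
    """Probability that every independent test on the path was correct."""
    return math.prod(test_accuracies)


# Twenty sequential decisions, each correct with probability 0.99.
p_correct = path_correctness([0.99] * 20)
p_error = 1.0 - p_correct   # roughly 0.18, although each test errs only 1% of the time
```

 Under these assumed figures nearly one sample in five reaches a wrong
 terminus, which is the limitation on depth described above.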
 
 If a test has been found to be a source of frequent error,
 what can be done about it? At the expense of test economy it can be
 assigned to higher ranking nodes, where, as the termini are close,
 error when it occurs will be more rapidly detected. This will
 require that the test be assigned to at least one node in every
 descending branch. Such duplication may, however, have the advantage
 of allowing variants of the test to be tried in each place. Should
 an improvement be found in the test, it may be possible to move the
 test back to lower rank. Other possibilities include: assignment to
 a node of more than one test, each normally sufficient to make the
 requested decisions (a multiply determined node); making the node
 momentarily into a sequencing node until information useful for
 improving the test can be acquired; and assigning special input
 facilities to the node so that the sample can be augmented by
 additional information.
 
 Related to these problems of accuracy are ambiguities 
 caused by objects which are incomplete representations of their 
 types. If, due to the nature of the objects studied, particular
 attributes are especially likely to be missing, then tests using 
 these attributes are better assigned to high ranked nodes of the 
 structure, as they will often be forced to revert to sequence 
 selection. Retrace difficulties make costly any unnecessary assign- 
 ment of sequence selection to low ranked nodes. When sequencing 
 becomes necessary at a low rank node, S must function in a less
 efficient way than normal, and the reactive modifications characteristic 
 of the urgency mode may be useful. 
 
 The use of anticipation will generally be most effective 
 close to the terminal nodes at which it can be tested. It is also 
 most effective when the alternatives at a node are of very unequal 
 probabilities, so that anticipatory testing of the most likely 
 alternatives can often permit bypassing of time-consuming test 
 discrimination among unlikely possibilities. 
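
 The advantage of anticipatory testing under unequal probabilities
 can be sketched numerically. The model below is an assumption made
 for illustration: alternatives are tried one at a time until one is
 confirmed, each trial counts as one test, and the probabilities are
 invented figures.

```python
def expected_tests(probabilities):
    """Expected number of single-alternative tests when the alternatives
    are tried in the given order until one is confirmed."""
    return sum(position * p
               for position, p in enumerate(probabilities, start=1))


# Invented probabilities for the four alternatives at one node.
unequal = [0.85, 0.05, 0.05, 0.05]

likely_first = expected_tests(sorted(unequal, reverse=True))
likely_last = expected_tests(sorted(unequal))
equal_case = expected_tests([0.25] * 4)
```

 With these figures, trying the most likely alternative first needs
 about 1.3 tests on average, against 2.5 for four equally probable
 alternatives, so the gain from anticipation grows with the inequality.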
 
 These two considerations can be combined in the principle 
 that tests with very unequally probable alternatives are better
 assigned to high ranked nodes. Additional values of such assign- 
 ment will be realized in cases where these are also nodes which 
 serve for sequence selection. The desirability of assigning 
 sequencing functions to high rank nodes has been mentioned. This 
 exploitation of the inequality of alternative probabilities allows 
 improvement over arbitrary sequencing. 
 
Low ranked nodes, on the other hand, will generally be more 
 effective, the more rapidly their decision processes can eliminate branches 
 from further consideration. In information theory terms the average 
 information conveyed by the decisions of a node is related to the average
 number of alternatives eliminated, and this will be greatest when the 
 alternatives are equally probable. This is the converse of the previous 
 principle of assigning processes with very unequal probability distributions 
 among alternatives to high ranking nodes. 
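
 The information-theoretic claim can be checked with a short sketch.
 It assumes that the average information of a decision is measured by
 the Shannon entropy of the alternative probabilities; the function
 name and the probability figures are illustrative only.

```python
from math import log2


def decision_information(probabilities):
    """Average information, in bits, conveyed by one decision among
    alternatives with the given probabilities (Shannon entropy)."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


equal_bits = decision_information([0.25, 0.25, 0.25, 0.25])
unequal_bits = decision_information([0.85, 0.05, 0.05, 0.05])
```

 Four equally probable alternatives yield two bits per decision, while
 the unequal distribution yields less than one, in agreement with the
 assignment of equiprobable decisions to low ranked nodes.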
 
 This discussion can be simplified by introducing a method
 of tree description, the relative ramification profile. This is
 constructed by plotting for each rank the ratio of the number of nodes
 of that rank to the number of nodes at the next lower rank, and dividing
 this ratio by the mean information conveyed by the decisions transferring
 messages between these two ranks. The two principles, relating the
 probability distribution among alternatives at a node to the effective
 rank assignment of the node, may be combined in this principle: the
 relative ramification profile of a detection tree should be a
 monotonically increasing function of the rank. High relative ramification
 is also associated with difficult error retrace, especially costly at
 the lower ranks of the structure.
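
 One possible reading of this construction is sketched below. It
 assumes that rank 0 is the root, that node counts per rank and the
 mean information (in bits) per transition between adjacent ranks are
 already known, and that "increasing" means strictly increasing. The
 names and the sample figures are hypothetical.

```python
def relative_ramification_profile(nodes_per_rank, information_per_step):
    """For each rank above the root, divide the ratio of node counts
    (this rank over the next lower rank) by the mean information of the
    decisions that transfer messages between the two ranks."""
    if len(information_per_step) != len(nodes_per_rank) - 1:
        raise ValueError("need one information value per pair of adjacent ranks")
    profile = []
    for rank in range(1, len(nodes_per_rank)):
        ratio = nodes_per_rank[rank] / nodes_per_rank[rank - 1]
        profile.append(ratio / information_per_step[rank - 1])
    return profile


def is_monotonically_increasing(values):
    """True when each value is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical tree: 1 root, then 2, 6 and 24 nodes at ranks 1 to 3.
profile = relative_ramification_profile([1, 2, 6, 24], [1.0, 1.2, 1.0])
satisfies_principle = is_monotonically_increasing(profile)
```

 For the sample tree the profile is approximately 2.0, 2.5, 4.0, which
 satisfies the principle as interpreted here.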
 
 Trees in which all termini have the same rank will generally 
 be less effective than trees which have branches of various lengths. Thus, 
 if there are particular types that are of unusual importance to the S, 
 or which will provide particularly valuable context information for 
 recognition of other objects, it will be effective for the S to achieve 
 recognition of these salient types in less time than required for other 
 types. This can be achieved variously, but the recognition of these 
 salient types by procedures with fewer nodes can provide not only 
 generally higher speed, but also offer greater freedom from error 
 because of the lesser rank depth. This kind of efficiency from saliency 
 shortening may, unfortunately, be achievable for only a limited set of
 purposes, as the saliency of a type will often closely depend on purpose.
 
 B. Segmentation by Function and by Purpose 
 
 Difficulties can arise in attempts to design one structure to 
 be effective for more than one purpose. Ways of improving the effectiveness
 of structures, such as saliency shortening, will point to different
 structures for different purposes. When the difficulty of multiple
 optimization is too great, S may develop partially independent structures
 for certain important purposes. Such an S is purpose-segmented, as
 contrasted to a system which is functionally segmented, as into encoding
 and decoding structures, both of which will be involved in achieving any
 purpose.
 
 Functional segmentation can sometimes reduce the necessity of 
 purpose-segmentation. As an illustration, let it be assumed that
 purpose selection has been assigned to the encoding segment (i.e.
 response preparation and selection structure). Grouping of termini
 into purpose-equivalent sets, reactive modifications of thresholds for
 detection, etc. will be induced from the encoding structure. This 
 arrangement has the advantage of keeping the structure of the decoding 
 segment free from transient changes induced by long range influences 
 of purpose. This relative independence of the decoding segment may 
 free its interpretation of environmental realities from excessive
 distortion in the service of particular purposes, an historic source
 of individual and social pathology. This segmentation will be most
 effective if the changes induced in the decoding tree diminish as the
 rank of the affected nodes decreases.
 
 The above model allows some flexibility in the decoding 
 structure to accommodate its processes to purpose. If this is not
 sufficient, further purpose-segmentation may still be necessary. For
 structures where change of purpose is frequent, reactive changes induced
 by interruption of purpose may so quickly change the structure that
 there is little opportunity to seek the shelter of relative closure or
 to form residua. Residua, by the way, are a particular kind of
 transient purpose-segments.
 
 Trees are particularly vulnerable to demands for
 purpose-segmentation. If the definition of error is too closely tied to
 purposes, it may be impossible to confine the error-prone tests to 
 the highest ranking nodes of the tree. This in itself may entail such 
 difficulties in retracing that use of a single tree becomes inefficient. 
 The utility of local link-forming structures in locating the node of
 error in a progressive trace procedure has been mentioned. This
 technique will be of particular value in trees whose nodes can not 
 be ranked according to accuracy. Such link-forming structures have 
 other advantages, particularly when not tree-connected. 
 
 C. Link-Forming
 
 Earlier mention was made of inducing reactive modifications 
 in trees by using context to aid recognition. Most natural of these 
 reactive modifications is the use of back blocking to simplify the 
 decoding trees as a function of previous recognitions. The problem 
 of how the termini to be back blocked could be selected without an 
 auxiliary structure, perhaps as complicated as the decoding tree itself, 
 was not met. Certain features of local link-forming structures, and in 
 particular functionally segmented ones, that seem naturally adapted to 
 this kind of recognition problem, will now be discussed. 
 
 Consider structures the individual nodes of which respond
 to possible attributes of the input samples. Part or all of the
 sample information is input to each node, and the node becomes
 activated to a degree corresponding to the likelihood, tested in
 the node, that the nodal attribute may be present in the sample.
 This activation is the initial operation of the node. Subsequent
 operations consist of forming links with other nodes and of multi-node
 operations to test the validity of particular chains. For a crude
 example consider a structure, the nodes of which correspond to the
 letters occurring in printed text. Let there be a complete set of
 letter nodes available for each possible letter position in the word.
 
 In Diagram II, a simple example from Selfridge [7], the use of context
 in determining the recognition of an ambiguous letter is presented.

 [Diagram II: ambiguous letter form, i.e. A or H]

 [7] Selfridge, O. Pattern Recognition and Modern Computers, Proc. WJCC,
 1955, p. 91-93.
 
The transition probabilities of English are such that "A" 
 in the second place is seldom followed by "E", and that "H" in the 
 second place is seldom followed by "T". Thus in this example there
 would be, in each case, only one path, formed from high transition
 probability links, leading from the first to the last position. The
 existence of this path might be tested by sending a flash-through pulse
 between the terminal positions, i.e. "T" to "T". Should there be a 
 number of paths, they could be individually checked at the terminus 
 against the words in a "dictionary". This matching procedure is 
 more feasible than complete parallel matching: the number of 
 alternatives to be tested is fewer than the number of possible 
 permutations prior to processing in the link-forming structure. 
 Successive increments in the threshold for link-forming as a function
 of transition probabilities allow testing of successively narrowed
 consequence sets for uniqueness, and also post-unique checking.
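
 The letter-linking procedure just described can be sketched as
 follows. Everything specific in the sketch is an assumption: the two
 words THE and CAT, the transition probabilities, the threshold steps
 and the function names are invented for illustration and are not
 taken from Diagram II or from measured English statistics.

```python
def linked_paths(candidates, transition, threshold):
    """Grow letter chains position by position, keeping a chain only
    while each new link meets the transition-probability threshold."""
    chains = list(candidates[0])
    for letters in candidates[1:]:
        chains = [
            chain + letter
            for chain in chains
            for letter in letters
            if transition.get((chain[-1], letter), 0.0) >= threshold
        ]
    return chains


def narrow_to_unique(candidates, transition, thresholds):
    """Raise the link threshold in steps; return (threshold, chain) for
    the first step at which exactly one chain survives, else None."""
    for threshold in sorted(thresholds):
        chains = linked_paths(candidates, transition, threshold)
        if len(chains) == 1:
            return threshold, chains[0]
        if not chains:
            return None
    return None


# Hypothetical transition probabilities between adjacent letters.
TRANSITION = {
    ("T", "H"): 0.30, ("T", "A"): 0.05,
    ("H", "E"): 0.40, ("A", "E"): 0.001,
    ("C", "A"): 0.20, ("C", "H"): 0.15,
    ("A", "T"): 0.15, ("H", "T"): 0.002,
}
DICTIONARY = {"THE", "CAT"}

# The middle position of each word holds the ambiguous letter: A or H.
first_word = [["T"], ["A", "H"], ["E"]]
second_word = [["C"], ["A", "H"], ["T"]]

the_paths = linked_paths(first_word, TRANSITION, 0.01)
cat_paths = linked_paths(second_word, TRANSITION, 0.01)
confirmed = [word for word in the_paths + cat_paths if word in DICTIONARY]
```

 Only the chains that survive link-forming are compared with the
 dictionary, so the matching step handles far fewer alternatives than
 a complete parallel match over every letter combination.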
 
 At the same time another local link-forming structure could 
 be operating in which the nodes represent words, rather than letters. 
 At this level, representation of each word possible at each sentence
 position would probably be uneconomical, and special constraints or
 tagging might be used to determine the arrangement of links. Activation
 of the nodes at this level would follow word recognition on the letter
 linking level. In addition, if gaps developed in the linkages of either
 level, the nature of the missing information could be "pointed" to
 (as indicated in Figure 1) for location of error.
 
 In this model pointing could take the form of activating 
 certain nodes, not sufficiently activated by environmental input, to 
 bridge the gap. It may be noted that this kind of interaction between 
 local link-forming structures can produce fabricative transformations 
 in which detail absent from the input is supplied from stored, past 
 experience. This use of internal as well as external context permits 
 very sensitive recognition on the basis of limited input, but is
 error prone. An example of this is what James [8] called the "proof
 reader's illusion". Perception of a misreading can be quite as clear
 as perception of correct reading, but wrong. Recently the writer saw
 a sign on an automobile as "DANGER", possibly because such
 announcements often enough threaten some misfortune. The sign said
 "DANCER", but the rereading produced no clearer image than the
 first. It is as if the first chain won the race for through connection
 and, once established, the details of the image were as clearly supplied
 from storage as from the environment.

 [8] James, W. Principles of Psychology, 2 Vol. Dover edition,
 
 In link forming processes, nodes belonging together become
 linked. Hadamard [9] reports Poincare's image of thoughts being formed
 by the clinging together of hooked atoms mobilized in a "dance".
 Aside from the mobility of the nodes, this sounds much like a link-
 forming structure. The formal properties of the representation can be
 equally well satisfied by moving links between fixed entities or fixed
 links between moving entities.
 
 An example of input information that may evoke incompatible 
 purposes arises when two different objects are superimposed, for 
 example, two different voices heard simultaneously. The purpose of
 detecting one may intricately interfere with the purpose of detecting
 the other. If they were separable prior to detection, reactive
 modification of the structure to ignore attributes of one, or the
 other, would suffice. However, not until detection has been accomplished
 will it be possible to determine which attributes were characteristic of
 one, which of the other. There seems to be no reasonable reactive
 modification of a detection tree that would make such recognition
 possible. Cherry [10] has shown, however, that even if two messages
 are read by the same voice, superimposed on tape, and input to the
 same ear, the human subject can reconstitute the separate messages
 without scrambling. This is a rather extreme example of the singular
 ability of human beings to make use of context in recognition. Here
 even purpose-segmentation would in no obvious way contribute to recognition.
 
 [9] Hadamard, J. The Psychology of Invention in the Mathematical Field,
 Princeton University Press, 1949.

 [10] Cherry, C. On Human Communication, Science Editions Inc., New York,
 1961.
 
 Separation of superimposed samples is possible in link-
 forming structures to the extent that a chain formed for one sample,
 fitting together its attributes, does not interfere with the simultaneous
 formation of a chain for the other sample. The apportionment of
 attributes to the proper sample, a difficult problem for a successive
 tracing structure, can be accomplished by attribute node linking to
 whichever growing tree would be better completed with that attribute
 in a gap. No magic vitalism is implied. This separation of
 superimposed samples requires something approaching essentially parallel
 processing; that is, the cost of alternating between subprocedures
 is very high. In a local link-forming structure, order depends on
 the success of nodes in achieving activation. This freer order can
 utilize particular opportunities offered in the nature of the specific 
 input, opportunities perhaps never before anticipated. 
 
 In a tree structure, by contrast, the order of decisions
 is fixed. It can be well chosen according to average qualities of the
 inputs expected. Thus, it is conceivable that for environments which 
 are of limited and well regulated variety, with unvarying quality of 
 input, there may be trees which will perform more effectively than 
 any link-forming structure. However, the difficulty with the progressive 
 tracing structure lies in the fact that information necessary for alloca- 
 tion of an attribute can be obtained only after further processing 
 which presupposes allocation. The amount of trial and error required 
 by a progressive tracing structure is high, because decisions to 
 be made must be accomplished in a fixed order, and progress to higher 
 ranks occurs one step at a time.
 
 Therefore, it is maintained here that great variety in environment
 or in purpose can be encompassed effectively in progressive
 tracing structures only by costly purpose-segmentation, and can be
 more readily managed by link-forming structures. Fore- and back-blocking
 and echo testing are methods suggested for improving the
 effectiveness of progressive fore-testing and echo-tracing structures. 
 These procedures attempt to alleviate difficulties inherent in fixed 
 testing sequence by preliminary tests on sub-samples, which in turn 
 can simplify the later critical progressive tracing process. They
 effect tree simplification by local link elimination. These devices,
 as well as the use of multiple trees processing the same sample in 
 parallel, in part bridge the gap between local link-forming structures 
 and progressive tracing trees. The gap can be bridged from the other 
 direction also, as local link forming structures can utilize tree 
 structures when the information to be processed can be reduced by 
 natural or artificial constraints to the simplicity required for the 
 more sequential processing. 
 
 It was suggested earlier that efficient sequencing of a
 previously parallel process depends on the existence of sufficient 
 points of relative closure for subprocedure alternation to occur 
 without excessive costs for resumption of temporarily discontinued 
 processes. Relative closure allows for interruption of current 
 activity by error retracing processes. Without such closure, errors
 would be shifty targets that would not stay put when found. Language
 function may use a relatively sequential, tree-like structure, richly
 supplied with relative closure. Important problems of a more parallel
 S might still be solved by such highly organized, if limited, 
 structures as this model of language. 
 
 It is easy to picture the growth of a tree structure by the 
 sprouting of new limbs and twigs as finer and finer discriminations 
 become useful. The development of linking systems suggests no such 
 facile analogies. Analysis of how growth may come about can be 
 profoundly interesting. Understanding of developmental possibilities 
 requires further specification of at least two kinds: (i) specification
 of ways in which links are formed and chains tested for validity;
 (ii) specification of ways to coordinate sub-structures such as
 interacting letter-level and word-level linking planes, or such as 
 progressive tracing fields and linking fields. 
 
 In the only technique of link formation so far suggested, the 
 probability of linkage reflects some accumulated estimate of transition 
 probabilities. This possibility follows the well-worn tradition 
 of associationist models of brain function. It would be surprising 
 if such a plausible principle were to be shown to have nothing
 whatever to do with brain mechanisms. On the other hand, it would
 be very surprising if other determinants of linking could not be 
 conceived, found useful in design of various S's, and perhaps 
 identified in existing S's. Determinants of linking may reflect 
 non-probabilistic conventions. For example, McCormick [11] has
 suggested that the syntactic rules of languages may operate through
 link-formation. Link-forming in one part of a structure is conceived
 here as potentially producing a vast number of different
 linkages. Which will occur and endure is in large part determined 
 by information input from the environment and from other parts of 
 the S. 
 
 A simple example of a model in which influences external
 to S can affect the selection of links is one in which there are
 distinct start nodes and distinct goal nodes, the task being to
 grow linked intervening nodes. Let the start nodes represent states
 that can be readily reached by the S in its present environmental
 and internal circumstances, and the goal nodes represent states in
 which purposes valuable to the S have been achieved. Linkage
 between a start node and a goal node through intervening nodes
 indicates to the S that the goal can probably be reached in the
 current circumstance, as each is linked only with other nodes
 ultimately reachable from the state represented by the start node. 
 
 The efficacy of this selection is based in part on the
 fact that input to the nodes will activate only those relevant to
 the situation, and in part on the further selection of only those
 potentially linked states appropriate to input and purpose. In
 this model linkages are regarded as being directed, from a start
 node or from a goal node. Add the requirement that there be no
 cycles in the linkages between nodes, and the resulting linkages
 are partially ordered. There will grow, then, two trees, one rooted
 in the start node, the other in the goal node. When ultimately
 contact is established between a terminus of one tree and any node
 of the other, linkage has been established. According to the nature 
 of the problem and the environmental situation it may be more
 effective for linkages to grow from the start node at a slower
 or faster rate than those from the goal node. A first elaboration
 of this model allows radial growth of linkages in both directions
 from intermediate nodes (regarded as subgoals). Note that great
 savings in the number of possible linkages and decisions accrue from
 specification of a goal (in narrowing the forward branching of the
 linkages from the start node), with additional saving achieved by
 backward branching from the goal, and yet additional savings
 by growth from subgoals. Each of these economies narrows further
 the range over which less guided linkings might grow.

 [11] Private communication.
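
 The growth of two trees toward each other can be sketched as a
 bidirectional breadth-first search. This is a swapped-in standard
 technique offered as an illustration, not the mechanism of the
 report: the trees are grown at equal rates, one level at a time,
 and the node names in the example are hypothetical.

```python
from collections import deque


def trace(node, parents):
    """Follow parent links from a node back to the root of its tree."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parents[node]
    return chain


def grow_linkage(links, start, goal):
    """Grow one tree forward from start and one backward from goal over
    directed links; return the chain of nodes once they touch, else None."""
    if start == goal:
        return [start]
    forward, backward = {}, {}
    for source, target in links:
        forward.setdefault(source, []).append(target)
        backward.setdefault(target, []).append(source)

    # Each parent map also records which nodes a tree has reached.
    from_start, from_goal = {start: None}, {goal: None}
    start_frontier, goal_frontier = deque([start]), deque([goal])

    def expand(frontier, neighbours, own, other):
        """Grow one tree by a single level; return a node of contact."""
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in neighbours.get(node, []):
                if nxt in own:
                    continue
                own[nxt] = node
                if nxt in other:
                    return nxt
                frontier.append(nxt)
        return None

    while start_frontier and goal_frontier:
        meeting = expand(start_frontier, forward, from_start, from_goal)
        if meeting is None:
            meeting = expand(goal_frontier, backward, from_goal, from_start)
        if meeting is not None:
            return trace(meeting, from_start)[::-1] + trace(meeting, from_goal)[1:]
    return None


# Hypothetical directed links between states.
LINKS = [("start", "a"), ("a", "b"), ("b", "goal"),
         ("start", "c"), ("d", "goal")]
chain = grow_linkage(LINKS, "start", "goal")
```

 Because each tree needs to grow only about half the distance, far
 fewer nodes are touched than in one-sided forward growth, which is
 the saving attributed above to backward branching from the goal.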
 
 Interacting letter and word linking structures have been 
 mentioned. Another possibility is related to fore-processing and
 echo-testing techniques previously described for tree structures.
 Suppose that the same message is input to the nodes of two link-
 forming structures, which may be visualized as parallel planes.
 Both planes have one or more start nodes to be linked to one or
 more goal nodes, and these linkages are directional. Let the two
 planes be called the strategic and the tactic plane respectively.
 The strategic plane is supposed to be different from the tactic one.
 It should be more richly interconnected, allowing freer and quicker
 linkages to fan out from start nodes and fan in toward goal nodes,
 and offering more resistance to change through learning. Suppose
 that there are fewer nodes on the strategic plane, each possibly
 linked to many others of that plane. On the other hand each of the
 more numerous nodes of the tactic plane is allowed to link to
 fewer nodes in that plane. Finally let there be linkages possible 
 between planes such that there will be many-few connections from 
 the tactic plane to the strategic plane, and few-many connections 
 in the opposite direction. 
 
 Introduction of input into S will generally be accompanied 
 by a connection between some start node and some goal node in the 
 strategic plane before much happens in the tactic plane. Completion
 of this linkage suggests that opportunities exist for reaching at 
 least one goal from one of the available start points. Let completion 
 of the strategic linkage activate nodes in the tactic plane
 corresponding to the strategic start node and goal node. In this way
 the strategic plane can select a particular one of many possible goal
 points in the tactic plane and sensitize nodes leading to it. This
 selection reduces the burden of satisfying purposes not realizable by
 the simultaneous procedures in the tactic plane. The path connecting the
 start and goal nodes in the strategic plane may, through activating regions
 in the tactic plane, help a linkage that would have been difficult with
 the greater reality constraint of the tactic plane with the narrower, but
 more rigorous, linking horizons of its nodes.
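
 One way to picture the coordination of the two planes is sketched
 below. It is an interpretation under stated assumptions: each tactic
 node belongs to exactly one strategic node (the many-few connection),
 sensitizing is modelled as confining the tactic search to nodes whose
 strategic node lies on the strategic chain, and all names are
 hypothetical.

```python
from collections import deque


def find_chain(links, start, goal, allowed=None):
    """Breadth-first search for a directed chain from start to goal,
    optionally confined to an allowed set of nodes."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            chain = []
            while node is not None:
                chain.append(node)
                node = parents[node]
            return chain[::-1]
        for source, target in links:
            if source == node and target not in parents:
                if allowed is None or target in allowed:
                    parents[target] = node
                    queue.append(target)
    return None


def coordinated_linkage(strategic_links, tactic_links, region_of, start, goal):
    """Complete a strategic chain first, then link start to goal in the
    tactic plane using only the tactic nodes that chain has sensitized."""
    strategic_chain = find_chain(
        strategic_links, region_of[start], region_of[goal])
    if strategic_chain is None:
        return None
    sensitized = {node for node, region in region_of.items()
                  if region in strategic_chain}
    return find_chain(tactic_links, start, goal, allowed=sensitized)


# Hypothetical planes: few strategic nodes, more numerous tactic nodes.
STRATEGIC = [("home", "road"), ("road", "work"), ("home", "park")]
TACTIC = [("door", "gate"), ("gate", "lane"), ("gate", "pond"),
          ("lane", "bridge"), ("bridge", "desk")]
REGION_OF = {"door": "home", "gate": "home", "lane": "road",
             "bridge": "road", "desk": "work", "pond": "park"}

tactic_chain = coordinated_linkage(STRATEGIC, TACTIC, REGION_OF, "door", "desk")
```

 In the example the branch through "pond" is never explored, because
 its strategic node "park" is not on the completed strategic chain.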
 
 This two plane coordinated linking structure is the simplest 
 of many such multiple linking structure models, some of which have 
 quite interesting potentialities. Further elaboration of these models, 
 as well as the crucial issue of developmental change, will be deferred to 
 subsequent publication. 
 