that are neither linearly ordered nor totally unordered. In this paper we restrict ourselves to partially ordered sets having the property that for any two elements $a, b \in S$ there exists at least one element $c$ such that $a \le c$ and $b \le c$. Sets with such structures will be called generalization structures, or g-structures. Figure 1 presents a Hasse diagram of a g-structure.

Figure 1. An example of a g-structure.

* The term used in our previous papers on this subject,

In the diagram, a relation $a \ge b$ is represented by placing node a above node b and linking the nodes by an arc.

Examples of descriptors: the blood type of a person is a nominal descriptor, the height or weight of a person is a linear descriptor, and the position of a person in the hierarchy of an institution is a structured descriptor.

Suppose, without loss of generality, that we are given two event sets, $E_1$ and $E_0$, where $E_1, E_0 \subseteq \mathcal{E}$, each associated with a certain decision or action k (k = 1 and 0, respectively). These sets define a set of functions

\[ \{ f : \mathcal{E} \to D \} \qquad (2) \]

such that

\[ \{ e \mid f(e) = k \} = E_k, \quad k = 1, 0 \qquad (3) \]

where $e \in \mathcal{E}$ and D = {0, 1, *}; '*' in D means 'no decision'.

A problem of inductive inference is to determine an expression V of a function f which is most desirable, with respect to some criterion, among all the expressions of all functions (2). Such an expression will usually also assign values 1 or 0 to events not included in $E_k$; i.e., the expression will be a certain generalization of the sets $E_k$.
Namely, the initial set $E_k$ will be transformed into sets $E_k(V) \supseteq E_k$, where

\[ E_k(V) = \{ e \mid V(e) = k \}, \quad k = 1, 0 \]

and V(e) is the value of the expression V for the event e.

AQVAL/1 programs (Michalski 77) can be used to solve the problem if the expression V is restricted to the class of DVL1 expressions and the sizes of the sets $E_k$ do not exceed certain limits. When the sets $E_k$ are very large (say, a few hundred elements or more), the computation time of the programs may be too long. The question arises whether the sets $E_k$ could be reduced to more manageable sizes and still provide sufficient information about the decision classes from the viewpoint of inductive inference.

If a precise measure of a 'degree of representativeness' of each event $e \in E_k$ were available, then an event reduction process could be performed simply by selecting events whose degree of representativeness is above a certain threshold. For example, the frequency of occurrence of an object with the description e in the class k could serve as an estimate of such a measure. In many practical problems, however, this estimate is either not available or not adequate. Consequently, some other means must be developed for selecting the 'most representative' events. There can be a number of different methods of solving this problem (see, e.g., [Michalski 75]). Program ESEL implements a method called 'outstanding representatives' (OR).

1.2 An Outline of the OR Method

In this method, the original event set is reduced to a set consisting of events which are most 'distant' from each other. An important feature of this method is that the resulting set will include events which delineate the 'outside' of the events in the original set.
For example, if the 'true' but unknown decision class is a circle and its interior, and the original event set consists of a number of randomly selected points from this class, then the reduced set will be a set of points lying on or close to the perimeter of the circle and spanning a polygon with approximately equal sides.

This method is, however, very sensitive to events which differ significantly from the rest of the events in the original set. If such events happen to be errors, then these errors will have a strong effect on the result. To circumvent this problem, an additional test could be done which selects an event only if it has a certain number of 'close' neighbors*. Figure 1 illustrates this method.

* This feature is not implemented.

Let $e_1$ and $e_2$ denote two given events:

\[ e_1 = (x_1', x_2', \ldots, x_{n_1}',\; x_{n_1+1}', x_{n_1+2}', \ldots, x_{n_2}',\; x_{n_2+1}', x_{n_2+2}', \ldots, x_n') \]
\[ e_2 = (x_1'', x_2'', \ldots, x_{n_1}'',\; x_{n_1+1}'', x_{n_1+2}'', \ldots, x_{n_2}'',\; x_{n_2+1}'', x_{n_2+2}'', \ldots, x_n'') \]

(in each event the first group consists of linear variables, the second of structured variables, and the third of nominal variables), where $x_i'$ and $x_i''$ denote the values of variable $x_i$ in $e_1$ and $e_2$, respectively. Assume, without loss of generality, that the first $n_1$ variables in the events above are interval variables, the following variables (indices $n_1+1$ through $n_2$) are structured variables, and those remaining are nominal variables†.

First, we define a measure of the distance $d(x_i', x_i'')$ between the values of a variable, depending on the type of the variable:

• For linear variables:

\[ d(x_i', x_i'') = \frac{|x_i' - x_i''|}{d_i - 1}, \quad 1 \le i \le n_1 \qquad (4) \]

assuming that the domain of each linear variable is represented by the set $\{0, 1, 2, \ldots, d_i - 1\}$ ($d_i$ is the cardinality of $D_i$, i.e., of the domain of $x_i$).

• For structured variables:

\[ d(x_i', x_i'') = \frac{NB(x_i', x_i'')}{MNB}, \quad n_1 < i \le n_2 \qquad (5) \]

(see Figure 2), where NB is the number of branches on the shortest path linking $x_i'$ with $x_i''$ in the Hasse diagram representing the domain of $x_i$, and MNB is the maximum number of branches on the shortest path linking any two nodes of the diagram.

• For nominal variables:

\[ d(x_i', x_i'') = \begin{cases} 1, & \text{if } x_i' \text{ is not identical to } x_i'' \\ 0, & \text{otherwise} \end{cases} \quad (n_2 < i \le n) \]

† It is assumed here that if the domain of a structured variable is not a g-structure, then the variable is treated as a cartesian variable.

Figure 2. Illustration of a distance between values of a structured variable: NB(a,b) = 3, NB(b,c) = 6, NB(a,c) = 7, MNB = 9, hence d(a,b) = 3/9, d(b,c) = 6/9, d(a,c) = 7/9.

Two types of distance measures between events are considered:

(1) Quantized measure:

\[ d_q(e_1, e_2) = \sum_{i=1}^{n_2} w_i\, q(d(x_i', x_i''), T_i) + \sum_{i=n_2+1}^{n} w_i\, d(x_i', x_i'') \qquad (6) \]

where $T_i = (t_{i1}, t_{i2}, \ldots, t_{ip})$ is a sequence of thresholds $t_{ij}$ associated with variable $x_i$, i = 1, 2, ..., $n_2$, and q is a quantization function

\[ q : \{ d(x_i', x_i'') \} \times \{ T_i \} \to \{0, 1, 2, \ldots, p\} \]

defined as: 0, if $d(x_i', x_i'') \le t_{i1}$; 1, if $t_{i1}$ […]

[…] $d(e_{\min}, e_0) = \min_{e \in E} d(e, e_0)$ and $d(e_{\max}, e_0) = \max_{e \in E} d(e, e_0)$.

3. Determine the distance $d(e_{\min}, e_{\max})$ and divide it into r intervals*, where r is between 0.01 and 0.1 of the size c(E) of the original set E (e.g., if c(E) = 3000 then r is between 30 and 300).

4. Partition E into r subsets, $E_1, E_2, \ldots, E_r$, such that $E_i$ consists of events whose distance $d(e, e_0)$ lies in the ith interval, i = 1, 2, ..., r:

\[ a_{i-1} < d(e, e_0) \le a_i \]

where $a_{i-1}$ and $a_i$ are the endpoints of the ith interval ($a_0 = d(e_{\min}, e_0)$ and $a_r = d(e_{\max}, e_0)$).

* The intervals do not have to be equal. The desired situation is to have intervals which lead to subsets $E_i$ (determined in step 4) of approximately the same size.

5. From each subset $E_i$, i = 1, 2, ..., r, select a subset $E_i^s$ consisting of s events (where s is such that r·s gives the desired size of the reduced event set).
The selection is made in the following way:

1.) Find $e_1$ and $e_2$ in $E_i$ such that

\[ d(e_1, e_2) = \max_{e_a, e_b \in E_i} d(e_a, e_b) \; * \]

2.) Find $e_3$ such that

\[ d(e_3, e_1) \cdot d(e_3, e_2) = \max_{e \in E_i} \big( d(e, e_1) \cdot d(e, e_2) \big) \]

...

s-1.) Find $e_s$ such that

\[ \prod_{j=1}^{s-1} d(e_s, e_j) = \max_{e \in E_i} \prod_{j=1}^{s-1} d(e, e_j) \]

where $\prod$ denotes arithmetic multiplication†.

6. The union of the sets $E_i^s$,

\[ E^s = \bigcup_{i=1}^{r} E_i^s , \]

gives the reduced event set.

* A more computationally efficient process, though one which might lead to a less desirable result, is to replace step 1 by two steps:
1a) find $e_1$ such that $d(e_1, e_0) = \min_{e \in E_i} d(e, e_0)$
1b) find $e_2$ such that $d(e_1, e_2) = \max_{e \in E_i} d(e, e_1)$.

† The reason for using multiplication in steps 2, ..., s-1 is to select events which are at similar distances from each other.

The number of operations required by the algorithm is approximately:

N = c(E) + r ( c(E_i) […] + Σ^{s-1} j (t - j) )

where c(E) and c(E_i) are the cardinalities of E and E_i, respectively. (The E_i are assumed to be all of equal size.) An 'operation' may involve computing the distance between two events, the comparison of two distances, the comparison of a distance with a threshold, etc. In the modified form of the algorithm we have:

N' = c(E) + r Σ^{s-1} j c(E_i)

For example, if c(E) = 3000, c(E_i) = 100, r = 30, s = 10, then N = 273000 (N' = 268000), and the cardinality of the reduced set would be c(E^s) = 300.

1.4 User's Guide for ESEL

INPUT FILES

PARMSX - A file with information about variables.
INST - A file with information about the sizes of the event sets which are in the data base and the sizes of the representative sets of events.
EVNT - The file containing the data base.

PARMSX FILE

This file contains the number of variables in the event descriptions in the data base, the range of values for each variable, the domain structure for each variable, and the weight which should be given to each variable.
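The selection procedure of step 5 (sub-steps 1 through s-1) can be sketched in a few lines. The following is only an illustrative sketch, not the ESEL code: the function names are hypothetical, the distance function is passed in by the caller, and it is assumed that s is at least 2. The two helper functions follow the linear-variable distance (4) and the nominal-variable distance given earlier; structured variables and the quantization of measure (6) are left out.

```python
from itertools import combinations
from math import prod


def linear_distance(a, b, cardinality):
    """Distance (4): values come from the domain {0, ..., cardinality - 1}."""
    return abs(a - b) / (cardinality - 1)


def nominal_distance(a, b):
    """Nominal variables: 1 if the values differ, 0 otherwise."""
    return 0.0 if a == b else 1.0


def select_representatives(events, s, dist):
    """Pick s mutually distant events from one subset E_i (assumes s >= 2)."""
    if len(events) <= s:
        return list(events)
    # Sub-step 1: the most distant pair in the subset.
    _, i, j = max((dist(events[i], events[j]), i, j)
                  for i, j in combinations(range(len(events)), 2))
    chosen = [i, j]
    # Sub-steps 2 .. s-1: add the event that maximizes the product of its
    # distances to every event chosen so far.
    while len(chosen) < s:
        rest = [k for k in range(len(events)) if k not in chosen]
        best = max(rest, key=lambda k: prod(dist(events[k], events[c])
                                            for c in chosen))
        chosen.append(best)
    return [events[k] for k in chosen]


# Hypothetical one-variable example: eleven events 0..10 on a line.
picked = select_representatives(list(range(11)), 3, lambda a, b: abs(a - b))
print(picked)  # [0, 10, 5]
```

In the example the two extreme events are chosen first, and the third event is the one in the middle, which is what the product criterion is meant to achieve: events at similar distances from those already selected.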
The first number in this file must be the number of variables in each event description. The next three specifications are all in the same form: a number, optionally followed by a set of numbers. The first number may be used to set all values of the range, structure, or weight to a single value. If all values in a specification are to be set to one value, then this first number should be this value and there is no following set of values. If a value is to be specified for each variable, the first number should be zero (0).

For example, suppose there are 3 variables with the following situation:

               x1          x2         x3
max. value     3           2          4
structure      interval    nominal    interval
weight         5           6          3

The file PARMSX would look like this:

3
0 3 2 4      max value
0 3 0 3      structure
0 5 6 3      weights

The first 3 indicates 3 variables; the first 0 indicates that the range will be specified for each variable (a value other than 0 would indicate that all variable ranges will be of that value). The second 0 indicates that the structure of all variables will be individually specified. An interval structure is specified by the number 3; any other number gives a nominal structure. The final 0 indicates that weights will be specified independently for each variable.

Here is another example: suppose there are 4 variables in the data base with the following characteristics:

               x1          x2          x3          x4
max. value     3           3           3           3
structure      interval    interval    interval    interval
weight         6           5           1           1

Then the PARMSX file would look like this:

4
3
3
0 6 5 1 1

The 4 indicates that there are 4 variables describing objects in the data base. The next two 3's indicate that all ranges are from 0 to 3 and that all variables are of interval structure. The 0 indicates that weights will be specified independently.

INST FILE

This file contains information about the number of events in each class in the data base and the number of events from each class the program is to select. Each class is specified by two lines in this file.
The first line specifies the number of events in the data base which correspond to the class; the second line specifies the number of partitions and the total number of events which are to be selected from the class. A class with 0 events, 0 partitions, and 0 selected events terminates the file. For example, a data base with 100 events, the first 50 of which are to be in the first class, the next 20 in the second class and the last 30 in the third, may be specified as follows:

INST:
(blank)   - the first line must be blank
50        - the first 50 events are in the first class
1 5       - using 1 partition, select 5 events
20        - the second class has 20 events
1 6       - using 1 partition, select 6 events
30        - the last class has 30 events
3 20      - using 3 partitions, select 20 events
0
0 0       - the last class

EVNT FILE

This file contains the actual data base. Events are stored as lists of integers. Irrelevant values are stored as -1. The first line of this file must be blank. For example, a situation with 5 events and 3 variables:

EVNT FILE
(blank)
0 -1 3
2 2 3
3 3 1
3 1 1
1 0 […]

OUTPUT FILES

OFILEX - A file with the selected events.
TOPT - A file with the remaining events which were not selected.

Each file is in the same form as the input file EVNT except that a class number is appended to the beginning of each event. This output format is compatible with the VL1 mode of the INDUCE-1 program (Larson, Michalski 77; Larson 77a,b) and with the program AQ11.

2. INCREMENTAL GENERATION AND TESTING OF VL1 HYPOTHESES: Program AQ11

2.1 Introduction

There are many situations when one starts with certain initial hypotheses about given data and then, in the process of experimenting with these hypotheses, has to modify them in order to preserve consistency with newly acquired facts. Such situations arise, e.g., in rule-based expert systems, where in the course of a system's performance some rules are discovered to be incorrect or incomplete and have to be modified.
A process of generating hypotheses (or descriptions) in steps, where each step starts with certain working hypotheses and a set of (new) data and ends with appropriately modified hypotheses, is called an incremental (or multi-step) generation of hypotheses. The purpose of program AQ11 is to implement such an incremental generation of hypotheses in the framework of the variable-valued logic system VL1 (Michalski 74). Although this framework is extremely restricted from the viewpoint of the complexity of real scientific research, it is still sufficiently rich to provide an interesting research subject and to yield solutions which may have practical applications.

Hypotheses are expressed here as (constant-free) disjunctive normal VL1 expressions (DVL1 expressions*). A DVL1 expression is a disjunction of terms, where a term is a logical product of selectors. A selector is a statement in the form:

[x # R]

where
x is a unary descriptor (variable),
# denotes any of the relational operators = ≠ ≥ ≤,
R is a list of constants which are elements of the domain of x (R is called the reference of the selector).

* In the general case, DVL1 expressions involve constants and are multiple-valued logic expressions [Michalski 74]. Here, for simplicity, we will assume initially that they are just binary (i.e., either satisfied or not satisfied) and have no constants.

When a DVL1 expression is evaluated for a given event, selectors are interpreted as conditions (or questions). A selector is satisfied if the value of the variable in the event satisfies the condition; otherwise, it is not satisfied. Some examples of selectors and their interpretation as conditions follow:

[x_i = 1]       is x_i equal to 1?
[x_i = 1,3]     is x_i equal to 1 or 3?
[x_i = 1..3]    is x_i between 1 and 3, inclusive?

An example of a term:

[x_1 = 3][x_3 = 2,4,5][x_5 = 0]

The above term is satisfied if x_1 equals 3, x_3 has value 2, 4, or 5, and x_5 has value 0.
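The way selectors and terms are evaluated can be illustrated with a short sketch. It is not the AQ11 implementation: the function names and the dictionary representation of an event are assumptions made for the illustration, and only the '=' operator with a list or range reference is handled.

```python
def selector_satisfied(event, variable, reference):
    """[x_variable = reference]: true if the event's value is in the reference."""
    return event[variable] in reference


def term_satisfied(event, term):
    """A term is a logical product of selectors: all of them must hold."""
    return all(selector_satisfied(event, variable, reference)
               for variable, reference in term)


def interval(low, high):
    """Reference of a range selector such as [x = 1..3], inclusive."""
    return set(range(low, high + 1))


# The term [x_1 = 3][x_3 = 2,4,5][x_5 = 0] as (variable, reference) pairs.
term = [(1, {3}), (3, {2, 4, 5}), (5, {0})]

# A hypothetical event, given as a mapping from variable index to value.
event = {1: 3, 2: 1, 3: 4, 4: 0, 5: 0}

print(term_satisfied(event, term))                   # True
print(selector_satisfied(event, 2, interval(1, 3)))  # True
```

Variables that do not appear in a term (x_2 and x_4 here) place no condition on the event.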
An example of a DVL1 formula:

T_1 ∨ T_2 ∨ T_3

where T_1, T_2, T_3 are terms. The formula is satisfied if term T_1 or T_2 or T_3 is satisfied. A DVL1 formula is interpreted as a description of a set of events, namely the events which satisfy it.

2.2 Description of Methodology

Suppose there is given a set of hypotheses (DVL1 descriptions), $V = \{V_i\}$, i = 1, ..., m, and a family of event sets ('facts'), $F = \{F_i\}$, which these hypotheses are supposed to describe. Suppose that for any i, $V_i$ describes correctly only a part of the events from $F_i$.

The problem is to produce a new set of hypotheses, $V' = \{V_i'\}$, where each $V_i'$ describes all events from set $F_i$ and does not describe events from any other event set $F_j$, $j \ne i$.

The following solution to this problem is based on the multiple application of a computer program implementing an efficient algorithm [Michalski 71] for determining a cover, $C(E_1/E_0)$, of an event set $E_1$ against the event set $E_0$. Such a cover can be interpreted as a DVL1 expression which is satisfied by every event in $E_1$ and not satisfied by any event in $E_0$ (or in $E_0 \setminus E_1$, if $E_0$ and $E_1$ intersect).

The solution consists of 4 major steps:

Step 1. The first step isolates those facts which are not consistent with the given hypotheses. For each hypothesis, two sets are created:

$F^+$ - a set of events which should be covered by the hypothesis, but are not
$F^-$ - a set of events which are covered by the hypothesis, but should not be covered.

(An event is said to be covered by a hypothesis if the event satisfies the VL1 formula which represents the hypothesis.) Specifically, this step determines, for each i, i = 1, 2, ..., m, the sets*:

\[ F_i^+ = F_i \setminus \hat{V}_i \qquad (8) \]
\[ F_{ij}^- = \hat{V}_i \cap F_j, \quad j = 1, 2, \ldots, m;\; j \ne i \qquad (9) \]

(see Figure 3). Thus, $F_i^+$ denotes events which should be covered by $V_i$ but are not, and $F_{ij}^-$ denotes 'exception' events, i.e., events in $F_j$, $j \ne i$, which are covered by $V_i$ but should not be covered.

Step 2. This step determines, for each i, a generalized formula $V_i^-$
describing all exception events (the union of sets $F_{ij}^-$, j = 1, 2, ..., m, $j \ne i$). This is done by generating, for given i and each j, a cover of $F_{ij}^-$ against the events in the sets $\hat{V}_l \cup F_l^+$, l = 1, 2, ..., m:

\[ V_{ij}^- = C\Big( F_{ij}^- \,\Big/\, \bigcup_{l=1}^{m} \big( \hat{V}_l \cup F_l^+ \big) \Big) \qquad (10) \]

and then taking the logical union

\[ V_i^- = \bigvee_{j=1,\; j \ne i}^{m} V_{ij}^- \qquad (11) \]

* $\hat{V}_i$ denotes the set of events covered by formula $V_i$.

Figure 3. Illustration of sets $F_i^+$ and $F_{ij}^-$.

The reason for this step is that it is computationally more efficient to use the formulas $V_i^-$ than the union of $F_{ij}^-$, j = 1, 2, ..., m; $j \ne i$.

Step 3. New 'correct' hypotheses could be obtained now by 'subtracting' from each $V_i$ the formula $V_i^-$ and 'adding' to it the set $F_i^+$. To do this directly, however, is difficult. Again, advantage is taken of the available computer program for generating covers $C(E_1/E_0)$. Namely, the new hypotheses $V_i'$, i = 1, 2, ..., m, are determined as covers:

\[ V_i' = C\Big( F_i \,\Big/\, \bigcup_{k=1,\; k \ne i}^{m} \big[ (\hat{V}_k \setminus \hat{V}_k^-) \cup F_k^+ \big] \Big) \qquad (12) \]

(The point is that directly simplifying a union of terms is difficult, but 'subtracting' a term from a term or generating a cover of an event set against a DVL1 formula is easier.)

Step 4. This step determines the final representation of the hypotheses $V_i'$. The $V_i'$ are DVL1 expressions which are unions of terms. Some terms in a $V_i'$ may represent (cover) only a few events in $F_i$. Such 'low weight' terms are replaced by the events (facts) themselves (since an event takes less memory than a term). In program AQ11, the parameter PUNY specifies the minimum fraction of events which a term has to cover to be a 'high weight' term. For example, if PUNY = 0.02 and a set $F_i$ has 100 events, then all terms which cover 3 or more events (3 > 0.02 × 100) are 'high weight' terms. Terms which cover 1 or 2 events are replaced with those events.

2.3 An alternative way of handling exception events

In the procedure above, the exception events were represented by terms in $V_i^-$.
If the number of exception events is small, it can be easier to handle the events without turning them into expressions $V_i^-$. The 'subtraction' (denoted by \) of an event e from a term T (in a given formula) is done by logically multiplying the term by the negation of the event:

\[ T \setminus e = T \wedge \neg e \qquad (13) \]

In order to use this way of handling exception events, the parameter STGY in program AQ11 should be set to the value 2 (STGY=2).

Operation (13) can produce several terms. Any one of them is sufficient to be used in the new hypothesis. In program AQ11, there is a parameter #EX which specifies how many such terms a user wants to store for representing a hypothesis. If the number of generated terms is larger than #EX, the program selects the #EX 'best' terms according to the criteria list.

2.4 Additional Features

There may exist certain restrictions on the event space which must hold in the resulting formulas. A restriction may be of the form

[x_3 = 2] -> [x_1 = NA]     (NA = not applicable)

which is read "if x_3 has the value 2 then the variable x_1 is not applicable." The implementation of these restrictions can be viewed as an extra set of hypotheses $V_{n+1}$, which is included in the set $E_0$ of all covers:

\[ C(F_i \,/\, E_0 \cup V_{n+1}) \]

Due to the techniques used in the covering algorithm (namely, the use of the parameter 'maxstar', see p. 27), this may not be the best approach, since only a few terms in each intermediate quantity are retained. Therefore, the program imposes these restrictions on all facts in the set $F = \{F_i\}$. Using the above restriction, an event

e = (1 3 2)

is replaced with

e = (NA 3 2)

2.5 Testing Procedure

By applying the part of the AQ11 program described above, one can determine DVL1 descriptions (hypotheses) of classes of objects from examples of objects representing individual classes. An obvious problem arises of testing the validity of the derived descriptions. This is done by applying the descriptions to new examples of objects with known class membership.
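The way restrictions are imposed on facts (Section 2.4) can be sketched as follows. This is an illustration only, not the AQ11 code: the function name and the tuple encoding of a restriction are hypothetical, and the restriction used is assumed to read [x_3 = 2] -> [x_1 = NA], the index choice that reproduces the example e = (1 3 2) -> (NA 3 2).

```python
NA = "NA"  # marker for 'not applicable'


def apply_restrictions(event, restrictions):
    """Return a copy of the event with restricted variables set to NA.

    Each restriction is ((condition_variable, condition_value), target_variable)
    with 1-based variable indices. Conditions are tested against the original
    event, so one restriction cannot hide the condition of another.
    """
    values = list(event)
    for (condition_variable, condition_value), target_variable in restrictions:
        if event[condition_variable - 1] == condition_value:
            values[target_variable - 1] = NA
    return tuple(values)


# Assumed restriction: if x_3 has the value 2, then x_1 is not applicable.
restrictions = [((3, 2), 1)]

print(apply_restrictions((1, 3, 2), restrictions))  # ('NA', 3, 2)
print(apply_restrictions((1, 3, 0), restrictions))  # (1, 3, 0)
```

Imposing the restriction on the facts themselves, rather than adding it as an extra hypothesis to every cover, keeps the covering algorithm unchanged, which matches the reason given in the text.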
The results of such testing are usually represented in the form of a confusion matrix. This matrix specifies, for each class (a row in the matrix), the numbers of testing objects of this class which were assigned by the descriptions to individual classes (corresponding to columns of the matrix). Below is an example of a confusion matrix involving 2 classes: a class of cancer cells and a class of non-cancer cells:

Class                     Assigned Decision
(Correct Assignment)      Cancer cells     Non-cancer cells
Cancer cells              28               2
Non-cancer cells          7                23

Entries on the diagonal indicate correct decisions; entries outside of the diagonal indicate incorrect decisions. For example, the number 7 in the second row indicates that 7 (testing) non-cancer cells were classified incorrectly as cancer cells.

This form of confusion matrix is adequate if an event (object) either satisfies or does not satisfy a formula. In general, however, it is desirable to consider the degree to which a given event e satisfies or matches a formula. Such a degree, called the degree of consonance (or degree of match) and denoted DC(e,V), is computed according to an evaluation scheme. An evaluation scheme consists of definitions for computing:

(1) DC(S,e) - a degree of consonance between a selector and an event (briefly, degree of consonance of a selector),
(2) DC(T,e) - a degree of consonance of a term (a product of selectors),
(3) DC(V,e) - a degree of consonance of a DVL1 formula (union of terms),
(4) DC({V_i}, e) - a degree of consonance of a set of formulas (describing the same class).

Many different evaluation schemes can be applied for evaluating DVL1 formulas. Methods developed in many-valued logic (e.g., Rescher 69) and fuzzy reasoning (e.g., Zadeh 74, Gaines 76) are applicable here. We will describe the evaluation scheme currently implemented in program AQ11, and give suggestions for other evaluation schemes.

(1) Definition of degree of consonance of a selector.
The basic definition of the degree of consonance, DC(S,e), of a selector comes from the evaluation rules in VL1 [Michalski 74]. Assuming that the output domain of the formulas is D = {0,1}, we have:

DC(S,e) = 1, if the value of the appropriate variable in e satisfies the selector S
DC(S,e) = 0, if it does not satisfy S
DC(S,e) = *, if the value is unspecified

For example, suppose event e = (x_1, x_2, x_3) = (3, 1, 1), and selector S is [x = 1,3], where x is a variable whose value in e is 1. We have DC(S,e) = 1, because the value of x in e is a member of the reference of the selector (i.e., 1 is a member of {1,3}) (Fig. 4).

Alternative evaluation schemes can take into consideration the structure of the domain of the variable in the selector. If a variable is linear, it seems that the above definition of DC(S,e) is too rigid. For example, if a linear variable x_i = 13 and S: [x_i = 14..18], the selector is evaluated to 0, while it seems desirable to evaluate it to some value greater than 0 (since 13 is so 'close' to 14). This means that one could accept a 'bell-shaped' function for evaluating interval selectors (Fig. 5).

The concept of 'trimming' a term can also be useful here. In an untrimmed (extended) term, references of selectors (sets of values) are as large as possible without leading to a contradiction, i.e., an intersection with formulas of different classes. In the trimmed term, references are as small as possible, provided that the term still covers the same learning events as the extended term and preserves the type of selectors.

Figure 4. A graphical illustration of the evaluation of selector [x = 1,3].

Figure 5. A bell-shaped (A) versus step-shaped (B) function for evaluating a linear selector [x_i = 3..5].

For example, if the reference of a linear selector is a..b, then in the trimmed selector it will be an interval a'
..b', a […]

[…] If NPASS > 1, the program will partition the set of facts into sets whose size depends on the PCT parameter (see below) and form hypotheses based on old hypotheses and the partitioned facts. Then, new sets of events will be taken in turn and hypotheses formed based on the entire set of events taken up to that point and the hypotheses from the last pass.

• TEST
Example: TEST = '0'B (default value)
Possible values: '1'B or '0'B
If TEST is '1'B, then a confusion matrix will be computed after each pass. The testing events must be given to the program in the file TESTF.

• RESTRICT
Example: RESTRICT = '0'B (default value)
Possible values: '1'B or '0'B
If RESTRICT is '1'B, then a set of restrictions is accepted by the program (see parameter REST) and applied to all events.

• RTEST
Example: RTEST = '0'B (default value)
Possible values: '1'B or '0'B
If RTEST is '1'B, the restrictions will also be applied to testing events.

• TRANS
Example: TRANS = '0'B (default value)
Possible values: '1'B or '0'B
If TRANS is set to '1'B, then the variable names and values are translated into descriptive names in the output. In this case, a file TRAN must be given to the program (see TRAN below).

• PUNY
Example: PUNY = 0.02 (default value)
Possible values: real value in interval [0:1]
All terms which cover less than a percentage (PUNY*100) of the events of the corresponding set will be discarded in the next pass (i.e., if a term covers 2 events, PUNY = 1, and there are 23 events in the training set, this term will not be used in the next pass).

• TAU
Example: TAU = .019 (default value)
Possible values: real values in interval [0:1]
This parameter relates to the computation of the confusion matrix. Any two values (degrees of consonance) within TAU of each other are considered to be of the same rank. For example, if .98 is the highest decision value for a testing event, any decision with a value between .96 and .98 would be a rank 1 decision (assuming the default value of TAU).
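The ranking controlled by TAU can be sketched as below. The sketch rests on an assumption that the text does not settle: a decision belongs to the current rank if its degree of consonance is within TAU of the highest value in that rank; otherwise it opens the next rank. The function name and the class labels are hypothetical.

```python
def rank_decisions(degrees, tau=0.019):
    """Map each class to a rank (1 = best) from its degree of consonance.

    Assumption: a value joins the current rank when it lies within tau of
    that rank's highest value; otherwise it starts a new rank.
    """
    ordered = sorted(degrees.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    rank = 0
    leader = None  # highest degree of consonance in the current rank
    for name, value in ordered:
        if leader is None or leader - value > tau:
            rank += 1
            leader = value
        ranks[name] = rank
    return ranks


# Hypothetical degrees of consonance for one testing event.
degrees = {"class A": 0.98, "class B": 0.97, "class C": 0.90}
print(rank_decisions(degrees))
# {'class A': 1, 'class B': 1, 'class C': 2}
```

With the default TAU, classes A and B are tied at rank 1, so the testing event would count as ambiguously assigned between them.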
• IRK
Example: IRK = 2 (default value)
Possible values: positive integer less than NCL
This parameter also relates to the computation of the confusion matrix and controls the number of decisions printed out. All degrees of consonance of rank not greater than IRK are printed; others are not printed. If IRK = 1, only rank 1 degrees are printed. One exception to this is that the degree associated with the correct decision is always printed.

• NCRIT
Example: NCRIT = 2 (default value)
Possible values: integer in range [1:4]
NCRIT specifies the number of cost criteria which should be applied when computing the cost of a formula (see CRIT). CRIT(1) through CRIT(NCRIT) will be used; all others will be ignored.

• CRIT(I)
Example: CRIT(1) = 1 CRIT(2) = 2 (default value)
Possible values: each CRIT(I) may have values 1, 2, 3, 5 and 9
CRIT(I) = J specifies that the I-th criterion in order will be the cost function J. There should be NCRIT specifications indicating the cost function which will be used (J) and the order in which they will be applied (I). Available cost functions are the following:
1. Maximize the number of events covered by the given term and not covered by previous terms
2. Minimize the number of selectors
3. Minimize the cost of all variables in this term. If this criterion is specified, costs of variables must also be specified (see Z parameter)
5. Minimize the number of events of E0 covered
9. Maximize the total number of events covered by a term

• #EX
Example: #EX = 1 (default value)
Possible values: positive integer
During some phase of the program, exception terms are formed (descriptions of events which are covered by hypotheses but should not have been). #EX gives the number of redundant exception terms (i.e., the terms which cover the same event).

• TR
Example: TR = '0'B (default value)
Possible values: '1'B or '0'B
TR gives a trace of the multi-step process, giving the exception terms and the sizes of the sets F+ and F- described in Section 2.2.
• STGY
Example: STGY = 1 (default value)
Possible values: 1 or 2
If STGY has the value 1, then exception terms are formed for events in the sets F-. If STGY has the value 2, then the previous hypotheses are multiplied by the complement of the exception events of the set F-.

• INDEP
Example: INDEP = '0'B (default value)
Possible values: '0'B or '1'B
If INDEP is '1'B, then the number of independently covered events is printed for each complex. Otherwise, only the number of new events and the total number of events covered are printed.

• TITLE
Example: TITLE = 0 (default value)
Possible values: non-negative integer
TITLE specifies the number of cards which are in the title. The title cards must follow the semicolon which terminates the set of control parameters.

• OPT
Example: OPT = '1'B (default value)
Possible values: '1'B or '0'B
If OPT is '1'B, then after each pass a table is printed indicating the number of times each cost criterion is evaluated (the number of terms for which the cost function is evaluated).

• MODE
Example: MODE = 'IC' (default value)
Possible values: 'IC', 'DC', 'VL'
If MODE = 'IC', then covers are allowed to intersect over 'DON'T CARE' areas of the event space. If MODE = 'DC', the covers are constrained to be disjoint. MODE = 'VL' gives order-dependent covers.

• CPXEV
Example: CPXEV = '1'B (default value)
Possible values: '1'B or '0'B
If this parameter is '1'B, then during the testing phase a table is printed which gives the number of times each term was needed to give a correct decision.

• GEN
Example: GEN = '1'B (default value)
Possible values: '1'B or '0'B
If this parameter is '1'B, then only the necessary parts of the reference of each output complex are printed (i.e., a new term is created from the generated term which has the following properties):
a. The new term covers the same events.
b. The new term contains the same variables.
c. The references in the new term are as small as possible.
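The three properties listed for GEN describe what the text earlier calls trimming a term. A minimal sketch of that operation follows; it is not the AQ11 routine. The function name, the dictionary representation of terms and events, and the assumption that every covered event has a specified value for each variable of the term are all choices made for this illustration. For linear variables the trimmed reference is kept as an interval, in line with the requirement that trimming preserves the type of selectors.

```python
def trim_term(term, covered_events, linear_variables=()):
    """Shrink each reference to the values needed by the covered events.

    term: mapping variable -> set of allowed values (the reference).
    covered_events: events (mapping variable -> value) satisfying the term.
    linear_variables: variables whose trimmed reference must stay an interval.
    """
    if not covered_events:
        return {variable: set(ref) for variable, ref in term.items()}
    trimmed = {}
    for variable, reference in term.items():
        used = {event[variable] for event in covered_events}
        if variable in linear_variables:
            # Keep an interval a'..b' spanning the values actually used.
            used = set(range(min(used), max(used) + 1))
        trimmed[variable] = used & set(reference)
    return trimmed


# Hypothetical extended term [x_1 = 0..3][x_2 = 0..4] and two covered events.
term = {1: {0, 1, 2, 3}, 2: {0, 1, 2, 3, 4}}
events = [{1: 1, 2: 0}, {1: 1, 2: 3}]

print(trim_term(term, events, linear_variables={2}))
# {1: {1}, 2: {0, 1, 2, 3}}
```

The trimmed term keeps both variables, still covers both events, and its references cannot be reduced further without losing an event or breaking the interval form of x_2.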
• ECHO
Example: ECHO = 'ERZ' (default value)
Possible values: a string containing any of the characters Z, E, R, F
If a letter appears, the corresponding input data is echoed.
E = Events
R = Restrictions
Z = Variable costs
F = Input formulas
The default echoes events, restrictions and variable costs if they are in the input.

• TOLERANCE(I)
Example: TOLERANCE(2) = 0.0 (default value)
Possible values: integer, or real in [0:1]
TOLERANCE(I) is the tolerance for the I-th criterion specified. If it is an integer, then it is assumed to be an absolute tolerance. Otherwise, it is a relative tolerance calculated as TOLERANCE * (MAX-MIN), where MAX and MIN are the maximum and minimum elements in the list of costs to be sorted.

• ORD
Example: ORD = '1'B (default value)
Possible values: '1'B, '0'B
If ORD is '1'B, then the program will reorder the events in E0, in decreasing order, with regard to the distance from e_1.

• N-TAU
Example: N-TAU = 0 (default value)
Possible values: integer in [0:8]
This parameter, if not zero, generates a TAU estimation table giving summary information for each class in the evaluation procedure, using N-TAU values of TAU beginning at 0 with increments of TAU-INC.

• TAU-INC
Example: TAU-INC = .02 (default value)
Possible values: real in [0:1]
This is the increment used in the TAU estimation table.

Semicolon (;): This must be entered to terminate the control parameters.

2.6.2 Data parameters

These parameters have the names as used in the program. In the input to the program only their values are specified, in the order given here (see Fig. B-1(a) for an example).

• TITLEC
Possible values: the number of lines specified by the TITLE parameter
These lines are printed at the top of the output.

• NSPEC
Possible values: an integer in the range [0:NV]
Number of variables for which a structure is to be specified.

• VTYPE
Possible values: 'F', 'I'
The NSPEC variables will be of this type ('F' - nominal variable, 'I' - linear variable).
• TYPE
Possible values: A list of NSPEC integers in the range [1:NV]
The list indicates variables of VTYPE.
Example of NSPEC, VTYPE, TYPE: 3 'F' 1 3 5
There are 3 variables of type 'F' (nominal), namely variables 1, 3, and 5. The rest will have type 'I'.

• NL
Possible values: A list of NV positive integers in the range [1:8]
This parameter gives the number of values which each variable can assume.
Example: 12 4

• NE
Example: 3 1 4
Possible values: A list of NCL integers in the range [0:NEVE]
The parameter specifies the number of events in each event set. The sum should add up to NEVE.

• NF
Example: 3 4 1
Possible values: A list of NCL non-negative integers
This parameter specifies the number of terms of the hypothesis for each event set.

• PCT
Example: .2 .4 1
Possible values: A list of NPASS real values in the range [0:1] (except if NPASS = 1, PCT is assumed to be 1)
In this example, 20% of the events will be described first, then an extra 20% of the events will be added and a description formed using previous hypotheses. Finally, the complete set of events is used (see NPASS above).

• REST
Example: (x12 = 1) -> (x14 = *); (x13 = 2) -> (x1 = *)(x4 = 1).
Possible values: A list of decision rules separated with semi-colons and terminated with a period
These restrictions will be applied to all events (i.e., added to current specifications). RESTRICT must be set to specify restrictions. An * in the reference indicates that this variable is not applicable. Restrictions are separated by semi-colons and the list of all restrictions is terminated by a period.

• EVENT
Possible values: NEVE lists of events, NEVE = SUM(NE)
There are two ways in which events can be specified, and the two types of specifications can be mixed.
1. An event can be specified as a list of values, one value for each variable. The values can be:
   a) non-negative integer - indicating the value of the variable
   b) -1 - variable does not apply
   c) -2 - do not know the value
   Example: 3 2 0 -1 -2 0 4 1 2
2.
An event can also be specified by a VL1 formula which is preceded by a line which says FORMULA. Each formula must be terminated by a semi-colon.
   Example:
   FORMULA. (x1 = 2)(x3 = 0);
   FORMULA. (x3 = 1)(x21 = 2);

• FORMULA
Possible values: NCL lists of formulas, each having NF complexes
There are two ways to specify a formula:
1. as a FORMULA, as in the event specification,
2. as a binary positional bit string in PL/1 List Format.

• Z
Example: Z(1,2) = 9 Z(3,*0 = 57;
Possible values: Integer values terminated by a semi-colon, in PL/1 Data Format
These are the costs of the variables, which are accepted if CRIT(I) = 3 has been specified for the event set I. If a Z value is not specified for some variable, it is assumed to be 1. Z(I,J) = Y means that variable xJ has cost Y for event set I.

2.6.3 Files

• TEST
This file must be included if the parameter TEST is set to '1'B. The first line of this file contains a list of NCL values indicating the number of test events for each event set. The list of testing events follows. Each event is specified as a list of variable values, with the coding of -1 and -2 as above.

• TRAN
This file must be included if TRANS is '1'B. It contains the names of all variables and variable values. Each name will be truncated: variable names to 20 characters, value names to 10 characters. The format is the following. For each variable one specifies:
   variable name, variable value names
Each name must be in single quotes.
   Example:
   //TRAN DD *
   'TEMPERATURE' 'COLD' 'MODERATE' 'WARM'
   'HUMIDITY' 'DAMP' 'DRY'

• TESTF
This is a temporary file which the program uses to store test events from one pass to the next. See the JCL set-up for the specification of this file.

This completes the description of the input specification to the program AQ11. For the user's convenience, Appendix A gives a summary of the input specification. Appendix B gives an example of input and output from the program.

2.6.4 Program Output

Most of the output is self-explanatory (see Appendix B).
The input data is echoed when specified. Then, the formulas for each pass are printed. To the right of each term is a pair of numbers which specify the number of new events covered and the total number of events covered by that term. After all the formulas for one pass are printed, a confusion matrix is printed for these formulas and the given testing data. Information about each pass is printed in turn until all passes are complete.

If two events of different classes are identical, then a message is printed indicating a non-disjoint representation of classes. In such a situation, if a cover C(E1/E0) is being created, then the event of E0 is ignored.

The output from the evaluation part of the program consists of an extended confusion matrix, as described in section 2.5. Two other tables are printed at the user's option. If CPXEV is set, then a table listing the number of correct decisions for each complex is given. If N-TAU is not zero, then a TAU estimation table is printed, giving the indecision ratio and number of correct decisions for each class for N-TAU values of TAU, beginning with 0 in increments of TAU-INC.

3. SUMMARY

We have described here the underlying methodology and computer programs for selecting 'best' learning VL1 events (program ESEL), and for incrementally generating VL1 hypotheses for given event sets (e.g., selected by program ESEL) and then automatically testing them on the supplied testing events (program AQ11). These two programs constitute a package which can be used for experiments in the induction of descriptions from examples in various applied fields.

ACKNOWLEDGMENT

This work was supported in part by the National Science Foundation under Grant NSF MCS 76-22940, and in part by a Senior Visiting Fellowship from the Science Research Council in the U.K. The paper was written while one of the authors, R. S. Michalski, was spending his sabbatical leave at the University of Essex in England.
This author would like to express here his deepest gratitude to Professor Brian Gaines, head of the Electrical Engineering Department of the University of Essex, for his unusual hospitality and help in organizing life in the new environment, as well as for the numerous and inspiring discussions. Thanks are also due to Tom Dietterich for the strenuous job of proofreading this paper.

REFERENCES

1. Cuneo, R. P. Selected Problems of Minimization of Variable-Valued Logic Formulas. Report No. 726, Department of Computer Science, University of Illinois, Urbana, Illinois, 1975.
2. Forsburg, A. S. A User's Guide for AQPLUS, an internal report, Department of Computer Science, University of Illinois, Urbana, 1975.
3. Gaines, B. R. Foundations of Fuzzy Reasoning, International Journal of Man-Machine Studies, No. 8, 1976.
4. Jensen, G. M. Determination of Symmetric VL1 Formulas: Algorithm and Program SYM4. Report No. 774, Department of Computer Science, University of Illinois, Urbana, Illinois, 1975.
5. Larson, J. Inductive Inference in the Variable Valued Predicate Logic System VL21: Methodology and Computer Implementation, Report No. 869, Department of Computer Science, University of Illinois, Urbana, Illinois, 1977.
6. Larson, J. INDUCE-1: An Interactive Inductive Inference Program in VL21 Logic System. Report No. 876, Department of Computer Science, University of Illinois, Urbana, Illinois, 1977.
7. Larson, J., Michalski, R. S. Inductive Inference of VL1 Rules. Workshop on Pattern Directed Inference Systems, Honolulu, Hawaii, May 1977.
8. Larson, J., Michalski, R. S. AQVAL/1 (AQ7) User's Guide and Program Description. Report No. 731, Department of Computer Science, University of Illinois, Urbana, Illinois, 1975.
9. Michalski, R. S. Toward Computer-Aided Induction: A Brief Review of Currently Implemented AQVAL Programs, Report No. 874, Department of Computer Science, University of Illinois, Urbana, Illinois, May 1977.
10. Michalski, R. S.
On the Selection of Representative Samples from Large Relational Tables for Inductive Inference, Department of Information Engineering, University of Illinois at Chicago Circle, July 1975.
11. Michalski, R. S. VL1: Variable-Valued Logic System. 1974 International Symposium on Multiple-Valued Logic, West Virginia University, Morgantown, West Virginia, May 1974.
12. Michalski, R. S. A Geometrical Model for the Synthesis of Interval Covers. Report No. 461, Department of Computer Science, University of Illinois, Urbana, Illinois, 1971.
13. Rescher, N. Many-Valued Logic, McGraw-Hill, New York, 1969.
14. Stepp, R. Uniclass Cover Synthesis: Methodology and a Computer Program Description, Report No. , Department of Computer Science, University of Illinois, Urbana, Illinois,
15. Zadeh, L. A. Fuzzy Logic and its Application to Approximate Reasoning, Proceedings IFIP Congress 1974, Vol. 3, North-Holland, 1974.

APPENDIX A

AQ11 Input Specifications

1. ID parameters

Allow 150 to 180 K bytes of storage for large problems. A very small problem may be run in 120 K. Very few IOREQ's are used by the program; 500K is more than enough. Time is the main variable which must be adjusted. Using the following parameters, an estimate of the time required for a large job can be given.

   MAXSTAR = 1
   NPASS = 3
   NV = 35
   NCL = 19
   NEVE (training) = 307
   No evaluation
   Time: 1 min.
   Region: 174K

Changing MAXSTAR to 7 and requesting evaluation using 388 events, the time increased to 3 minutes for the training phase and 1 minute and 30 seconds for evaluation.

2. JCL

The following JCL is recommended:

   // EXEC PGM=ITCN3,REGION=180K,PARM='ISA(N),REPORT'
   //STEPLIB DD DSN=USER.P2123.ITCN3,DISP=SHR
   //SYSPRINT DD SYSOUT=A
   //PLIDUMP DD SYSOUT=A
   //SAVEF DD DSN=&&TEMP,UNIT=DISK,SPACE=(TRK,(10,1))
   //FT06F001 DD SYSOUT=A
   //SYSIN DD *
      input parameters and data
   //TEST DD *
      test data (if evaluation requested)
   //TRAN DD *
      translation data (if TRANS is set)

ISA(N): N should be the region requested minus 125, e.g.
51K

3. Control parameters

Parameter       Default   Description
NV                        Number of variables
NEVE                      Total number of training events
NCL                       Number of classes
MODE            'IC'      Mode of operation
MAXSTAR         10        Maximum star size
ECHO            'ERZ'     Echo input
NCRIT           2         Number of criteria
CRIT(1)         1         Criterion 1
CRIT(2)         2         Criterion 2
CRIT(3)         5         Criterion 3
CRIT(4)         9         Criterion 4
TITLE           0         Number of lines in title
RESTRICT        '0'B      Accept restrictions
SAVE            '0'B      Save formulas in a file SAVEF
GEN             '1'B      Trim complexes for output and evaluation
PUNY            .02       The minimum percent of events which has to be covered by a term
TR              '0'B      Trace multi-step procedure
NPASS           1         Number of steps
STGY            1         Way in which events of F sets are handled
#EX             1         Number of redundant exception complexes
OPT             '1'B      Print statistics about the number of times each cost function is evaluated
TRANS           '0'B      Translate output using TRAN file
TEST            '0'B      Evaluate formulas
RTEST           '0'B      Apply restrictions to test events
TAU             .019      Equivalent threshold for rank 1 decisions
IRK             2         Number of ranked decisions which are printed
CPXEV           '1'B      Print statistics about satisfied complexes during evaluation
NGE             200       Initial storage for complexes
INDEP           '0'B      Prints independent events if set
TOLERANCE(I)    0.0       Tolerance for the I-th specified test function
N-TAU           0         Number of columns in 'tau' estimation table
TAU-INC         .02       Increment in tau estimation table
ORD             '1'B      Reorder the events in E0, in decreasing order, with regard to the distance from e
Semi-colon (;)            Terminate control parameters

4. Data parameters

Parameter       Description
TITLE           Lines of title (if any)
NSPEC           Number of variables for which type TYPE is specified
TYPE            Type of these variables
VSPEC           Indices of variables of type TYPE
PCT             If NPASS > 1, the percent of events to use in the learning phase for each pass
NL              Number of values for each variable
NE              Number of events in each set
NF              Number of formulas in each set
RESTRICTIONS    If RESTRICT is set. Each pair of rules must be separated with a semi-colon; the entire list is terminated with a period.
EVENTS          Lists of events in either of two forms
FORMULAS        Lists of formulas in either of two forms
Z               If any CRIT(I) = 3, costs of variables terminated with a semi-colon

5. Files

File            Description
TRAN            If TRANS is '1'B, the file of names of classes, variables and variable values. Each name must be in quotes.
TEST            If TEST is '1'B, the file of test events
SAVEF           If SAVE is '1'B, the output file of formulas in bit positioned form (list format)

APPENDIX B

An Example of an Input to and an Output from AQ11

This appendix contains an example of the program input and output which involves most of the features of the program. Figure B-1 gives the input specification for this example. Figure B-2 gives the output which was obtained. The first page of output repeats the input in a slightly extended form. The next pages show the formulas which were generated [in which variables x1, x2, x3, x4 are substituted by their names, as defined in the input (item P in Fig. B-1)] and the results of the evaluation of the formulas on testing events.

Explanation of Figure B-1. The example involves four variables (NV=4; see item B in Figure B-1(a)), which can take 2, 3, 4 and 2 values, respectively (item E). All variables are nominal, except variable x3, which is interval (item D). There are 2 classes (NCL=2; item B), each represented by 6 learning events (items F, J). The last event of set (class) 1 is specified as a DVL1 formula (in the middle of item J). Item H defines the percentage of learning events to be used in each iteration (pass). The restriction on the event space is given by a VL1 decision rule (item I). There are 0 initial hypotheses for class 1 and 2 hypotheses for class 2 (item G). Item K (Fig.
B-1(b)) lists the hypotheses for class 2. The cost of variable 1 for set 1 is specified as 2 (item L). The cost criteria for the selection of complexes (terms) in the synthesis of covers are in the order 1, 2, 3, 9 (1 and 2 by default; 3 and 9 defined by CRIT(3)=3, CRIT(4)=9 in item B). (For the definition of the cost of variables and of the cost criteria see [Larson, Michalski 75].) Evaluation of the formulas to be generated is requested (//TEST DD *) and sets of test events are supplied, 4 events per class (items M, N). A file containing the names of each class (set), each variable and each value of the variable is also supplied (items O, P).

Input for Example

A   // EXEC PGM=ITCN3,REGION=160K,PARM='ISA(23K),REPORT'
    //STEPLIB DD DSN=USER.P0012.ITCN3,UNIT=DISK,DISP=SHR
    //SYSPRINT DD SYSOUT=A
    //PLIDUMP DD SYSOUT=A
    //SAVEF DD SYSOUT=B
    //FT06F001 DD SYSOUT=A
    //TEST DD DSN=JIM,UNIT=DISK,DISP=(NEW,DELETE),SPACE=(TRK,(10,10))
    //SYSIN DD *
B   NPASS=2 NV=4 MAXSTAR=30 CRIT(3)=3 CRIT(4)=9 TITLE=3 NEVE=12 ECHO='ERFZ'
    TEST='1'B TRANS='1'B RESTRICT='1'B NCL=2 INDEP='1'B NGE=100 NCRIT=4 NTAU=4;
C   ************************************************************
    TEST RUN
    ************************************************************
D   1 'I' 3
E   2 3 4 2
F   6 6
G   0 2
H   .5 1
I   (X1=0) (X2=0) (X3=0) -> (X4=0).
J   2 2 1 Oil 2 2 1 FORMULA (X4=0) (X2=1 2) (X1=0) 2 1 1 3 12 111 10 2 1 12 3
A   JCL
B   Control parameters
C   Title
D   NSPEC, VTYPE
E   Number of levels/variable
F   Number of events/pass
G   Number of formulas/set
H   Fraction of events/pass
I   Restriction
J   Event list (6 events/set)

Figure B-1 (a)

K   FORMULA (X1=1) (X2=1 2) (X4=0) (X3=0 1);
    FORMULA (X1=1) (X3=2 3);
L   Z(1,1)=2;
    //TEST DD *
M   4 4
N   10 110 13 11 110 1 12 3 1 13 13 1
    //TRAN DD *
O   'ACCEPT' 'REJECT'
P   'NEW' 'YES' 'NO'                                   (variable x1, values)
    'COLOR' 'RED' 'BLUE' 'ORANGE'                      (variable x2, values)
    'SIZE' 'SMALL' 'MEDIUM' 'LARGE' 'X LARGE'          (variable x3, values)
    'WEIGHT' 'HEAVY' 'LIGHT'                           (variable x4, values)

K   Formulas (2 guesses for set 2)
L   Cost of variables (variable 1 has cost 2 for set 1)
M   Number of test events/set
N   List of test events
O   Name of each set
P   Variable names and variable value names

Figure B-1 (b)

Explanation of Figure B-2. The first part contains an echo of the input (item A). Next, item B prints the formulas obtained after the first iteration (pass), which used 50% of the input events (the first 3 events in both classes; see item H in Fig. B-1). The classes, variables and values of variables are specified by names. Together with each complex (term) a triple of numbers (NEW, IND, COV) is printed (item C), where
   NEW - denotes the number of events covered by the given complex and not covered by the previous complexes on the list of complexes generated for this class,
   IND - denotes the number of events covered only by the given complex,
   COV - denotes the total number of events covered by the given complex.
The program also lists the number of times each cost criterion has been evaluated (item D). Item E gives a symbolic specification of the obtained formulas. Next, an extended confusion matrix is printed (item F) as the result of evaluating the obtained formulas for the testing events (item N in Fig. B-1).
We can see from the matrix that all testing events of the first class ('ACCEPT') have been misclassified, and all the events of the second class ('REJECT') have been correctly classified. Item G specifies the number of times each complex in the cover of each class has been satisfied by testing events in the case of correct decisions (the second complex, C2, of class D2 correctly classified 3 testing events, and the third one, C3, correctly classified 1 testing event). Item H specifies the percentage of correct decisions and the indecision ratio for various values of the parameter TAU (generally, the higher TAU, the greater the number of correct decisions, but also the greater the indecision ratio). Item I lists the formulas obtained in the second iteration (which used all the learning events), and item J the corresponding confusion matrix. We can see that this time 50% of the testing events of class 1 and 100% of class 2 were correctly classified. Items K and L give the same information as items G and H, respectively, but for the formulas obtained in the second iteration.
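The (NEW, IND, COV) triple printed beside each complex can be illustrated with a short sketch. The Python below is not the AQ11 code (which is a PL/1 program); it is a minimal illustration under stated assumptions: a complex is represented as a hypothetical mapping from a variable index to the set of admitted values, events are fully specified tuples (the -1 and -2 codes are not handled), and the names covers and coverage_triples, as well as the toy data, are invented for this example.

```python
def covers(cpx, event):
    """True if every selector of the complex admits the event's value.

    cpx maps a 0-based variable index to the set of admitted values
    (an assumed representation, not the program's internal one).
    """
    return all(event[var] in allowed for var, allowed in cpx.items())


def coverage_triples(complexes, events):
    """Return one (NEW, IND, COV) triple per complex, in list order."""
    # Index sets of the events covered by each complex.
    covered_by = [
        {i for i, e in enumerate(events) if covers(c, e)} for c in complexes
    ]
    triples = []
    seen = set()  # events covered by the complexes earlier in the list
    for k, cov in enumerate(covered_by):
        # Events covered by any other complex of the same cover.
        others = set().union(*(s for j, s in enumerate(covered_by) if j != k))
        new = cov - seen    # NEW: not covered by previous complexes
        ind = cov - others  # IND: covered only by this complex
        triples.append((len(new), len(ind), len(cov)))
        seen |= cov
    return triples


# Hypothetical toy data: four events over four variables.
events = [(0, 0, 0, 1), (0, 1, 0, 0), (0, 2, 2, 1), (0, 1, 2, 0)]
complexes = [
    {0: {0}, 2: {0}},  # (x1=0)(x3=0)
    {0: {0}, 2: {2}},  # (x1=0)(x3=2)
    {1: {1}},          # (x2=1)
]

print(coverage_triples(complexes, events))
# prints [(2, 1, 2), (2, 1, 2), (0, 0, 2)]
```

In this toy cover the third complex adds no new events (NEW = 0) although it covers two (COV = 2), which is the kind of redundancy the triple makes visible.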
***************************************************************************
TEST RUN
***************************************************************************
TR='0'B NPASS=2 NV=4 MAXSTAR=30 TITLE=3 NCL=2 TEST='1'B MODE='IC' STGY=1 INDEP='1'B GEN='1'B ECHO='ERFZ' NGE=100 TAU=1.89999E-02 IRK=2 CPXEV='1'B TRANS='1'B NTAU=4 TAU INC=1.99999E-02 ORD='1'B;
#EX=1 SAVE='0'B PUNY=1.99999E-02 RESTRICT='1'B RTEST='0'B
CRIT LIST
   1 0.00   2 0.00   3 0.00   9 0.00
1 I 3
2 PASSES 0.50 1.00
NUMBER OF LEVELS / VARIABLE   2 3 4 2
NUMBER OF EVENTS / CLASS   6 6
NUMBER OF FORMULAS / CLASS   0 2
RESTRICTIONS ON EVENT SPACE
   (X1=0) (X2=0) (X3=0) -> (X4=0)
LIST OF INPUT EVENTS
   1 2 2 3 2 1 4 1 1 5 2 2 1 (X4=0) (X2=1,2) (X1=0) (X3=1); 7 2 1 1 8 3 9 1 2 10 1 1 1 11 1 2 1 12 1 2 3
INPUT FORMULAS
   (X1=1) (X2=1,2) (X4=0) (X3=0,1);
   (X1=1) (X3=2,3);
COSTS OF VARIABLES WHICH ARE NOT 1:
   Z(1,1)= 2;
TIME FOR INPUT OF DATA 36 CENTISECONDS

                                             NEW IND COV
*****COVER OF ACCEPT*****
CPX 1: (NEW= YES) (SIZE= SMALL)              ( 2 2 2)
CPX 2: (NEW= YES) (SIZE= LARGE)              ( 1 1 1)
*****COVER OF REJECT*****
CPX 1: (SIZE= MEDIUM)                        ( 1 1 1)
CPX 2: (SIZE= X LARGE)                       ( 1 1 1)
CPX 3: (NEW= NO)                             ( 1 1 1)
# TIMES EV.   12 11 4 4
TIME FOR THIS PASS 19 CENTISECONDS

Figure B-2 (a)

FORMULAS FOR CLASS 1
CPX : (X1 = 0) (X3 = 0)
CPX : (X1 = 0) (X3 = 2)
FORMULAS FOR CLASS 2
CPX : (X3 = 1)
CPX : (X3 = 3)
CPX : (X1 = 1)

NUMBER OF EVENTS IN EACH CLASS
ASSIGNED DECISION   CORRECT   # EVENTS/   TIE   UNSP   D 1   D 2   ASSIGN   # RK1 DEC
D 1 ACCEPT   .50 1.00 .50 1.00 .50 1.00 .50 1.00   4/4   4   1.00   0%   100%
D 2 REJECT   .50 1.00 1.00 .50 1.00 .50 1.00   4/4   4   1.00   0%   100%

NUMBER OF CORRECT DECISIONS / COMPLEX
                COMPLEXES
EVENT SETS   C1  C2  C3  C4  C5  C6  C7  C8  C9  C10 C11 C12 C13
    1
    2             3   1

Figure B-2 (b)

TAU ESTIMATION TABLE (% CORRECT / INDECISION RATIO)
                         VALUE OF TAU
CLASS       0.00        0.02        0.04        0.06
  1      0.00/1.00   0.00/1.00   0.00/1.00   0.00/1.00
  2      1.00/1.00   1.00/1.00   1.00/1.00   1.00/1.00
TOTALS   0.50/1.00   0.50/1.00   0.50/1.00   0.50/1.00
FINAL STATISTICS
INDECISION RATIO: 1.00   PERCENT CORRECT: 50.00
TIME TO EVALUATE FORMULAS 35 CENTISECONDS

                                                            NEW IND COV
*****COVER OF ACCEPT*****
CPX 1: (NEW= YES) (SIZE= SMALL MEDIUM) (WEIGHT= HEAVY)      ( 3 2 3)
CPX 2: (NEW= YES) (SIZE= LARGE)                             ( 2 2 2)
CPX 3: (NEW= YES) (SIZE= SMALL)                             ( 1 1 2)
*****COVER OF REJECT*****
CPX 1: (SIZE= MEDIUM) (WEIGHT= LIGHT)                       ( 1 1 1)
CPX 2: (SIZE= X LARGE)                                      ( 2 1 2)
CPX 3: (NEW= NO)                                            ( 3 3 4)
CRIT #        1 2 3 9
# TIMES EV.   8 7 4 3
TIME FOR THIS PASS 20 CENTISECONDS

ASSIGNED DECISION   CORRECT   # EVENTS/   TIE   UNSP   D 1   D 2   ASSIGN   # RK1 DEC
D 1 ACCEPT   4/ 4   1.00   1.00 .50 1.00 .50 .67 1.00 .67 1.00   2 50%   2 50%
D 2 REJECT   4/ 4   1.00   .50 1.00 1.00 .67 1.00 .50 1.00   4   0%   100%

Figure B-2 (c)

NUMBER OF CORRECT DECISIONS / COMPLEX
                COMPLEXES
EVENT SETS   C1  C2  C3  C4  C5  C6  C7  C8  C9  C10 C11 C12 C13
   3 1

TAU ESTIMATION TABLE (% CORRECT / INDECISION RATIO)
                         VALUE OF TAU
CLASS       0.00        0.02        0.04        0.06
  1      0.50/1.00   0.50/1.00   0.50/1.00   0.50/1.00
  2      1.00/1.00   1.00/1.00   1.00/1.00   1.00/1.00
TOTALS   0.75/1.00   0.75/1.00   0.75/1.00   0.75/1.00
FINAL STATISTICS
INDECISION RATIO: 1.00   PERCENT CORRECT: 75.00
TIME TO EVALUATE FORMULAS 28 CENTISECONDS

Figure B-2 (d)

BIBLIOGRAPHIC DATA SHEET

Report No.: UIUCDCS-R-78-867
Title and Subtitle: SELECTION OF MOST REPRESENTATIVE TRAINING EXAMPLES AND INCREMENTAL GENERATION OF VL1 HYPOTHESES: the underlying methodology and the description of programs ESEL and AQ11
Report Date: May 1978
Author(s): R. S. Michalski and J. B. Larson
Performing Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801
Contract/Grant No.: NSF MCS 76-22940
Sponsoring Organization Name and Address: National Science Foundation, Washington, D.C.
16. Abstract

The paper describes the underlying theoretical framework and operational details of two programs, ESEL and AQ11, for computer induction within the framework of the variable-valued logic system VL1 (i.e., a statement calculus which involves variables with an arbitrary number of discrete values [Michalski 1974]):

ESEL - A supporting program for selecting 'most representative' learning and/or testing VL1 events from a large data base of events. The program provides the input to the program AQ11.

AQ11 - A program for incremental generation of VL1 hypotheses, which are generalized and optimized descriptions of input event sets. The program also provides a facility for evaluating the performance of these inferred hypotheses on testing events.

Given a large set of examples describing certain objects or situations, program ESEL selects from them a small subset of the most representative ones. The examples have to be in the form of VL1 events, i.e., in the form of sequences of values of certain discrete variables (or descriptors). In selecting the events, the program distinguishes among three types of descriptors: nominal descriptors, whose value set is an unordered set; linear descriptors, whose value set is a linearly ordered set; and structured descriptors, whose value set is a tree-ordered set.

Events selected by ESEL are input to program AQ11, which generates VL1 hypotheses describing the events. The program can work incrementally, i.e., given a working hypothesis (a set of rules) obtained at some stage, and a set of events, the program can modify the hypothesis to make it consistent with the events.

Program AQ11 also has the facility to test the performance of a given hypothesis on a set of testing events, and to compute an extended confusion matrix.

17. Key Words and Document Analysis. 17a. Descriptors

VL1, variable-valued logic, inductive inference, incremental hypothesis generation, incremental induction, hypothesis modification, rule-based deduction, deductive inference

19. Security Class (This Report): UNCLASSIFIED
20. Security Class (This Page): UNCLASSIFIED