BEBR FACULTY WORKING PAPER NO. 1483
College of Commerce and Business Administration
Bureau of Economic and Business Research
University of Illinois at Urbana-Champaign
August 1988

Learning Model Management Knowledge in Intelligent Decision Support Systems

Michael J. Shaw, Assistant Professor
Department of Business Administration

Abstract

Model management systems are important for handling complicated decision problems in decision support systems (DSS). The current model management systems usually automate the model manipulation tasks through deductive inference mechanisms with some inherent weaknesses.
Aiming at overcoming these weaknesses, we present a new framework for model management systems that performs model manipulation more effectively. The new approach incorporates machine learning to acquire model manipulation knowledge, stored in the form of schemata, and to refine these acquired schemata. In addition, we address the issue of learning model selection heuristics, making the selection adaptive to the characteristics demonstrated by the problems or the users of the DSS environment.

Keywords: Model Base Management, Machine Learning, Intelligent Decision Support, Artificial Intelligence

1. Introduction

Due to the operational characteristics of decision support systems (DSSs), the solution process usually involves transforming data in various ways through a diverse collection of program modules, i.e., models. It is therefore necessary to have not only a comprehensive collection of such models (i.e., a model bank), but also suitable mechanisms for using these models effectively. Thus, an effective model management subsystem is essential for solving problems and handling queries in DSSs.

This paper applies machine learning methods to two important aspects of model management: model representation and model manipulation.

Model representation concerns representing each model with its input and output conditions (Elam and Henderson [1983]; Dolk and Konsynski [1984]; Applegate et al. [1986]; Fedorowicz and Williams [1986]). The representational approaches employed to date include predicate calculus (Bonczek et al. [1981, 1983]), semantic networks (Elam et al. [1980]), frames (Dolk and Konsynski [1984]), and relational database theory (Blanning [1986]).
All these systems basically treat models as a form of data transformation, so that the user can easily query the system without the burden of programming details, and so that the model management subsystem can be easily integrated into the decision support system (Geoffrion [1987]). Using concepts developed in machine learning, we will use schemata to represent the synthesis of multiple model applications.

Model manipulation, on the other hand, involves selecting, retrieving, and activating models to solve problems (Blanning [1986]; Dutta and Basu [1984]). That is, while individual models are used to perform stand-alone computation (e.g., time-series analysis, simulation, regression analysis), they often need to be combined with one another into a sequence of steps in order to reach the solution. Such a process requires dynamically selecting the necessary models, imposing an appropriate sequence of model applications, and determining the desirability of each model for different decision problems. These tasks constitute model manipulation.

Prior research in model management has attempted to design model management systems capable of model manipulation in response to different problems (Bonczek et al. [1983]; Dutta and Basu [1984]; Blanning [1986]; Dolk and Konsynski [1984]). But these systems show several weaknesses: (1) the performance of the DSS relies heavily on a predetermined collection of problem-solving methods acquired from domain experts; (2) similar problems are solved individually and independently; and (3) past problem-solving experiences are ignored in solving subsequent problems. These systems ignore the fact that problem-solving skills and modeling knowledge provided by human experts may not be complete initially, and that even a commonly used solution process may change over time.

This paper presents a new framework for model management.
Machine learning, an emerging technique in artificial intelligence (AI), is applied to incorporate an element of adaptiveness in the DSS. Recognized as the essential feature of any intelligent system, learning processes include the acquisition of new declarative knowledge, the development of problem-solving skills through instruction or practice, the organization of new knowledge into general, effective representations, and the discovery of new facts and theories through observation and experimentation. Machine learning is concerned with the computer modeling of these learning processes. We will describe the application of machine learning to model management, resulting in a learning-augmented system which not only can perform problem solving intelligently, but also can accumulate prior problem-solving knowledge and refine or modify its knowledge continuously.

In addition to the generation and modification of model manipulation knowledge, the issue of using heuristics to select models adaptively will also be addressed. The method can be used to create heuristics learned from prior experiences of model selection among alternatives. We shall discuss machine learning techniques which can incrementally modify model representation by experimenting with observations; the heuristics can be created by dynamically refining an evaluation function.

The remainder of this paper is organized as follows: Section 2 presents the learning-augmented framework for intelligent DSSs, adding a learning component to the operational design of DSSs; Section 3 discusses the learning of model-manipulation schemata; Section 4 describes a learning-by-experimentation method for refining model manipulation schemata; Section 5 applies a learning method for generating model-selection heuristics adaptively; finally, Section 6 summarizes the characteristics of applying machine learning to model management.

2. A New DSS Framework
Incorporating Machine Learning

Machine learning methods can be categorized into the following areas based on their behavioral characteristics: rote learning (Samuel [1968]), learning from instruction (Davis [1979]), learning by induction (Buchanan and Mitchell [1978]; Dietterich and Michalski [1983]), learning by analogy (Winston [1979]; Carbonell [1983]), learning by competition (Holland [1986]), and learning from observation and discovery (DeJong [1986]; Langley [1981]; Lenat [1983]).

A basic machine learning model is summarized in Figure 2.1, where the learning system consists of four elements: Environment, Learning Element, Knowledge Base, and Performance Element. The Learning Element takes its input from the Environment, in the form of observations, or from the Performance Element, in the form of performance results. The learning process results in either new knowledge for the Knowledge Base or modifications to the existing knowledge. We shall adapt this basic model to the intelligent DSS setting, where the input from the Environment is collected from the firm's database, and the Performance Element corresponds to the rule-based problem solver of the DSS.

Insert Figure 2.1 Here

The Knowledge Base in the DSS setting contains: (1) procedural knowledge, (2) decision heuristics, and (3) model-manipulation knowledge. Procedural knowledge is the knowledge about the essential steps, mostly related to information collection, for making a given decision. The decision heuristics are rules of thumb used by domain experts. Because of their inherently judgmental nature, rules of this type need considerably more effort to obtain and refine. The rules generated by inductive learning belong to this category. The third type of rules is used to represent the model knowledge available for decision support; these rules indicate the application requirements of each model and the relations between models. Some examples of rules of this type are shown in Appendix A.
Most existing DSSs use knowledge engineering for knowledge acquisition; they take the domain knowledge from experienced decision makers in the field and transform the knowledge into the representation form used in the knowledge base of the DSS (Elam and Henderson [1980]). This is shown as process (a) in Figure 2.2. There are two types of rule learning: (1) learning from an example set, in which decision rules are derived from a given set of positive and negative examples (shown as process (b) in Figure 2.2); and (2) rule modification, in which the rules in the knowledge base are modified to improve the performance of the DSS (shown as process (c) in Figure 2.2). Learning from examples can be achieved by inductive inference (Rendell [1986]; Michalski [1983]). Rule refinement, on the other hand, can be achieved by comparing the resulting solution path (i.e., the performance trace) with the correct path (i.e., the ideal trace). Bundy [1985] reviewed several methods for rule refinement and compared their performances. Learning model management knowledge involves both aspects of learning.

Insert Figure 2.2 Here

Our approach incorporates four interactive functional components (the Instance Selector, Problem-Solver, Critic, and Learning Module) to integrate the learning function. The Instance Selector either accepts the training instances supplied externally or generates new training examples by itself in response to the previous learning process. The Problem-Solver produces solutions to the new problems supplied by the Instance Selector, either by applying existing problem-solving heuristics or by utilizing the inference mechanism. The resulting solution path for each new problem is then evaluated by the Critic, which compares the solution just produced with the desired solution. Based on the observations made by the Critic, the Learning Module either refines existing rules or hypothesizes new rules.
This learning-augmented DSS configuration for knowledge refinement is shown in Figure 2.3.

Insert Figure 2.3 Here

The major applications of machine learning to model management are in three aspects: (1) the acquisition of model manipulation knowledge, (2) the refinement of model manipulation knowledge, and (3) the creation of model selection heuristics.

3. The Acquisition of Model Manipulation Knowledge

We use the term model manipulation schemata to represent the knowledge generated from past problem-solving tasks. Every schema contains a condition part, which describes a class of problems, and a solution part, which displays the shared solution plan for every problem in this class. The solution plan can be represented in an AND/OR tree structure, which we call the solution tree. An OR subtree in the solution tree denotes all possible alternative solution paths, and an AND subtree indicates the input requirements for a model or a set of subproblems for a decomposable problem. The subproblems can be simple data retrievals or model executions. It should be noted that several models may generate similar solutions to a problem; together these constitute an OR tree for the problem. Each subtree converging to an OR node is an alternative solution plan for the problem. Nodes at the bottom of the solution tree are either solvable terminal nodes, which are subproblems solved by data retrieval or user input, or unsolvable terminal nodes, which are subproblems that cannot be solved by data retrieval, user input, or models. A solution tree with at least one solution path whose terminal nodes are all solvable is complete, since it can provide a solution plan for the problem. Otherwise, it is incomplete, since it cannot provide any solution plan for the problem. The solution of a stored model manipulation schema is applicable only if a new problem matches the condition part of this schema.
The application of a model manipulation schema in the form of an AND/OR tree is shown in Figure 3.1.

We use the schemata as problem-solving concepts. That is, useful schemata will be those that organize operators to achieve an important goal, or a set of goals, in a general way. As the model management system usually deals with executing two or more models in an appropriate sequence, model manipulation is a multiple-step process, in which each step involves either a database retrieval or a model application. As opposed to searching for the individual steps, a learned model-manipulation schema integrates the entire multiple-step process into a single module; such a schema can be applied either as a whole or just in part, depending on the problem to be solved (Figure 3.2).

The learning of model manipulation schemata can be characterized as "learning of multiple-step tasks," which is also used in Fikes [1972], Korf [1982], and DeJong [1986]. Moreover, the concept of model manipulation schemata is similar to the "macro-operators" used in Fikes [1972] and Korf [1982] for representing the sequence of actions learned. The macro-operators help reduce the amount of search required on the same type of problems, because they are stored in a generalized form that allows them to be applied to similar situations (Fikes [1972]). The learning procedure using the learning components in Figure 2.2 for acquiring model manipulation knowledge is depicted in Figure 3.3. DeJong [1986] employed "schemata" to achieve the same purpose, and he also addressed an alternative machine learning approach called explanation-based learning (EBL). EBL is characterized by its use of a structured set of domain knowledge and a generalization process based on a single example (Mitchell et al. [1986]).
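The completeness condition on solution trees described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the system's implementation: the Node class, its field names, and the forecasting example (including which data items are retrievable) are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of an AND/OR solution tree (names are hypothetical)."""
    name: str
    kind: str = "LEAF"      # "AND", "OR", or "LEAF"
    solvable: bool = False  # used only by LEAF (terminal) nodes
    children: List["Node"] = field(default_factory=list)


def is_complete(node: Node) -> bool:
    """Return True if the subtree offers at least one fully solvable plan."""
    if node.kind == "LEAF":
        # Terminal node: solvable by data retrieval or user input, or not.
        return node.solvable
    results = [is_complete(child) for child in node.children]
    if node.kind == "AND":
        # Every input requirement or subproblem must be satisfied.
        return all(results)
    if node.kind == "OR":
        # One alternative solution path is enough.
        return any(results)
    raise ValueError(f"unknown node kind: {node.kind}")


# Hypothetical forecasting problem with two alternative models.
regress = Node("REGRESS", "AND", children=[
    Node("past sales", solvable=True),
    Node("advertising budget", solvable=False),  # assumed not retrievable
])
average = Node("AVG", "AND", children=[
    Node("sales, year - 1", solvable=True),
    Node("sales, year - 2", solvable=True),
    Node("sales, year - 3", solvable=True),
])
forecast = Node("forecast sales", "OR", children=[regress, average])

print(is_complete(regress))   # the REGRESS branch alone is incomplete
print(is_complete(forecast))  # the OR node is complete through AVG
```

In this sketch the tree for the forecasting problem is complete because the AVG alternative has only solvable terminal nodes, even though the REGRESS alternative contains an unsolvable one.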
Insert Figures 3.1, 3.2 and 3.3 Here

In addition to automatically acquiring model manipulation schemata, machine learning techniques also enable the model management system to refine these schemata through an iterative experimentation process. We shall elaborate on this experimentation process next.

4. Refinement of Model Manipulation Knowledge

The model manipulation schema is derived by generalizing a specific problem instance and its solution plan. However, such a generalization may cover more than it is supposed to. To increase the accuracy of the initially learned schema, the model management system needs to modify the schema through a training process which uses a collection of self-created or teacher-provided training examples. Since the system chooses and manipulates the training instances (through the Instance Selector) in order to verify the hypotheses about the concept, this process is sometimes referred to as learning by experimentation (Mitchell, Utgoff, and Banerji [1983]). The series of experiments with training instances helps the learning process converge to the correct concept description.

As defined in the preceding section, a schema consists of two parts: a condition part describing a class of problems to which this schema is applicable, and a solution part which displays the shared solution plan for every problem in this class. An experiment with a training instance provides a positive or a negative example for the current schema. A positive example has a complete instantiated solution plan based on the schema. A negative example, on the other hand, is a problem instance which does not belong to the class under consideration. After the model manipulation schema is acquired by generalizing the derived solution plans for the given problem, the current, possibly over-generalized, schema is refined by an iterative process that generalizes or constrains its problem description against the training examples. This refinement process can be summarized by the
This refinement process can be summarized by the -10- following two operations: If the current problem expression of the schema does not cover the encountered positive example, then it needs to be generalized. If the current problem description of the schema covers the encountered negative example, then it needs to be con- strained. This refinement process can be facilitated by organizing the possible problem descriptions in a "version space" (Mitchell [1982]). Essentially, we are treating the refinement process as a search process for concept learning — in this case, the concept to be learned is the correct problem description for the schema. The search is con- ducted in the space of all possible versions of the descriptions, referred as the version space. The version space basically provides a generality/specificity structure for guiding the refinement process. Given a version space and a description in the version space, the Learning Module should be able to find the more generalized version of the description, the more specific version of the description, or descriptions belonging to the same level of generalization. The approach described in Mitchell [1982] takes advantages of the general-to-specific ordering of descriptions in the version space. Mitchell argues that a version space can be represented by two sets of descriptions, S and G, where S is the set of the most specific descriptions consistent with the observed instances, and G is the most general descriptions consistent with the observed instances. To refine a model manipulation schema, we first create a version space for the problem descriptions to which this schema is applicable. This version space is represented by the G and S sets. Initially, G -11- and S are defined by the first training example t : G is the maximal Generalization of t and S is defined to be t . The set of training ° o o examples are then input sequentially to shrink, the version space. 
For each example, if the training example is a positive instance, then (1) generalize S, as little as possible, in order to cover this positive example and (2) remove from G all concept descriptions that do not cover the example; on the other hand, if the training example is negative, then (1) remove from S the parts which cover the example and (2) make G more specific, as little as possible, so that its elements do not cover the example. Thus, in each step G is constrained to avoid covering the negative examples and S is expanded to cover the positive examples encountered. G and S will eventually become equal as more and more training examples are considered. When they finally converge, the proper problem description for the schema is found. The learning procedure for the refinement of model manipulation knowledge is depicted in Figure 4.1.

Insert Figure 4.1 Here

As shown in Figure 4.1, it is sometimes necessary for the Instance Selector to generate training examples to expedite the schema refinement process. The main idea is to make some slight changes to a prior training example and see if the changes result in a different classification for the new example. By considering the new training example in the Learning Module, the S and G sets move closer to each other. This convergence of S and G can be further facilitated if the new training examples selected represent concepts closely related to some prior training examples. Mitchell et al. [1983] give a more detailed account of how new training examples can be generated by the Instance Selector to facilitate the learning process.

In many learning situations, it is possible to have "near miss examples," i.e., instances which are very close to being positive (Winston [1979]).
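The G and S updates described above (before near misses are taken into account) can be illustrated with a small Python sketch. It makes simplifying assumptions that are not in the paper: problem descriptions are fixed-length attribute tuples, "?" matches any value, and S is kept as a single description rather than a set. The attribute values in the example are hypothetical.

```python
ANY = "?"  # wildcard: matches any attribute value


def covers(description, example):
    """True if every attribute of the description matches the example."""
    return all(d == ANY or d == e for d, e in zip(description, example))


def generalize(s, example):
    """Minimal generalization of s that also covers a positive example."""
    return tuple(a if a == b else ANY for a, b in zip(s, example))


def specialize(g, s, example):
    """Minimal specializations of g that exclude a negative example
    while remaining at least as general as s."""
    return [g[:i] + (s[i],) + g[i + 1:]
            for i in range(len(g))
            if g[i] == ANY and s[i] != ANY and s[i] != example[i]]


def refine(first_positive, examples):
    """Shrink the version space; returns the boundaries (S, G)."""
    S = tuple(first_positive)       # S starts as the first example t_0
    G = [tuple(ANY for _ in S)]     # G starts as the maximal generalization
    for example, positive in examples:
        if positive:
            S = generalize(S, example)
            G = [g for g in G if covers(g, example)]
        else:
            if covers(S, example):
                raise ValueError("negative example is covered by S")
            new_G = []
            for g in G:
                candidates = specialize(g, S, example) if covers(g, example) else [g]
                for c in candidates:
                    if c not in new_G:
                        new_G.append(c)
            G = new_G
    return S, G


# Hypothetical problem descriptions: (task, horizon, data type).
t0 = ("forecast", "annual", "time-series")
training = [
    (("forecast", "quarterly", "time-series"), True),
    (("optimize", "annual", "time-series"), False),
    (("forecast", "annual", "cross-section"), False),
]
S, G = refine(t0, training)
print(S)
print(G)
```

With these three training examples the two boundaries meet at the description ("forecast", "?", "time-series"), which is the converged condition part in this toy setting. A fuller treatment would keep S as a set and prune redundant members of G.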
In refining model manipulation schemata, for example, the near miss examples can be defined as the problem descriptions which, although not directly solvable by the schema, need only minor modifications for the schema to be applicable (e.g., one unsolvable node in the solution tree when the schema is applied). The Learning Module can decide to modify the schema by finding the solution for the unsolvable node, so that it becomes applicable to this near miss example. The G and S sets are updated by treating the near miss example as a positive instance for the modified schema. The refinement process continues until G and S converge. An example of schema refinement using positive, negative, and near-miss examples is described in Appendix B.

5. Learning Model Selection Heuristics

When there is more than one way to solve a given problem (e.g., models such as regression, moving average, exponential smoothing, and Delphi models can all solve a forecasting problem), the model management system usually either lets the user select the best model or chooses among these alternative models based on a heuristic function. This heuristic function is chosen based on past performances or human experts' experiences and is usually in the form of a polynomial of several important factors: E = Σ_i w_i * f_i, where w_i is the weight given to f_i. For example, the f_i's used in a heuristic function for scoring the performances of several forecasting models could be the accuracy/error, the operating cost, the operating time, and the difficulty of collecting data for each model, where each f_i is characterized on a numeric scale.

The coefficients of the heuristic function may be affected by the preference of the users and by the characteristics of the problems. A marketing manager may think that the past accuracy of a forecasting model should dominate other criteria, but an MIS manager may give higher preference to computational efficiency.
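The following Python sketch illustrates how the same heuristic form, E = Σ_i w_i * f_i, can rank models differently for different users. All feature scores and weights are invented for illustration, and each factor is assumed to be rescaled so that a higher score is better (for instance, "low_cost" stands in for operating cost).

```python
def heuristic(features, weights):
    """E = sum of w_i * f_i over the factors named in weights."""
    return sum(weights[name] * features[name] for name in weights)


def choose_model(models, weights):
    """Return the name of the model with the highest heuristic value."""
    return max(models, key=lambda name: heuristic(models[name], weights))


# Hypothetical factor scores on a 0-10 scale (higher is better).
models = {
    "regression":            {"accuracy": 9, "low_cost": 4, "speed": 3, "data_ease": 4},
    "moving_average":        {"accuracy": 5, "low_cost": 8, "speed": 9, "data_ease": 9},
    "exponential_smoothing": {"accuracy": 7, "low_cost": 7, "speed": 8, "data_ease": 8},
}

# Hypothetical preference patterns for two kinds of users.
marketing_weights = {"accuracy": 10, "low_cost": 1, "speed": 1, "data_ease": 1}
mis_weights = {"accuracy": 1, "low_cost": 1, "speed": 5, "data_ease": 1}

print(choose_model(models, marketing_weights))  # accuracy dominates
print(choose_model(models, mis_weights))        # computational efficiency dominates
```

Under these assumed numbers the accuracy-oriented weights select the regression model, while the efficiency-oriented weights select the moving average model.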
Therefore, different users may assign different heuristic functions to the models. Hence, the model management system should be able to adjust the coefficients of the heuristic function according to the "preference patterns" manifested by the users. To that end, an inductive learning method is needed to derive the heuristic function for each user, based on observations of that user's selection behavior. Since the objective is to make model selection adaptive to the user's preference, we adopted the inductive learning method articulated in Rendell's probabilistic learning system (PLS) (Rendell [1983, 1986]). The heuristic function in this learning method corresponds to the utility function defined on a feature space. The feature space consists of a set of rectangular regions, each of which contains instances of a single concept (i.e., class). Thus, a region R in the feature space can be defined as R = (r, u, e), where r is a rectangular region in the feature space; u is the utility function value, estimated by the ratio of the positive instances to the total observed instances in this region; and e is the error rate allowed in this region. The utility indicates the probability that an instance in the region is a positive instance; e is used to represent the system's confidence in its judgement of the instances contained inside a region.

This PLS framework can be applied to generate the heuristics for model selection. For example, suppose that the models are ranked on two performance criteria: (1) quality of solution and (2) computational complexity, which are treated as the two dimensions of the feature space. Each region in the feature space then contains those model instances on the same utility level, described by the combination of solution quality and time complexity.
For a given model, the corresponding utility, which represents the value for the model selection heuristics, can be determined by mapping the model into the feature space. This approach progressively refines the utility assigned to each region by splitting a region into smaller regions. In addition, unlike some of the other learning systems (e.g., Michalski [1983]), this learning method can effectively handle noisy data in the training set.

For example, in Figure 5.1a, the three problems ("the sales next year," "the inventory three years later," and "the interest expense next year") all face the decision of choosing the best forecasting model. For the sake of simplicity, we use the quality of solution and computational complexity as two criteria for evaluating alternative models. In the set of training examples, the chosen model for each problem is treated as a positive instance, and the rest are treated as negative instances.

Insert Figure 5.1 Here

Using each model's solution quality and computational complexity, the Learning Module can locate the models in the feature space (Figure 5.1b). To determine the heuristic value, the Learning Module further divides the feature space into several classes using the following procedure. Initially, it arbitrarily splits the feature space into two regions and calculates the success probability, u, in each of these regions. In Figure 5.1b, an arbitrary splitting generates the regions r_1 and r_2, which initially have the utilities (probabilities) u_1 = 2/7 and u_2 = 1/5, respectively; they are estimated by the ratio of positive instances to total instances in each region. Each region is then refined by a further splitting, where the best splitting is the one resulting in the largest dissimilarity d among all possible partitions of the region.
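As a small illustration of how the region utilities are estimated, the Python sketch below splits a set of model instances along one feature and computes u as the ratio of positive instances to total instances in each region. The twelve instances and the split point are made up; they are chosen only so that the two regions reproduce the utilities 2/7 and 1/5 quoted above.

```python
from fractions import Fraction


def utility(instances):
    """u = positive instances / total instances in a region."""
    positives = sum(1 for _, _, chosen in instances if chosen)
    return Fraction(positives, len(instances))


def split(instances, axis, threshold):
    """Split a region into two along one feature axis at a threshold."""
    low, high = [], []
    for instance in instances:
        (high if instance[axis] >= threshold else low).append(instance)
    return low, high


# Hypothetical instances: (solution quality, computational complexity, chosen?)
instances = [
    (1, 2, False), (2, 6, True), (2, 8, False), (3, 3, False),
    (3, 7, True), (4, 1, False), (4, 5, False),
    (6, 4, False), (7, 8, False), (8, 2, True), (8, 6, False), (9, 9, False),
]

# An arbitrary first split on solution quality (axis 0) at 5.
r1, r2 = split(instances, axis=0, threshold=5)
u1, u2 = utility(r1), utility(r2)
print(u1, u2)
```

Each candidate split of a region can be scored in the same way, so that the Learning Module can compare partitions before refining the regions further.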
Rendell defined the dissimilarity measure, d, for each splitting as d = |log u_1 - log u_2| - log(e_1/e_2), where u_1, u_2 and e_1, e_2 are the utilities and error rates for the two regions after the splitting. This splitting process is repeated until d ≤ 0 for every region, shown as Figure 5.1c in the example. Every region can then define a utility class, in which the models are of the same preference level. This inductive learning process can be applied to the training examples collected from the individual users; the utility classification derived from a set of training examples reflects the preference of that user and can be used as the heuristic for model selection.

6. Conclusions

In this paper, we have presented a learning-augmented approach to the design of the model base management subsystem of DSSs by adding a Learning and Knowledge Acquisition Unit. The Learning and Knowledge Acquisition Unit can acquire decision rules through an inductive learning engine; it can also refine the rules or derive decision schemata by four functional components: the Instance Selector, the Problem-Solver, the Critic, and the Learning Module. This learning-augmented methodology provides a unified framework for supporting such important model management operations as rule learning and refinement, improving model manipulation, and deriving heuristics for model selection.

References

Applegate, L. M., Konsynski, B. R., and Nunamaker, J. F., 1986, "Model Management Systems: Design for Decision Support," Decision Support Systems, Vol. 2, pp. 81-91.

Blanning, R., 1986, "An Entity-Relationship Approach to Model Management," Decision Support Systems, Vol. 2, No. 1, pp. 65-72.

Bonczek, R. H., Holsapple, C. W., and Whinston, A. B., 1980, "Future Directions for Developing Decision Support Systems," Decision Sciences, Vol. 11, pp. 616-631.

Bonczek, R. H., Holsapple, C. W., and Whinston, A. B.
, 1981, "Representing Modeling Knowledge with First Order Predicate Calculus," Operations Research.

Bonczek, R. H., Holsapple, C. W., and Whinston, A. B., 1983, "Specification of Modeling and Knowledge in Decision Support Systems," in H. G. Sol (ed.), Processes and Tools for Decision Support, (North Holland: Amsterdam).

Buchanan, B. G. and Mitchell, T. M., 1978, "Model-Directed Learning of Production Rules," in Waterman, D. and Hayes-Roth, F. (eds.), Pattern-Directed Inference Systems, (New York: Academic Press).

Bundy, A., Silver, B., and Plummer, D., 1985, "An Analytical Comparison of Some Rule-Learning Programs," Artificial Intelligence, Vol. 27, pp. 137-181.

Carbonell, J. G., 1983, "Learning by Analogy: Formulating and Generalizing Plans from Past Experiences," in Machine Learning, R. S. Michalski, J. G. Carbonell, and T. M. Mitchell (eds.), Tioga.

Davis, R., 1979, "Interactive Transfer of Expertise: Acquisition of New Inference Rules," Artificial Intelligence, Vol. 12, pp. 121-157.

DeJong, G., 1986, "An Approach to Learning from Observation," in Machine Learning (Vol. II), Michalski et al. (eds.), (Morgan Kaufmann, Los Altos, CA).

Dietterich, T. G. and Michalski, R. S., 1983, "A Comparative Review of Selected Methods for Learning from Examples," in Machine Learning: An AI Approach, R. S. Michalski, J. G. Carbonell, and T. M. Mitchell (eds.), Tioga.

Dolk, D. and Konsynski, B., 1984, "Knowledge Representation for Model Management Systems," IEEE Trans. on Software Engineering, Vol. SE-10, No. 4, pp. 619-627.

Dutta, A. and Basu, A., 1984, "An Artificial Intelligence Approach to Model Management in Decision Support Systems," IEEE Computer, Vol. 17, No. 9, pp. 89-98.

Elam, J. J. and Henderson, J. C., 1980, "Knowledge Engineering Concepts for Decision Support System Design and Implementation," Proceedings of Fourteenth Annual Hawaii International Conference on System Sciences, (Western Periodicals: North Hollywood, CA).

Elam, J. and Konsynski, B.
, 1987, "Using Artificial Intelligence Techniques to Enhance the Capabilities of Model Management Systems," Decision Sciences, Vol. 18, pp. 487-502.

Fedorowicz, J. and Williams, G. B., 1986, "Representing Modeling Knowledge in an Intelligent Decision Support System," Decision Support Systems, Vol. 2, pp. 3-14.

Fikes, R., Hart, P., and Nilsson, N., 1972, "Learning and Executing Generalized Robot Plans," Artificial Intelligence, Vol. 3, No. 4, pp. 251-288.

Geoffrion, A. M., 1987, "An Introduction to Structured Modeling," Management Science, Vol. 33, No. 5, pp. 547-589.

Henderson, J., 1987, "Finding Synergy Between Decision Support Systems and Expert Systems Research," Decision Sciences, Vol. 18, pp. 333-349.

Holland, J., 1986, "Escaping Brittleness: The Possibility of General-Purpose Learning Algorithms Applied to Parallel Rule-Based Systems," in Machine Learning: An AI Approach, Michalski, Carbonell, and Mitchell (eds.), Tioga Pub. Co., Palo Alto, CA.

Konsynski, B. and Sprague, R. H., Jr., 1986, "Future Research Directions in Model Management," Decision Support Systems, Vol. 2, pp. 103-109.

Korf, R., 1985, Learning to Solve Problems by Searching for Macro-operators, (Pitman, Marshfield, Mass.).

Langley, P., 1981, "Data-Driven Discovery of Physical Laws," Cognitive Science, Vol. 5, pp. 31-54.

Langley, P., 1984, "Learning to Search: From Weak Methods to Domain-Specific Heuristics," CMU-RI-TR-84-21, (The Robotics Institute, Carnegie-Mellon University, Pittsburgh, PA); also appeared in Cognitive Science (1986).

Lenat, D. B., 1983, "EURISKO: A Program that Learns New Heuristics and Domain Concepts; The Nature of Heuristics III: Program Design and Results," Artificial Intelligence, Vol. 21, pp. 61-98.

Michalski, R. S., 1980, "Pattern Recognition as Rule-Guided Inductive Inference," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-2, No. 2, pp. 249-361.

Michalski, R. S.
, 1983, "A Theory and Methodology of Inductive Learning," in Michalski, R., Carbonell, J., and Mitchell, T. (eds.), Machine Learning, Tioga Publishing Co., Palo Alto, CA.
Mitchell, T., 1982, "Generalization As Search," Artificial Intelligence, Vol. 18, pp. 203-226.
Mitchell, T., Utgoff, P. E. and Banerji, R., 1983, "Learning by Experimentation: Acquiring and Refining Problem-Solving Heuristics," in Machine Learning: An Artificial Intelligence Approach, R. S. Michalski, J. G. Carbonell, and T. M. Mitchell (eds.), Tioga Publishing Co., Palo Alto, CA.
Mitchell, T., Keller, R., and Kedar-Cabelli, S., 1986, "Explanation-based Generalization: A Unifying View," Machine Learning, 1, pp. 47-80.
Rendell, L., 1983, "A New Basis for State-Space Learning Systems and a Successful Implementation," Artificial Intelligence, Vol. 20, pp. 369-92.
Rendell, L., 1986, "A General Framework for Induction and a Study of Selective Induction," Machine Learning, Vol. 1, No. 2, pp. 177-226.
Samuel, A. L., 1967, "Some Studies in Machine Learning Using the Game of Checkers II - Recent Progress," IBM J. of Research and Development, Vol. 11, No. 6, pp. 609-617.
Sprague, R. H. and Carlson, E. D., 1982, Building Effective Decision Support Systems, (Prentice-Hall Inc.: Englewood Cliffs, NJ).
Stohr, E. A., 1986, "Decision Support and Knowledge-based System: A Special Issue," Journal of Management Information Systems, Vol. II, No. 4.
Winston, P., 1979, "Learning and Reasoning By Analogy," Communications of ACM, Vol. 23, No. 12, pp. 689-703.

Appendix A

Production Rules for Applying Models in Loan Evaluation

In a DSS, production rules can be used to represent model knowledge. The application of each model is directed by an if-then rule and interpreted as "if the input requirements are satisfied and the model thus becomes executable, then the output value is ..."
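The if-then reading of a model rule can be illustrated with a minimal Python sketch. It is an illustration only, not the DSS implementation: the fact layout, the function names `ratio_model` and `apply_ratio_rule`, and the numeric values are all assumptions introduced here, and a simple ratio model stands in for any model in the model base.

```python
from typing import Dict, Optional, Tuple

# Working memory: (variable, year, firm) -> value. This layout is hypothetical.
Facts = Dict[Tuple[str, int, str], float]


def ratio_model(x1: float, x2: float) -> float:
    """Stand-in for a ratio model: returns x1 / x2."""
    return x1 / x2


def apply_ratio_rule(facts: Facts, var1: str, var2: str,
                     yr: int, fn: str) -> Optional[float]:
    """Fire the rule only when both input requirements are satisfied."""
    k1, k2 = (var1, yr, fn), (var2, yr, fn)
    if k1 not in facts or k2 not in facts:
        return None  # inputs missing: the model is not executable
    x3 = ratio_model(facts[k1], facts[k2])
    # The conclusion of the rule is asserted as a new fact.
    facts[(f"ratio({var1},{var2})", yr, fn)] = x3
    return x3


# Made-up values for one firm and year, for illustration only.
facts: Facts = {("A/R", 1986, "ABC"): 200.0, ("inv", 1986, "ABC"): 400.0}
result = apply_ratio_rule(facts, "A/R", "inv", 1986, "ABC")
missing = apply_ratio_rule(facts, "A/R", "sales", 1986, "ABC")
print(result, missing)  # 0.5 None
```

The second call returns `None` because no value of `sales` is available, so the premise of the rule is not met and nothing is added to the fact base.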
In the model predicates, we use upper case to specify the model, underlines to represent the input values, and the rest to represent the output values. Some of the rules directing model applications in a loan-evaluation DSS are listed here. Machine learning techniques can be used to learn additional rules or to refine existing rules.

1 (var1, ?x1, yr, fn)&(var2, ?x2, yr, fn)&(REGRESS, ?x1, ?x2, ?x3, ?x4, yr, fn) => (β, var1, var2, ?x3, yr, fn)&(R², var1, var2, ?x4, yr, fn)&(var1, ?x3, yr2, fn)
With the input values, ?x1 and ?x2, of var1 and var2 in a given year for a particular firm, the REGRESS model outputs values, ?x3 and ?x4, of β and R² between the two input variables.

2 (var1, ?x1, yr, fn)&(var2, ?x2, yr, fn)&(RATIO, ?x1, ?x2, ?x3, yr, fn) => (ratio, var1, var2, ?x3, yr, fn)
Using the input values, ?x1 and ?x2, of var1 and var2 in a given year for a given firm, the RATIO model calculates the value of their ratio, ?x3.

3 (var, ?x1, yr, fn)&(var, ?x2, (-, yr, 1), fn)&(var, ?x3, (-, yr, 2), fn)&(AVG, ?x1, ?x2, ?x3, ?x4, yr, fn) => (avg, var, ?x4, yr, fn)
Using the input values, ?x1, ?x2, and ?x3, of var from three consecutive years, the AVERAGE model calculates their average value, ?x4.

4 (var, ?x1, yr, fn)&(industry-type, ?x2, yr, fn)&(PERCENTILE, ?x1, ?x2, ?x3, yr, fn) => (percentile, var, ?x3, yr, fn, ?x2)
Using the value of var and the industry type of this firm, the PERCENTILE model calculates the percentile value of var in its industry.

5 (var, ?x1, yr, fn)&(industry-type, ?x2, yr, fn)&(MEDIAN, ?x1, ?x2, ?x3, yr, fn) => (median, var, ?x3, yr, fn, ?x2)
Using the value of var and the industry type of this firm, the MEDIAN model calculates the median value of var in its industry.

6 (var, ?x1, yr, fn)&(tax-type, ?x2, yr, fn)&(TAX, ?x1, ?x2, ?x3, yr, fn) => (after-tax, var, ?x3, yr, fn)
Using the value of var and the tax-type of this firm, the TAX model calculates the after-tax value of var.

7 (var,
?x1, yr, fn)&(var, ?x2, (-, yr, 1), fn)&(var, ?x3, (-, yr, 2), fn)&(TREND, ?x1, ?x2, ?x3, ?x4, yr, fn) => (trend, var, ?x4, yr, fn)
Using the values of var from three consecutive years, the TREND model calculates the trend of var.

8 (ratio, (+, long-term-debt, curr-liab, ?x1, yr, fn), total-assets, ?x2, yr, fn)&(ratio, funds-from-op, (+, interest, (avg, debt-maturity, ?x4, yr, fn)&(trend, sales, ?x5, yr, fn)&(RISK-SCORE, ?x1, ?x2, ?x3, ?x4, ?x5, ?x6, yr, fn) => (risk-score, ?x6, yr, fn)
Using the (long-term-debt + current-liabilities) to total-assets ratio and the funds-from-operation to (interest + the-average-debt-maturity) ratio, the RISK-SCORE model calculates the risk score of this firm.

9 (interest-income, ?x1, yr, fn)&(cost-of-handling-deposit, ?x2, yr, fn)&(avg, loan-volume, ?x3, yr, fn)&(avg, collected-balance, ?x4, yr, fn)&(risk-score, ?x5, yr, fn)&(LT, ?x5, 0)&(LOAN-YIELD-I, ?x1, ?x2, ?x3, ?x4, ?x5, ?x6, yr, fn) => (loan-yield, ?x6, yr, fn)
Using the interest-income, cost-of-handling-deposit, three year average loan-volume and collected-balance, and the risk-score, under the condition that the risk score is less than 0, the LOAN-YIELD-I model calculates the loan-yield of this firm.

10 (interest-income, ?x1, yr, fn)&(cost-of-handling-deposit, ?x2, yr, fn)&(avg, loan-volume, ?x3, yr, fn)&(avg, collected-balance, ?x4, yr, fn)&(risk-score, ?x5, yr, fn)&(GT, ?x5, 0)&(LOAN-YIELD-II, ?x1, ?x2, ?x3, ?x4, ?x5, ?x6, yr, fn) => (loan-yield, ?x6, yr, fn)
Using the interest-income, cost-of-handling-deposit, three year average loan-volume and collected-balance, and the risk-score, under the condition that the risk score is greater than 0, the LOAN-YIELD-II model calculates the loan-yield of this firm.
11 (interest-income, ?x1, yr, fn)&(cost-of-handling-deposit, ?x2, yr, fn)&(avg, loan-volume, ?x3, yr, fn)&(avg, collected-balance, ?x4, yr, fn)&(risk-score, ?x5, yr, fn)&(GT, ?x5, 1.255)&(LOAN-YIELD-III, ?x1, ?x2, ?x3, ?x4, ?x5, ?x6, yr, fn) => (loan-yield, ?x6, yr, fn)
Using the interest-income, cost-of-handling-deposit, three year average loan-volume and collected-balance, and the risk-score, under the condition that the risk score is greater than 1.255, the LOAN-YIELD-III model calculates the loan-yield of this firm.

12 (interest-income, ?x1, yr, fn)&(cost-of-handling-deposit, ?x2, yr, fn)&(avg, loan-volume, ?x3, yr, fn)&(avg, collected-balance, ?x4, yr, fn)&(risk-score, ?x5, yr, fn)&(GT, ?x5, 2.79)&(LOAN-YIELD-IV, ?x1, ?x2, ?x3, ?x4, ?x5, ?x6, yr, fn) => (loan-yield, ?x6, yr, fn)
Using the interest-income, cost-of-handling-deposit, three year average loan-volume and collected-balance, and the risk-score, under the condition that the risk score is greater than 2.79, the LOAN-YIELD-IV model calculates the loan-yield of this firm.

13 (trend, interest-rate, ?x1, yr, fn)&(interest-rate, ?x2, yr, fn)&(loan-period, ?x3, yr, fn)&(BT, ?x3, 3, 12)&(ST-LOAN-RATE, ?x1, ?x2, ?x3, ?x4, yr, fn) => (st-loan-rate, ?x4, yr, fn)
Using the trend of interest-rate, interest-rate, and the loan-period, under the condition that the loan-period is between 3 and 12 months, the ST-LOAN-RATE model calculates the short term loan rate of this firm.

14 (trend, interest-rate, ?x1, yr, fn)&(interest-rate, ?x2, yr, fn)&(loan-period, ?x3, yr, fn)&(GT, ?x3, 12)&(LT-LOAN-RATE, ?x1, ?x2, ?x3, ?x4, yr, fn) => (lt-loan-rate, ?x4, yr, fn)
Using the trend of interest-rate, interest-rate, and the loan-period, under the condition that the loan-period is greater than 12 months, the LT-LOAN-RATE model calculates the long term loan rate of this firm.

15 (interest-cost, ?x1, yr, fn)&(operating-cost, ?x2, yr, fn)&(avg-assets, ?x3, yr,
fn)&(reserve-requirements, ?x4, yr, fn)&(COST-OF-FUNDS, ?x1, ?x2, ?x3, ?x4, ?x5, yr, fn) => (cost-of-funds, ?x5, yr, fn)
Using the interest-cost, operating-cost, three year average assets, and the reserve-requirements, the COST-OF-FUNDS model calculates the cost-of-funds of this firm.

16 (cost-of-funds, ?x1, yr, fn)&(loan-yield, ?x2, yr, fn)&(COMPENSATING-BALANCE, ?x1, ?x2, ?x3, yr, fn) => (compensating-balance, ?x3, yr, fn)
Using the cost-of-funds and the loan-yield, the COMPENSATING-BALANCE model calculates the compensating-balance of this firm.

Appendix B

An Example Illustrating the Schema Refinement Process

This appendix describes an example applying the learning method described in Section 4 to refine an existing model manipulation schema. The initial schema is shown in Figure A-1, with G and S defined for the version space. The generalization relations between the domain variables are organized into a hierarchy shown in Figure A-2.

In Figure A-1, a model manipulation schema is created from an initial positive instance, (percentile, (ratio, A/R, inv), ?x1, 1986, ABC), which represents the computation modules for getting the percentile value of the ratio between accounts-receivable (A/R) and inventory (inv) in a given year (1986) for a particular firm (ABC). This initial schema has a version space where the G set is the maximal generalization of this instance, (percentile, (ratio, var1, var2), ?x1, yr, fn), based on the generalization hierarchy shown in Figure A-2. The S set is initialized to be the training instance.

In Figure A-3, the schema is applied to a new instance: (percentile, (ratio, asset, liab), ?x1, 1986, ABC). Since the instantiated solution tree is complete, the instance is classified as a positive example. It modifies the current version space by minimally generalizing the S set.
Based on Figure A-2, asset is the minimal generalization of asset and accounts-receivable, and B/S-var is the minimal generalization of liability and inventory. Therefore, minimally generalizing S results in (percentile, (ratio, asset, B/S-var), ?x1, 1986, ABC).

In Figure A-4, the training instance, (percentile, (ratio, profits, assets), ?x1, 1988, ABC), has an incomplete solution tree. Consequently, this instance is classified as a negative example for the current schema. It then modifies the current version space by constraining the G set to be (percentile, (ratio, B/S-var, var2), yr, fn), (percentile, (ratio, var1, I/S-var), yr, fn), or (percentile, (ratio, var1, var2), yr, fn) (yr < 1988).

In Figure A-5, a near-miss example modifies the schema by adding one more precondition of the RATIO model, (avg, var1, ?x5, yr, fn). The G and S sets in the current version space are also updated to include the maximal and minimal generalizations of this example.

Figure 2.1 The Basic Model of a Machine Learning System

[Diagram labels: Problem Solver; Performance Trace; Knowledge Base; (c) Rule Refinement; Learning and Knowledge-Acquisition Unit; (b) Inductive Learning; (a) Knowledge Engineering; Training Examples; Experienced Experts]
Figure 2.2 Interactions Between the Learning Module and Other DSS Components

[Diagram labels: Model Base; Data Base; User; CONTROL SYSTEM; PROBLEM SOLVER; Expert; INSTANCE SELECTOR; Knowledge Base; CRITIC; LEARNING MODULE]
Figure 2.3 The Learning-Augmented DSS Framework for Knowledge Refinement

[Diagram labels: condition: (a generalized problem expression); solution: (a generalized problem expression); subproblem1, subproblem2, ..., subproblem N; data-retrieval; model1 with precond1, precond2; data-retrieval; user-input; model2 with precond1, precond2]
Figure 3.1 The Application of A Model Manipulation Schema

[Diagram labels: (a) Goal state; operator or rule; Initial state. (b) Goal state; schema; Initial state]
Figure 3.2 Search Processes (a) Without and (b) With Schemata

[Flowchart labels, in reading order: START; RECEIVE A NEW PROBLEM INSTANCE (INSTANCE SELECTOR); THE PROBLEM INSTANCE; GENERATE THE SOLUTION (PROBLEM SOLVER); THE SOLUTION TREE; EVALUATE THIS SOLUTION (CRITIC); THE TRAINING INSTANCE AND ITS SOLUTION TREE; IF AT LEAST ONE SOLUTION PATH IS COMPLETE: NO / YES; THE POSITIVE INSTANCE AND ITS SOLUTION; GENERALIZE THE SOLUTION AND THE PROBLEM (LEARNING MODULE); A NEW MODEL MANIPULATION SCHEMA]
Figure 3.3 The Learning Procedure in the Acquisition of Model Manipulation Knowledge

[Flowchart labels, in reading order: CREATE A NEW INSTANCE (INSTANCE SELECTOR); YES; ACCEPT A NEW INSTANCE (INSTANCE SELECTOR); APPLY THE EXISTING SCHEMATA (PROBLEM SOLVER); EVALUATE THE SOLUTION PATH AND CLASSIFY THE TRAINING INSTANCE (CRITIC); REFINE THE EXISTING SCHEMA (LEARNING MODULE); YES]
Figure 4.1 The Learning Procedure in the Refinement of Model Manipulation Knowledge

[Panel (a) labels: P1, P2, P3; F1, F2, F3, F4. Legend, in the order printed: F1, F2, F3, F4: REGRESSION, MOVING AVERAGE, EXPONENTIAL SMOOTHING, DELPHI; P1, P2, P3: THE SALES NEXT YEAR, THE INVENTORY THREE YEARS LATER, THE INTEREST EXPENSE NEXT YEAR. Panel (c) axes: quality of solution; complexity. o denotes a positive instance; * denotes a negative instance]
Figure 5.1 An Example of Learning Model Selection Heuristics Using the PLS Approach

[Hierarchy nodes legible in the figure: B/S-var; sales; expenses; taxes; cash; A/R; inv; profits]
Legend:
B/S-var: Balance-sheet variable
I/S-var: Income-statement variable
A/R: Accounts-receivable
inv: Inventory
Figure A-2. The Generalization Hierarchy

An example, (percentile, (ratio, asset, liab), ?x1, 1986, ABC), after the pattern-matching, (asset/var1, liab/var2, 1986/yr, ABC/fn), has the following instantiated solution:

(percentile, (ratio, asset, liab), ?x1, 1986, ABC)
  (PERCENTILE, ?x2, ?x3, ?x1, 1986, ABC)
    (ratio, asset, liab, ?x2, 1986, ABC)
      (RATIO, ?x4, ?x5, ?x2, 1986, ABC)
        (asset, ?x4, 1986, ABC)*   data-retrieval
        (liab, ?x5, 1986, ABC)*   data-retrieval
    (industry-type, ?x3, 1986, ABC)*

This example is classified as a positive example, since all terminal nodes are solvable. It modifies the condition of this schema as follows:
G: (percentile, (ratio, var1, var2), ?x1, yr, fn)
S: (percentile, (ratio, asset, B/S-var), ?x1, 1986, ABC)
Figure A-3. Applying the Schema to a Positive Example

An example, (percentile, (ratio, profits, assets), ?x1, 1988, ABC), with the pattern-matching, (profits/var1, assets/var2, 1988/yr, ABC/fn), has the following instantiated solution:

(percentile, (ratio, profits, assets), ?x1, 1988, ABC)
  (PERCENTILE, ?x2, ?x3, ?x1, 1988, ABC)
    (ratio, profits, assets, ?x2, 1988, ABC)
      (RATIO, ?x4, ?x5, ?x2, 1988, ABC)
        (profits, ?x4, 1988, ABC)   cannot be solved with data-retrieval
        (assets, ?x5, 1988, ABC)   cannot be solved with data-retrieval
    (industry-type, ?x3, 1988, ABC)   cannot be solved with data-retrieval

Since all terminal nodes are unsolvable, this example is treated as a negative example. It modifies the condition of the schema as follows:
G: (percentile, (ratio, B/S-var, var2), yr, fn), or
   (percentile, (ratio, var1, I/S-var), yr, fn), or
   (percentile, (ratio, var1, var2), yr, fn)*(yr < 1988)
S: (percentile, (ratio, assets, B/S-var), 1986, ABC)
Figure A-4. Applying the Schema to a Negative Example

An example, (percentile, (ratio, inv, (avg, inv)), ?x1, 1986, ABC), with the partial pattern-matching, (inv/var1, 1986/yr, ABC/fn), has the following instantiated solution:

(percentile, (ratio, inv, (avg, inv)), ?x1, 1986, ABC)
  (PERCENTILE, ?x2, ?x3, ?x1, 1986, ABC)
    (ratio, inv, (avg, inv), ?x2, 1986, ABC)
      (RATIO, ?x4, ?x5, ?x2, 1986, ABC)
        (inv, ?x4, 1986, ABC)*   data-retrieval
        (avg, inv, ?x5, 1986, ABC)   cannot be solved with data-retrieval
    (industry-type, ?x3, 1986, ABC)*   user-input

It then modifies the schema as follows:
condition:
G: (percentile, (ratio, B/S-var, var2), yr, fn), or
   (percentile, (ratio, var1, I/S-var), yr, fn), or
   (percentile, (ratio, var1, var2), yr, fn)*(yr < 1988), or
   (percentile, (ratio, var1, (avg, var2)), yr, fn)
S: (percentile, (ratio, assets, B/S-var), 1986, ABC), or
   (percentile, (ratio, inv, (avg, inv)), 1986, ABC)
solution:
(percentile, (ratio, var1, var2), ?x1, yr, fn)
  (PERCENTILE, ?x2, ?x3, ?x1, yr, fn)
    (ratio, var1, var2, ?x2, yr, fn)
      (RATIO, ?x4, ?x5, ?x2, yr, fn)
        (var1, ?x4, yr, fn)*
        (avg, var1, ?x5, yr, fn)
          (AVERAGE, ?x6, ?x7, ?x8, ?x5, yr, fn)
            (var1, ?x6, 1986, fn)*   data-retrieval
            (var2, ?x7, 1985, fn)*   data-retrieval
            (var3, ?x8, 1984, fn)*   data-retrieval
        (var2, ?x9, yr, fn)*   data-retrieval
    (industry-type, ?x3, yr, fn)*   user-input
Figure A-5. Applying the Schema to a Near-miss Example
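The S-set update in the positive-example step of this appendix (Figure A-3) can be sketched in Python as a lowest-common-ancestor lookup in the generalization hierarchy. This is only a sketch under stated assumptions: the `PARENT` table below is reconstructed from the relations stated in the appendix text (the exact placement of nodes such as cash and inv under asset is an assumption), and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Child -> parent links. Reconstructed from the appendix text; the exact
# shape of the hierarchy in Figure A-2 is an assumption.
PARENT: Dict[str, Optional[str]] = {
    "var": None,
    "B/S-var": "var", "I/S-var": "var",
    "asset": "B/S-var", "liab": "B/S-var",
    "cash": "asset", "A/R": "asset", "inv": "asset",
    "sales": "I/S-var", "expenses": "I/S-var",
    "taxes": "I/S-var", "profits": "I/S-var",
}


def ancestors(node: str) -> List[str]:
    """Return the node followed by its ancestors, up to the root."""
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = PARENT[cur]
    return path


def minimal_generalization(a: str, b: str) -> str:
    """Lowest node in the hierarchy that covers both a and b."""
    covers_b = set(ancestors(b))
    for node in ancestors(a):
        if node in covers_b:
            return node
    raise ValueError("no common generalization")


def generalize_s(s: Tuple[str, ...], instance: Tuple[str, ...]) -> Tuple[str, ...]:
    """Minimally generalize each argument of S to cover a positive instance."""
    return tuple(minimal_generalization(x, y) for x, y in zip(s, instance))


# S holds the ratio arguments of the initial instance (A/R, inv);
# the new positive instance has arguments (asset, liab).
s = ("A/R", "inv")
new_s = generalize_s(s, ("asset", "liab"))
print(new_s)  # ('asset', 'B/S-var')
```

Only the two arguments of the ratio are generalized here; the year and firm fields of the instance are left out to keep the sketch short.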