PII: 0888-613X(89)90009-1

A Valuation-Based Language for Expert Systems

Prakash P. Shenoy
School of Business, University of Kansas, Lawrence, Kansas

ABSTRACT

A new language based on valuations is proposed as an alternative to rule-based languages for constructing knowledge-based systems. Valuation-based languages are superior to rule-based languages for maintaining consistency in the knowledge base, for caching inferences, for managing uncertainty, and for nonmonotonic reasoning. An abstract description of a valuation-based language is given. Two specific instances of valuation-based languages are described. The first is designed to represent categorical knowledge. The ability of such a language to maintain consistency and cache inferences is demonstrated with an example. The second is an evidential language, a valuation-based language in which valuations are belief functions. The ability of evidential languages to perform nonmonotonic reasoning and manage uncertainty is demonstrated with an example.

KEYWORDS: valuation-based language, rule-based language, valuation system, knowledge-based system, rule-based system, consistency in knowledge bases, caching inferences, truth maintenance systems, evidential systems, nonmonotonic reasoning, management of uncertainty

INTRODUCTION

This paper proposes a new language based on "valuations" as an alternative to rule-based languages for building knowledge-based systems. This language is inspired by the axiomatic framework for propagation of probabilities and belief functions (Shenoy and Shafer [1, 2]) and by its extension, which includes constraint propagation and discrete optimization (Shenoy and Shafer [3, 4]). Since the primary objects in the axiomatic framework are called valuations, we refer to this language as being valuation-based, and we call a formal structure created using this language a valuation system.
A popular language for building a knowledge-based system is a production or rule-based language (Brownston et al. [5], Davis and King [6]). While these languages have many attractive features, they also have some serious shortcomings. In this paper we will focus on four major shortcomings of rule-based languages that are not shared by valuation-based languages. These four shortcomings are referred to as the problem of consistency, the problem of caching, the problem of nonmonotonic reasoning, and the problem of managing uncertainty.

Address correspondence to P. P. Shenoy, School of Business, Summerfield Hall, Lawrence, Kansas 66045-2003.
International Journal of Approximate Reasoning 1989; 3:383-411
© 1989 Elsevier Science Publishing Co., Inc., 655 Avenue of the Americas, New York, NY 10010. 0888-613X/89/$3.50

A special case of a valuation system is an evidential network (Shenoy and Shafer [1, 2]). The use of evidential networks to manage uncertainty is well understood (see, e.g., Shafer et al. [7]). However, the use of valuation systems for representing categorical knowledge, maintaining consistency, and performing intelligent caching and nonmonotonic reasoning is not widely understood.

Valuation systems include as a special case belief networks and moral graphs.
Belief networks have been proposed by Pearl [8, 9] and moral graphs by Lauritzen and Spiegelhalter [10] for managing uncertainty using probabilities (see also Heckerman and Horvitz [11]). The use of valuation systems in representing and propagating probabilities is described in Shenoy and Shafer [1, 2] and Shafer and Shenoy [12, 13]. Valuation languages can also be used to propagate constraints (Seidel [14], Dechter and Pearl [15], Shenoy and Shafer [3]) and to solve discrete optimization problems, both constrained and unconstrained (Bertelè and Brioschi [16], Shenoy and Shafer [4]). Other problems that fit in the framework of valuation languages include solution to systems of equations (Rose [17]), propagation of Spohnian belief functions (Spohn [18], Hunter [19]), retrieval from acyclic database schemes (Malvestuto [20], Beeri et al. [21]), and use of a Kalman filter (Dempster [22], Meinhold and Singpurwalla [23]).

An outline of this paper is as follows. In the following section we discuss some problems with rule-based languages. In the third section we give an abstract description of a valuation-based language, and in the fourth section we describe a specific instance of a valuation language designed to represent categorical knowledge. We demonstrate, using an example, how such a language can be used to maintain consistency in a knowledge base and how inferences are cached. In the fifth section we describe an evidential language, a valuation language that uses belief functions as valuations. We also briefly describe a truth maintenance system and show the correspondence between concepts in truth maintenance systems and concepts in evidential systems. Next, using an example, we show how evidential languages can be used to reason nonmonotonically and manage uncertainty. We conclude with a summary and some general comments.

SOME PROBLEMS WITH RULE-BASED LANGUAGES

In this section, we look at some of the shortcomings of pure rule-based languages.
In particular, we focus on the problems of consistency, caching, nonmonotonic reasoning, and management of uncertainty. Since there is no universally accepted formal definition of a rule-based language, we will use the model of a pure production system given in Davis and King [6] as a representative rule-based language.

Consistency

In large knowledge bases, consistency is an important issue. By consistency, we mean the absence of syntactic contradictions. An example of a syntactic contradiction is a premise A = a and two rules: A = a → B = b and A = a → B = ~b. (The symbol → denotes the truth-functional conditional.)

Rule-based languages lack the expressive power to check for contradictions. Accordingly, most commercial implementations of rule-based languages provide little or no support for checking for contradictions. However, this does not mean that such checking cannot be done outside the formal structure of rule-based languages. In recent years, there have been several studies on efficient methods for checking for contradictions in rule-based languages (see, e.g., Adams [24], Suwa et al. [25], Nguyen et al. [26], Pearl [27], Touretzky [28], and Ginsberg [29]). As we shall see, unlike rule-based languages, maintenance of consistency is an integral part of valuation-based languages.

Caching

Regarding caching of inferences, typically, backward-chaining, goal-driven rule-based languages do not cache any inferences, whereas forward-chaining production systems cache all inferences in working memory. In either case, caching in rule-based languages is of little help to the knowledge engineer in understanding the implications of the knowledge in the knowledge base. As we shall see, valuation languages cache and display certain inferences, and this can be very useful in the knowledge engineering process.
Nonmonotonic Reasoning

The subject of nonmonotonic or default reasoning is an important area in artificial intelligence. We often use assumptions or defaults as facts until we observe something that contradicts the inferences we have derived. We then need to retract some assumptions or defaults to avoid the contradiction. A famous example is that of Tweety the bird. Most birds fly. We may initially use the rule "If X is a bird then X flies" as an assumption or a default. Upon learning that Tweety is a bird, we may infer that Tweety flies. However, we may subsequently learn that Tweety is a penguin and does not fly. At this stage, to keep our knowledge base contradiction-free, we need to retract the assumption that led to the contradiction. The construction of efficient procedures to enable nonmonotonic or default reasoning is the subject of considerable research in artificial intelligence (McCarthy [30], McCarthy and Hayes [31], McDermott and Doyle [32], Moore [33], Reiter [34]).

Uncertainty

Finally, it is now well known that pure rule-based languages are inadequate both to represent uncertain knowledge and to make inferences from such knowledge (Shafer [35], Heckerman and Horvitz [36]). For example, MYCIN used certainty factors and PROSPECTOR used a pair of likelihood ratios with each rule to represent uncertainty (Shortliffe and Buchanan [37], Duda et al. [38]). However, these systems are brittle. They give the right answers in only the simplest of cases.

One solution to some of these problems is to couple a truth maintenance system to the knowledge base (Doyle [39], de Kleer [40], Reiter and de Kleer [41]). Truth maintenance systems were devised by logicians in artificial intelligence to reason with incomplete and uncertain information symbolically without using numerical calculi such as probability theory or belief functions.
Truth maintenance systems are still in the developmental stage and are the subject of intense research in artificial intelligence. Another solution has been to control the sequence of inferences so that the correct results are obtained. This approach has been studied, for example, by Laskey and Lehner [42] and by D'Ambrosio [43].

AN ABSTRACT DESCRIPTION OF A VALUATION-BASED LANGUAGE

This section gives an abstract description of a valuation-based language. The language consists of objects and operators that operate on the objects. The objects are used to represent knowledge. The operators are used to make inferences from the knowledge. In rule-based languages, the objects are variables and rules and the operator is modus ponens. In valuation-based languages, the objects are called variables and valuations, and the operators are called combination, marginalization, and solution.

The level of abstractness at which this language is described here forces us to omit the computational details of how precisely the three operators are used to make inferences. This allows us to concentrate on the concepts. (For a more formal and less abstract exposition with theorems and proofs, we refer the reader to Shenoy and Shafer [1-4] and Shafer and Shenoy [13].) However, since abstract descriptions can be difficult to comprehend, we describe two specific valuation-based systems in the succeeding sections with concrete examples.

Variables and Configurations

We use the symbol $\mathcal{W}_X$ for the set of possible values of a variable $X$, and we call $\mathcal{W}_X$ the frame for $X$. We will be concerned with a finite set $\mathcal{X}$ of variables, and we will assume that all the variables in $\mathcal{X}$ have finite frames. Given a finite nonempty set $h$ of variables, we let $\mathcal{W}_h$ denote the Cartesian product of $\mathcal{W}_X$ for $X$ in $h$; $\mathcal{W}_h = \times\{\mathcal{W}_X \mid X \in h\}$. We call $\mathcal{W}_h$ the frame for $h$.
We will refer to elements of $\mathcal{W}_h$ as configurations of $h$.

PROJECTION OF CONFIGURATIONS  Projection of configurations simply means dropping extra coordinates; if $(w, x, y, z)$ is a configuration of $\{W, X, Y, Z\}$, for example, then the projection of $(w, x, y, z)$ to $\{W, X\}$ is simply $(w, x)$, which is a configuration of $\{W, X\}$. If $g$ and $h$ are sets of variables, $h \subseteq g$, and $x$ is a configuration of $g$, then we will let $x^{\downarrow h}$ denote the projection of $x$ to $h$. The projection $x^{\downarrow h}$ is always a configuration of $h$. If $h = g$ and $x$ is a configuration of $g$, then $x^{\downarrow h} = x$.

Valuations

Given a set $h$ of variables, there is a set $\mathcal{V}_h$. The elements of $\mathcal{V}_h$ are called valuations on $h$. We will let $\mathcal{V}$ denote the set of all valuations, that is, $\mathcal{V} = \bigcup\{\mathcal{V}_h \mid h \subseteq \mathcal{X}\}$. Valuations are primitives in our abstract description and as such require no definition. But as we shall see shortly, they are objects that can be combined, marginalized, and solved. Intuitively, a valuation on $h$ represents some knowledge about the variables in $h$. Examples of valuations on $h$ are an array, a function $H: \mathcal{W}_h \to \mathbb{R}^+$ ($\mathbb{R}^+$ denotes the set of non-negative real numbers); a superarray, a function $H: 2^{\mathcal{W}_h} \to \mathbb{R}^+$ ($2^{\mathcal{W}_h}$ denotes the set of all subsets of $\mathcal{W}_h$); a rule, a function $H: \mathcal{W}_h \to \{\text{true}, \text{false}\}$; etc.

PROPER VALUATIONS  For each $h \subseteq \mathcal{X}$, there is a subset $\mathcal{P}_h$ of $\mathcal{V}_h$ whose elements will be called proper valuations on $h$. Let $\mathcal{P}$ denote $\bigcup\{\mathcal{P}_h \mid h \subseteq \mathcal{X}\}$, the set of all proper valuations. Intuitively, a proper valuation represents knowledge that is consistent in itself. The notion of proper valuations is important as it enables us to define combinability of valuations, it allows us to define existence of solutions, and it allows us to constrain the definitions of combination and marginalization to meaningful operations.
Examples of proper valuations are a potential, a function $P: \mathcal{W}_h \to \mathbb{R}^+$ that is not identically zero for all configurations; a superpotential, a function $m: 2^{\mathcal{W}_h} \to \mathbb{R}^+$ that is not zero for all nonempty subsets of $\mathcal{W}_h$; a satisfiable rule, a function $R: \mathcal{W}_h \to \{\text{true}, \text{false}\}$ that is not identically false for all configurations; etc. Potentials correspond to unnormalized probability distributions, and superpotentials correspond to unnormalized basic probability assignment functions.

Combination

We assume there is a mapping $\otimes: \mathcal{V} \times \mathcal{V} \to \mathcal{V}$, called combination, such that
1. If $G$ and $H$ are valuations on $g$ and $h$, respectively, then $G \otimes H$ is a valuation on $g \cup h$.
2. If either $G$ or $H$ is not a proper valuation, then $G \otimes H$ is not a proper valuation.
3. If $G$ and $H$ are both proper valuations, then $G \otimes H$ may or may not be a proper valuation.

If $G \otimes H$ is not a proper valuation, then we shall say that $G$ and $H$ are not combinable. If $G \otimes H$ is a proper valuation, then we shall say that $G$ and $H$ are combinable and that $G \otimes H$ is the combination of $G$ and $H$.

Intuitively, as its name suggests, combination corresponds to aggregation of knowledge. If $G$ and $H$ are proper valuations on $g$ and $h$ representing knowledge about variables in $g$ and $h$, respectively, then $G \otimes H$ represents the aggregated knowledge about variables in $g \cup h$. For potentials, combination corresponds to pointwise multiplication; if $G$ and $H$ are potentials on $g$ and $h$, respectively, then $(G \otimes H)(x) = G(x^{\downarrow g})H(x^{\downarrow h})$. For basic probability assignment functions, combination corresponds to Dempster's rule (Dempster [44, 45]).
For rules, if $G$ and $H$ are rules on $g$ and $h$, respectively, then $G \otimes H$ is a rule on $g \cup h$ such that $(G \otimes H)(x) = \text{true}$ iff $G(x^{\downarrow g}) = \text{true}$ and $H(x^{\downarrow h}) = \text{true}$.

Marginalization

We assume that for each $h \subseteq \mathcal{X}$, there is a mapping $\downarrow h: \bigcup\{\mathcal{V}_g \mid g \supseteq h\} \to \mathcal{V}_h$, called marginalization to $h$, such that
1. If $G$ is a valuation on $g$ and $h \subseteq g$, then $G^{\downarrow h}$ is a valuation on $h$.
2. If $G$ is a proper valuation, then $G^{\downarrow h}$ is a proper valuation.
3. If $G$ is not a proper valuation, then $G^{\downarrow h}$ is not a proper valuation.

We will call $G^{\downarrow h}$ the marginal of $G$ for $h$.

Intuitively, marginalization corresponds to crystallization of knowledge. If $G$ is a valuation on $g$ representing some knowledge about variables in $g$, and $h \subseteq g$, then $G^{\downarrow h}$ represents the knowledge about variables in $h$ implied by $G$ if we disregard variables in $g - h$. In the case of potentials, marginalization from $g$ to $h$ is summation over the configurations of $g - h$. In the belief-function case, marginalization is explained in the section "An Evidential Language . . ." For rules, if $G$ is a rule on $g$, then $G^{\downarrow h}$ is a rule on $h$ such that $G^{\downarrow h}(x) = \text{true}$ iff there is a configuration $y$ of $g - h$ such that $G(x, y) = \text{true}$.

Solution

We assume that for each $g \subseteq \mathcal{X}$, there is a mapping $\psi: \mathcal{V}_g \to 2^{\mathcal{W}_g}$, called solution, such that
1. If $G$ is a proper valuation on $g$, then $\psi(G)$ is a nonempty subset of $\mathcal{W}_g$.
2. If $G$ is a valuation on $g$ that is not proper, then $\psi(G) = \emptyset$.

The configurations in $\psi(G)$ are called solutions for $G$.

Intuitively, the solution operator maps knowledge from the space of valuations to the space of configurations. We encode knowledge as valuations so that we can aggregate and crystallize it.
However, we need to decode the result. The solution operator simply serves as a decoding mechanism. In the case of probabilities, solutions may correspond to configurations with the highest probability or simply configurations with positive probabilities. For belief functions, solutions may correspond to configurations with the highest plausibility or simply configurations with positive plausibilities. For rules, solutions may correspond to configurations whose value is true.

Propagation of Valuations

A valuation-based language (VL) makes inferences by
1. Combining all proper valuations in the system (the resulting valuation, if proper, is called the joint valuation);
2. Computing the marginal of the joint valuation for each variable in the system;
3. Computing the set of all solutions for the marginals of the joint valuation for each variable; and
4. Computing the set of all solutions for the joint valuation.

The above is only a conceptual description of the actions of a valuation-based language. It is not an algorithm. If there are $n$ variables in the system, and each variable has two configurations in its frame, then there are $2^n$ configurations of all variables. Hence, it will not be feasible to compute the joint valuation when there are a large number of variables. The VL does not actually compute the joint valuation. It computes the marginals of the joint valuation without explicitly computing the joint valuation, and it does this using only local computations. An algorithm for computing exact marginals and solutions is described in detail in Shenoy and Shafer [1-4]. An algorithm for computing approximate marginals is described in Pearl [46], and an algorithm for computing an approximate solution to the joint valuation is described in Kirkpatrick et al. [47] and Geman and Geman [48].
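To make the four conceptual steps concrete for the potential case, the following Python sketch builds the joint valuation by brute force. It is only an illustration under stated assumptions: the variables `A` and `B`, their frames, and the two potentials are made up, and the helper names (`configs`, `project`, `combine`, `marginalize`, `solutions`) are hypothetical. It deliberately enumerates the whole joint frame, which is exactly what the local-computation algorithms cited above avoid.

```python
from itertools import product

# Hypothetical frames for two binary variables (not taken from the paper).
FRAMES = {"A": ("a", "~a"), "B": ("b", "~b")}


def configs(variables):
    """All configurations of an ordered tuple of variables."""
    return list(product(*(FRAMES[v] for v in variables)))


def project(x, from_vars, to_vars):
    """Projection: keep only the coordinates of x that belong to to_vars."""
    return tuple(x[from_vars.index(v)] for v in to_vars)


def combine(G, H):
    """Pointwise multiplication of two potentials, each a (variables, table) pair."""
    (g_vars, g), (h_vars, h) = G, H
    u = tuple(sorted(set(g_vars) | set(h_vars)))
    table = {x: g[project(x, u, g_vars)] * h[project(x, u, h_vars)] for x in configs(u)}
    return u, table


def marginalize(G, h_vars):
    """Sum the potential G over the configurations of the dropped variables."""
    g_vars, g = G
    table = {y: 0 for y in configs(h_vars)}
    for x, value in g.items():
        table[project(x, g_vars, h_vars)] += value
    return h_vars, table


def solutions(G):
    """Configurations with positive value (one of the options named in the text)."""
    return {x for x, value in G[1].items() if value > 0}


# Two made-up, unnormalized potentials.
P_A = (("A",), {("a",): 3, ("~a",): 7})
P_AB = (("A", "B"), {("a", "b"): 2, ("a", "~b"): 0, ("~a", "b"): 1, ("~a", "~b"): 1})

joint = combine(P_A, P_AB)                                             # step 1
marginals = {v: marginalize(joint, (v,)) for v in joint[0]}            # step 2
marginal_solutions = {v: solutions(m) for v, m in marginals.items()}   # step 3
joint_solutions = solutions(joint)                                     # step 4
```

Because the joint table has one entry per configuration of all variables, this sketch scales as $2^n$ for $n$ binary variables and is useful only for checking small examples by hand.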
Valuation System

A valuation system (VS) consists of a finite set of variables $\mathcal{X}$, a finite frame $\mathcal{W}_X$ for each variable $X$ in $\mathcal{X}$, and a finite collection of valuations $\{V_i\}_{i \in M}$, where each valuation $V_i$ is on some subset of $\mathcal{X}$.

A valuation network is a graph whose vertices represent either variables or valuations. If valuation $V_i$ is on a subset $h$ of variables, then this is represented in the valuation network by including an edge between the vertex corresponding to $V_i$ and all variable vertices $X_j$ such that $X_j \in h$. The valuation network serves as a graphical representation of a valuation system and can be used as a user interface.

The valuation network is also used by the VL to propagate the valuations. The algorithm for computing exact marginals and solutions requires that the valuation network be a tree. If the valuation network is not a tree, then this algorithm embeds it in a tree by clustering variables. Such a tree, called a Markov tree, is then used to compute the marginals and solutions (Shenoy and Shafer [1-4]). The simulation algorithms for computing marginals and solutions use the valuation network directly.

Capabilities of a Valuation-Based Language

A VL has the following capabilities:
1. A VS can be extended by adding new variables and adding new proper valuations.
2. A VS can also be reduced by removing variables and valuations.
3. Each time the VL receives a new proper valuation, it checks whether or not it is combinable with the proper valuations already present in the system.
4. If the new proper valuation is combinable with the valuations already present in the system, then the VL accepts the new valuation. If the new proper valuation is not combinable, then the VL rejects it and informs the user of its action.
5. Each time the VL accepts a proper valuation, it finds the marginal of the joint valuation (the valuation obtained by combining all proper valuations in the system) for each variable in the system.
This is accomplished using local computations if an efficient Markov tree can be found for the valuation network (Shenoy and Shafer [1, 2]) or by stochastic simulation otherwise (Pearl [46]).
6. The VL also computes for each variable the set of all solutions for the marginal of the joint valuation for that variable. Once we have the marginal of the joint valuation for a variable, computing the set of all solutions is simply done by exhaustive enumeration of the frame for that variable.
7. If necessary, the VL can compute a configuration of all variables that is a solution for the joint valuation. This can be done using an exact algorithm if an efficient Markov tree can be found for the valuation network (Shenoy and Shafer [3, 4]) or by stochastic relaxation and annealing (Kirkpatrick et al. [47], Geman and Geman [48]).

A VALUATION LANGUAGE FOR CATEGORICAL KNOWLEDGE

In this section, we describe an instance of a valuation-based language designed to represent categorical knowledge, the kind of knowledge traditionally represented by rules in rule-based systems. Next, we show by means of a small example how consistency is maintained in the knowledge base and how inferences are cached. Our exposition here is informal. A formal treatment (with theorems and proofs) of the valuation language described in this section is given in Shenoy and Shafer [3].

Suppose we are interested in representing categorical knowledge in a valuation system.
Let us describe what valuations are and what the combination, marginalization, and solution operations are for such systems.

VALUATIONS  A valuation on $h$ is a function $H: \mathcal{W}_h \to \{t, f\}$, where $t$ means true and $f$ means false. Thus a rule that relates the values of variables in set $h$ is represented as a valuation on $h$. For example, consider the rule "If A = a then B = b" that relates two variables A and B whose frames are, respectively, $\mathcal{W}_A = \{a, {\sim}a\}$ and $\mathcal{W}_B = \{b, {\sim}b\}$. Then the rule can be represented by the valuation V on {A, B} defined as follows: V(a, b) = t, V(a, ~b) = f, V(~a, b) = t, V(~a, ~b) = t.

Consider the valuation $U$ on $h$ such that $U(x) = t$ for all $x \in \mathcal{W}_h$. Obviously, such a valuation tells us nothing about the variables in $h$. We call such a valuation the vacuous valuation on $h$.

PROPER VALUATIONS  Suppose $H$ is a valuation on $h$. We shall say that $H$ is a proper valuation if there exists a configuration $x$ of $h$ such that $H(x) = t$. Thus a proper valuation cannot be identically equal to $f$ for all configurations.

COMBINATION  Suppose $G$ and $H$ are valuations on $g$ and $h$, respectively. The valuation $G \otimes H$ on $g \cup h$ is defined as follows:

$$(G \otimes H)(x) = \begin{cases} t & \text{if } G(x^{\downarrow g}) = t \text{ and } H(x^{\downarrow h}) = t \\ f & \text{otherwise} \end{cases}$$

for all $x \in \mathcal{W}_{g \cup h}$.

MARGINALIZATION  Suppose $G$ is a valuation on $g$, and suppose $h \subseteq g$. Then the marginal of $G$ for $h$, $G^{\downarrow h}$, is defined as follows:

$$G^{\downarrow h}(x) = \begin{cases} t & \text{if there is a } y \in \mathcal{W}_{g-h} \text{ such that } G(x, y) = t \\ f & \text{otherwise} \end{cases}$$

for all $x \in \mathcal{W}_h$.

SOLUTION  Suppose $G$ is a valuation on $g$. The solution for $G$, denoted by $\psi(G)$, is a subset of $\mathcal{W}_g$ such that $y \in \psi(G)$ if and only if $G(y) = t$.
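The definitions above translate almost directly into code. The sketch below is a minimal illustration, assuming valuations are stored as `(variables, table)` pairs with Boolean entries; the helper names and the second rule `W` (If A = a then B = ~b) and the observation `fact_a` are introduced here for illustration only and are not part of the paper's formal treatment.

```python
from itertools import product

# Frames of the two variables used in the rule "If A = a then B = b".
FRAMES = {"A": ("a", "~a"), "B": ("b", "~b")}


def configs(variables):
    """All configurations of an ordered tuple of variables."""
    return list(product(*(FRAMES[v] for v in variables)))


def project(x, from_vars, to_vars):
    """Projection of configuration x onto a subset of its variables."""
    return tuple(x[from_vars.index(v)] for v in to_vars)


def is_proper(G):
    """A valuation is proper if it is t for at least one configuration."""
    return any(G[1].values())


def combine(G, H):
    """(G (x) H)(x) = t iff both G and H are t on the projections of x."""
    (g_vars, g), (h_vars, h) = G, H
    u = tuple(sorted(set(g_vars) | set(h_vars)))
    table = {x: g[project(x, u, g_vars)] and h[project(x, u, h_vars)] for x in configs(u)}
    return u, table


def marginalize(G, h_vars):
    """The marginal is t at x iff some extension (x, y) has G(x, y) = t."""
    g_vars, g = G
    table = {y: False for y in configs(h_vars)}
    for x, value in g.items():
        if value:
            table[project(x, g_vars, h_vars)] = True
    return h_vars, table


def solution(G):
    """The set of configurations at which G is t."""
    return {x for x, value in G[1].items() if value}


# The valuation V from the text, plus two hypothetical additions.
V = (("A", "B"), {("a", "b"): True, ("a", "~b"): False, ("~a", "b"): True, ("~a", "~b"): True})
W = (("A", "B"), {("a", "b"): False, ("a", "~b"): True, ("~a", "b"): True, ("~a", "~b"): True})
fact_a = (("A",), {("a",): True, ("~a",): False})

inferred_B = solution(marginalize(combine(V, fact_a), ("B",)))
contradictory = combine(combine(V, W), fact_a)
```

Here `inferred_B` contains only the configuration `("b",)`, and `contradictory` is not a proper valuation, which mirrors the syntactic contradiction described in the Consistency subsection: the premise A = a is not combinable with the two rules taken together.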
The combination, marginalization, and solution operations are used by the VL to make inferences from the knowledge. Suppose a knowledge base is built incrementally by adding valuations one at a time. Consistency in the knowledge base is maintained by the VL by checking whether the added valuation is proper and combinable with the proper valuations already present in the system. Thus combinability of valuations corresponds to consistency in the knowledge base (Shenoy and Shafer [3]).

As valuations are added to the knowledge base, the system propagates all valuations and computes the marginal of the joint valuation for each variable and the solutions for each of these marginals. More precisely, suppose $\{R_h \mid h \in \mathcal{H}\}$ is a collection of proper combinable valuations in the system. The valuation $\otimes\{R_h \mid h \in \mathcal{H}\}$ is called the joint valuation. The valuation system computes $(\otimes\{R_h \mid h \in \mathcal{H}\})^{\downarrow\{X_i\}}$ for each variable $X_i$ and also computes $\psi((\otimes\{R_h \mid h \in \mathcal{H}\})^{\downarrow\{X_i\}})$. In doing so, the VS acts as a cache. At all times, the VS indicates the relevant inferences of the knowledge in the knowledge base.

AN EXAMPLE  The following example is adapted from Etherington [49]. The knowledge base consists of four rules as follows:

Rule 1. Gullible citizens are citizens.
Rule 2. Elected crooks are crooks.
Rule 3. Citizens dislike crooks.
Rule 4. Gullible citizens do not dislike elected crooks.

First we observe that Fred is a gullible citizen. Next we observe that Dick is an elected crook. We would like to consult our knowledge base to see if Fred dislikes Dick or not.

One representation of this knowledge base is as follows. Let C = c, G = g, K = k, E = e, and D = d be five variables and their respective configurations representing X is a citizen, X is a gullible citizen, Y is a crook, Y is an elected crook, and X dislikes Y, respectively. Suppose all five of these variables are binary variables.

Table 1. The Valuations Corresponding to the Four Rules

W_{C,G}      R1        W_{K,E}      R2
c   g        t         k   e        t
c   ~g       t         k   ~e       t
~c  g        f         ~k  e        f
~c  ~g       t         ~k  ~e       t

W_{C,K,D}        R3        W_{G,E,D}        R4
c   k   d        t         g   e   d        f
c   k   ~d       f         g   e   ~d       t
c   ~k  d        t         g   ~e  d        t
c   ~k  ~d       t         g   ~e  ~d       t
~c  k   d        t         ~g  e   d        t
~c  k   ~d       t         ~g  e   ~d       t
~c  ~k  d        t         ~g  ~e  d        t
~c  ~k  ~d       t         ~g  ~e  ~d       t

Rules 1, 2, 3, and 4 are represented by proper valuations on {C, G}, {K, E}, {C, K, D}, and {G, E, D}, respectively, as shown in Table 1. Suppose these variables and valuations are entered in the system. A network representation of the system is shown in Figure 1. In that figure, variables are represented by circles and valuations are represented by squares. For each variable, the set of all solutions for the marginal of the joint valuation for that variable is indicated inside the variable vertex. As can be seen from Figure 1, for each variable the marginal of the joint valuation for that variable is the vacuous valuation.

Now, suppose we enter the observation that Fred is a gullible citizen. This is represented in the system as a proper valuation F1 on {G} as follows: F1(g) = t, F1(~g) = f. The system accepts this proper valuation, and after propagation it displays the results as shown in Figure 2. Note that the system properly concludes that Fred is a citizen. However, the system also concludes that Y is not an elected crook!
This is the first hint we have that something is wrong with our knowledge base. The system has concluded something about Y without being told it explicitly, and this is not an inference we expect from the knowledge base. The reason for the inference "Y is not an elected crook" is the contradictory nature of rules 3 and 4. Finally, we enter the observation that Dick is an elected crook. This observation is represented as a proper valuation F2 on {E} as follows: F2(e) = t, F2(~e) = f. This time the system refuses to accept the valuation because the system detects that the joint valuation $R_1 \otimes R_2 \otimes R_3 \otimes R_4 \otimes F_1 \otimes F_2$ is not a proper valuation. This signals that the knowledge in the system is inconsistent.

Figure 1. The valuation network with five variables and four rules.

Figure 2. The valuation network after valuation F1 is included.

Suppose we remove rule 3 from the system and substitute instead rule 5 as follows:

Rule 5. Nongullible citizens dislike crooks.

Rule 5 is represented in the system as the valuation R5 on {G, K, D} as shown in Table 2.

Table 2. The Valuation Corresponding to Rule 5

W_{G,K,D}        R5
g   k   d        t
g   k   ~d       t
g   ~k  d        t
g   ~k  ~d       t
~g  k   d        t
~g  k   ~d       f
~g  ~k  d        t
~g  ~k  ~d       t

The valuation system accepts valuation R5 with the results shown in Figure 3. Note that the system now concludes nothing about Y. Finally we enter valuation F2 in the system. This time the system accepts the valuation with the results shown in Figure 4. Thus we conclude that Fred does not dislike Dick.
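The sequence of results in this example can be checked with a short brute-force sketch in Python. This is an illustration only, not the local-computation procedure used by the valuation language: it enumerates all 32 configurations of the five variables, and the helper names (`rule`, `fact`, `combine`, `cached_inferences`) are hypothetical. Each rule valuation is encoded by the single configuration at which Tables 1 and 2 give the value f.

```python
from functools import reduce
from itertools import product

# Each variable V has the two-element frame {v, ~v}.
FRAMES = {v: (v.lower(), "~" + v.lower()) for v in "CGKED"}


def configs(variables):
    return list(product(*(FRAMES[v] for v in variables)))


def project(x, from_vars, to_vars):
    return tuple(x[from_vars.index(v)] for v in to_vars)


def rule(variables, forbidden):
    """Valuation that is f only at the single `forbidden` configuration."""
    return variables, {x: x != forbidden for x in configs(variables)}


def fact(variable, value):
    """Valuation on one variable that is t only at `value`."""
    return (variable,), {x: x == (value,) for x in configs((variable,))}


def combine(G, H):
    (g_vars, g), (h_vars, h) = G, H
    u = tuple(sorted(set(g_vars) | set(h_vars)))
    return u, {x: g[project(x, u, g_vars)] and h[project(x, u, h_vars)] for x in configs(u)}


def cached_inferences(valuations):
    """Per-variable solution sets of the joint's marginals; None if the joint is not proper."""
    variables, joint = reduce(combine, valuations)
    if not any(joint.values()):
        return None
    return {v: {x[i] for x, ok in joint.items() if ok} for i, v in enumerate(variables)}


R1 = rule(("C", "G"), ("~c", "g"))            # Table 1
R2 = rule(("K", "E"), ("~k", "e"))            # Table 1
R3 = rule(("C", "K", "D"), ("c", "k", "~d"))  # Table 1
R4 = rule(("G", "E", "D"), ("g", "e", "d"))   # Table 1
R5 = rule(("G", "K", "D"), ("~g", "k", "~d")) # Table 2
F1 = fact("G", "g")                           # Fred is a gullible citizen
F2 = fact("E", "e")                           # Dick is an elected crook

rules_only = cached_inferences([R1, R2, R3, R4])
after_f1 = cached_inferences([R1, R2, R3, R4, F1])
after_f2 = cached_inferences([R1, R2, R3, R4, F1, F2])
with_r5 = cached_inferences([R1, R2, R4, R5, F1])
with_r5_f2 = cached_inferences([R1, R2, R4, R5, F1, F2])
```

Under these assumptions, `after_f1["E"]` is `{"~e"}` (the unexpected inference about Y), `after_f2` is `None` because the joint valuation is not proper, and `with_r5_f2["D"]` is `{"~d"}`, matching the conclusion that Fred does not dislike Dick.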
We have not described the exact process by which the valuation language arrives at the results displayed in Figures 1-4. A computationally efficient procedure in sparse networks that uses only local computation is described in Shenoy and Shafer [3].

Figure 3. The valuation network after valuation R3 is removed and valuation R5 is added.

Figure 4. The valuation network after valuation F2 is included.

AN EVIDENTIAL LANGUAGE FOR UNCERTAIN AND NONMONOTONIC REASONING

In this section, we describe another valuation language called an evidential language. The valuations in this language are belief functions. Propagation of belief functions has been studied by Shafer and Logan [50], Shenoy and Shafer [1, 2, 51], Shenoy et al. [52], Kong [53, 54], Dempster and Kong [55], Shafer et al. [56], Mellouli [57], Shafer and Shenoy [13], Dempster [22], and Almond [58]. Zarley [59] describes an implementation of an evidential system on a Symbolics workstation (see also Zarley et al. [60]). Yen-Teh Hsia has implemented an evidential system called AUDITOR'S ASSISTANT on a Macintosh microcomputer. Shafer et al. [7] describe an application of AUDITOR'S ASSISTANT for assisting in audit decisions.

The use of probabilities or belief functions to perform nonmonotonic reasoning is not new. Such an approach has been suggested, for example, by Baldwin [61], Ginsberg [62], and Rich [63]. The essence of these approaches is to relax the binary constraint of Boolean logic and allow truth values to be measured by a number between 0 and 1. Our approach is different. We do not tack on probabilities or belief functions to logic. Instead, we show that pure belief-function reasoning is inherently nonmonotonic.
A similar approach is taken by Grosof [64], who discusses how probabilistic reasoning is nonmonotonic.

In this section, we will first briefly describe evidential systems. Next, we sketch the basic definitions in a truth maintenance system and describe the correspondence between concepts in a truth maintenance system and concepts in an evidential system. Finally, we study a small example in nonmonotonic reasoning and demonstrate how evidential systems handle such problems. This example also serves to illustrate the management of uncertainty in evidential systems.

An Evidential System

In evidential systems (ES), proper valuations correspond to superpotentials, which are unnormalized basic probability assignment functions. First we will briefly describe the basics of the theory of belief functions (Shafer [65]). Next, we define superpotentials and combination, marginalization, and solution for superpotentials.

Suppose $\mathcal{W}_h$ is the frame for a subset $h$ of variables. A basic probability assignment function (bpa function) for $h$ is a non-negative, real-valued function $m$ on the set of all subsets of $\mathcal{W}_h$ such that
1. $m(\emptyset) = 0$
2. $\sum\{m(\mathfrak{a}) \mid \mathfrak{a} \subseteq \mathcal{W}_h\} = 1$

Intuitively, $m(\mathfrak{a})$ represents the degree of belief assigned exactly to $\mathfrak{a}$ (the proposition that the true configuration of $h$ is in the set $\mathfrak{a}$) and to nothing smaller. A bpa function is the belief function equivalent of a probability mass assignment function in probability theory. Whereas a probability mass function is restricted to assigning probability masses only to singleton configurations of variables, a bpa function is allowed to assign probability masses to sets of configurations without assigning any mass to the individual configurations contained in the sets.
For example, if we have absolutely no knowledge about the true value of a variable, we can represent this situation by a bpa function as follows:

$$m(\mathcal{W}_h) = 1, \qquad m(\mathbf{a}) = 0 \text{ for all other } \mathbf{a} \in 2^{\mathcal{W}_h}$$

Such a function is called a vacuous bpa function. Note that in Bayesian theory the only way to express total ignorance is to assign a mass of $1/n$ to each value, where $n$ is the total number of possible values. Thus, in Bayesian theory, we are unable to distinguish between equally likely configurations and total ignorance. The theory of belief functions offers richer semantics.

Associated with a bpa function are two related functions called belief and plausibility. A belief function is a function $\mathrm{Bel}: 2^{\mathcal{W}_h} \rightarrow [0, 1]$ such that

$$\mathrm{Bel}(\mathbf{a}) = \sum\{m(\mathbf{b}) \mid \mathbf{b} \subseteq \mathbf{a}\}$$

Whereas $m(\mathbf{a})$ represents the belief assigned exactly to $\mathbf{a}$, $\mathrm{Bel}(\mathbf{a})$ represents the total belief assigned to $\mathbf{a}$. Note that $\mathrm{Bel}(\emptyset) = 0$ and $\mathrm{Bel}(\mathcal{W}_h) = 1$ for any belief function. For the vacuous bpa function $m$, the corresponding belief function Bel is given by $\mathrm{Bel}(\mathcal{W}_h) = 1$, $\mathrm{Bel}(\mathbf{a}) = 0$ for all other $\mathbf{a} \in 2^{\mathcal{W}_h}$.

A plausibility function is a function $\mathrm{Pl}: 2^{\mathcal{W}_h} \rightarrow [0, 1]$ such that

$$\mathrm{Pl}(\mathbf{a}) = \sum\{m(\mathbf{b}) \mid \mathbf{b} \cap \mathbf{a} \neq \emptyset\}$$

$\mathrm{Pl}(\mathbf{a})$ represents the total degree of belief that could be assigned to $\mathbf{a}$. Note that $\mathrm{Pl}(\mathbf{a}) = 1 - \mathrm{Bel}(\neg\mathbf{a})$, where $\neg\mathbf{a}$ represents the complement of $\mathbf{a}$ in $\mathcal{W}_h$; $\neg\mathbf{a} = \mathcal{W}_h - \mathbf{a}$. Also note that $\mathrm{Pl}(\mathbf{a}) \geq \mathrm{Bel}(\mathbf{a})$. For the vacuous bpa function, the corresponding plausibility function is $\mathrm{Pl}(\emptyset) = 0$, $\mathrm{Pl}(\mathbf{a}) = 1$ for all other $\mathbf{a} \in 2^{\mathcal{W}_h}$.

If a bpa function $m$ is also a probability mass function (i.e., all the probability masses are assigned only to singleton subsets), then $\mathrm{Bel}(\mathbf{a}) = \mathrm{Pl}(\mathbf{a}) = \sum\{m(\{x\}) \mid x \in \mathbf{a}\}$ = probability of proposition $\mathbf{a}$.

SUPERPOTENTIALS

Suppose $h$ is a subset of variables.
A superpotential for $h$ is a non-negative, real-valued function on the set of all subsets of $\mathcal{W}_h$ such that the values of nonempty subsets are not all zero. Given a superpotential $H$ on $h$, we can construct a bpa function $H'$ for $h$ from $H$ as follows:

$$H'(\emptyset) = 0, \qquad H'(\mathbf{a}) = \frac{H(\mathbf{a})}{\sum\{H(\mathbf{b}) \mid \mathbf{b} \subseteq \mathcal{W}_h,\ \mathbf{b} \neq \emptyset\}} \text{ for all nonempty } \mathbf{a} \subseteq \mathcal{W}_h$$

Thus superpotentials can be thought of as unnormalized bpa functions. Superpotentials correspond to the notion of proper valuations in the general framework.

PROJECTION AND EXTENSION OF SUBSETS

Before we can define combination and marginalization for superpotentials, we need the concepts of projection and extension for subsets of configurations.

If $g$ and $h$ are sets of variables, $h \subseteq g$, and $\mathbf{b}$ is a nonempty subset of $\mathcal{W}_g$, then the projection of $\mathbf{b}$ to $h$, denoted by $\mathbf{b}^{\downarrow h}$, is the subset of $\mathcal{W}_h$ given by $\mathbf{b}^{\downarrow h} = \{x^{\downarrow h} \mid x \in \mathbf{b}\}$. For example, if $\mathbf{a}$ is a subset of $\mathcal{W}_{\{W,X,Y,Z\}}$, then the projection of $\mathbf{a}$ to $\{X, Y\}$ consists of the elements of $\mathcal{W}_{\{X,Y\}}$ that can be obtained by projecting elements of $\mathbf{a}$ to $\mathcal{W}_{\{X,Y\}}$.

By extension of a subset of a frame to a subset of a larger frame, we mean a cylinder set extension. If $g$ and $h$ are sets of variables, $h \subseteq g$, $h \neq g$, and $\mathbf{b}$ is a subset of $\mathcal{W}_h$, then the extension of $\mathbf{b}$ to $g$ is $\mathbf{b} \times \mathcal{W}_{g-h}$. If $\mathbf{b}$ is a subset of $\mathcal{W}_h$, then the extension of $\mathbf{b}$ to $h$ is defined to be $\mathbf{b}$. We will let $\mathbf{b}^{\uparrow g}$ denote the extension of $\mathbf{b}$ to $g$. For example, if $\mathbf{a}$ is a subset of $\mathcal{W}_{\{W,X\}}$, then the vacuous extension of $\mathbf{a}$ to $\{W, X, Y, Z\}$ is $\mathbf{a} \times \mathcal{W}_{\{Y,Z\}}$.

COMBINATION

For superpotentials, combination is called Dempster's rule (Dempster [44, 45]). Consider two superpotentials $G$ and $H$ on $g$ and $h$, respectively. If

$$\sum\{G(\mathbf{a})H(\mathbf{b}) \mid (\mathbf{a}^{\uparrow(g \cup h)}) \cap (\mathbf{b}^{\uparrow(g \cup h)}) \neq \emptyset\} \neq 0 \qquad (1)$$

then their combination, denoted by $G \oplus H$, is the superpotential on $g \cup h$ given by

$$(G \oplus H)(\mathbf{c}) = \sum\{G(\mathbf{a})H(\mathbf{b}) \mid (\mathbf{a}^{\uparrow(g \cup h)}) \cap (\mathbf{b}^{\uparrow(g \cup h)}) = \mathbf{c}\} \qquad (2)$$

for all $\mathbf{c} \subseteq \mathcal{W}_{g \cup h}$.
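The projection and cylinder-set extension that appear in (1) and (2) can be sketched in Python as follows. This is a minimal sketch under assumptions made here, not the paper's implementation: configurations are tuples ordered by a list of variable names, `frames` maps each variable to its values, and the names `project` and `extend` are hypothetical. The string "~w" simply stands for the negated value.

```python
from itertools import product

def project(subset, g, h):
    """Project a set of configurations of the variables g onto h (h within g)."""
    idx = [g.index(v) for v in h]
    return frozenset(tuple(x[i] for i in idx) for x in subset)

def extend(subset, h, g, frames):
    """Cylinder-set extension of a set of configurations of h to the larger set g."""
    out = set()
    for x in subset:
        known = dict(zip(h, x))
        free = [v for v in g if v not in known]
        # Pair the known values with every combination of the free variables.
        for combo in product(*(frames[v] for v in free)):
            full = {**known, **dict(zip(free, combo))}
            out.add(tuple(full[v] for v in g))
    return frozenset(out)

frames = {"W": ["w", "~w"], "X": ["x", "~x"], "Y": ["y", "~y"]}
a = frozenset({("w", "x")})                       # a subset of the frame for (W, X)
a_up = extend(a, ["W", "X"], ["W", "X", "Y"], frames)
a_back = project(a_up, ["W", "X", "Y"], ["W", "X"])
```

Extending and then projecting back returns the original subset, which is the property that lets two superpotentials on different variable sets be compared on the common frame for $g \cup h$.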
If $\sum\{G(\mathbf{a})H(\mathbf{b}) \mid (\mathbf{a}^{\uparrow(g \cup h)}) \cap (\mathbf{b}^{\uparrow(g \cup h)}) \neq \emptyset\} = 0$, then we say that $G$ and $H$ are not combinable.

Intuitively, if the bodies of evidence on which $G$ and $H$ are based are independent, then $G \oplus H$ is supposed to represent the result of pooling these two bodies of evidence. Note that condition (1) ensures that $G \oplus H$ defined in (2) is a superpotential. If condition (1) does not hold, this means that the two bodies of evidence corresponding to $G$ and $H$ contradict each other completely and it is not possible to combine such evidence.

MARGINALIZATION

Suppose $G$ is a superpotential for $g$, and suppose $h \subseteq g$. Then the marginal of $G$ for $h$ is the superpotential $G^{\downarrow h}$ for $h$ defined as follows:

$$G^{\downarrow h}(\mathbf{a}) = \sum\{G(\mathbf{b}) \mid \mathbf{b} \subseteq \mathcal{W}_g \text{ such that } \mathbf{b}^{\downarrow h} = \mathbf{a}\}$$

for all subsets $\mathbf{a}$ of $\mathcal{W}_h$.

SOLUTION

There are several definitions of solution possible for evidential systems. For nonmonotonic reasoning, we will define a solution for $m$ to be a configuration whose plausibility is positive. Formally, suppose $m$ is a bpa on $h$. Suppose Pl is the plausibility function on $h$ corresponding to $m$. Then we say that $x \in \mathcal{W}_h$ is a solution for $m$ if $\mathrm{Pl}(\{x\}) > 0$.

A Truth Maintenance System

Assume a propositional language consisting of propositional symbols, the logical connectives $\wedge$, $\vee$, $\neg$, $\rightarrow$, $\leftrightarrow$, formulas, and the usual standard entailment relation $\models$. If $S$ is a set of formulas and $w$ is a formula, then $S \models w$ if every assignment of truth values to the propositional symbols of the language that makes each formula of $S$ true also makes $w$ true.
A literal is a propositional symbol or the negation of a propositional symbol. A clause is a finite disjunction of literals with no literals repeated whose truth value is true. A premise is a literal whose truth value is true. A categorical justification is a conditional whose truth value is true. Note that a categorical justification can be represented as a clause. For example, the conditional $A = a \rightarrow B = b$ can be represented as a clause as follows: $\neg(A = a) \vee (B = b)$. An assumption is a literal whose truth value is assumed to be true in the absence of a contradiction. A noncategorical justification is a conditional whose truth value is assumed to be true in the absence of a contradiction. A nogood is a clause whose truth value is false.

A knowledge base is a collection of justifications (rules), premises (observations), and assumptions (uncertain judgments). Justifications may be categorical or noncategorical. Categorical justifications may describe logical relations between propositional symbols. Noncategorical justifications may describe facts that are usually but not always true.

The functions of a truth maintenance system (TMS) are as follows:

1. The use of noncategorical justifications and assumptions or defaults is permitted.
2. In the absence of a contradiction, noncategorical justifications and assumptions are assumed to be true.
3. If there is a contradiction in the knowledge base, then some noncategorical justifications or assumptions or both need to be retracted so that consistency is restored. When an assumption or a noncategorical justification is retracted, all inferences made using these assumptions and noncategorical justifications must also be retracted.
4. All inferences that are consistent with the knowledge in the knowledge base should be displayed to the user so that the user is aware of the implications of the knowledge.

USING AN EVIDENTIAL LANGUAGE AS A TMS

We will now outline a correspondence between the concepts in a TMS and concepts in an evidential system (ES).

A literal in a TMS is represented in an ES by a variable and one of its values. Thus $X = x$ is an ES representation of the literal $x$ where $x$ belongs to $\mathcal{W}_X$, the set of possible values of variable $X$. For example, suppose the proposition TWEETY IS A BIRD is represented in a TMS as a literal. In the ES, this could be represented by a variable BIRD with two possible values yes and no.
Then the literal TWEETY IS A BIRD corresponds to BIRD = yes in an ES.

A premise is a literal whose truth value is true. In an ES, a premise is represented as a categorical belief function. For example, the premise $X = x$ is represented by a belief function on $\mathcal{W}_X$ given by $m(\{x\}) = 1$.

An assumption in a TMS is a literal whose truth value is set to true in the absence of a contradiction in the knowledge base. In the ES, an assumption $X = x$ is represented by a noncategorical belief function Bel (with basic probability assignment $m$) on $\mathcal{W}_X$ such that $m(\{x\}) = p$ and $m(\mathcal{W}_X) = 1 - p$ where $0 < p < 1$. The actual value of $p$ will depend on the particular assumption; $p$ can be interpreted to be the prior degree of belief in the assumption.

A justification is a conditional, $x_1 \wedge x_2 \wedge \cdots \wedge x_n \rightarrow y$, where $x_1, x_2, \ldots, x_n, y$ are literals. In an ES, a categorical justification $x_1 \wedge x_2 \wedge \cdots \wedge x_n \rightarrow y$ is represented as a categorical belief function on the frame $\mathcal{W}_h$, where $h = \{X_1, X_2, \ldots, X_n, Y\}$. For example, consider two variables $X$ and $Y$ with frames $\mathcal{W}_X = \{x, \neg x\}$ and $\mathcal{W}_Y = \{y, \neg y\}$.
Then the categorical justification $x \rightarrow y$ is represented in the ES as a categorical belief function on $\mathcal{W}_{\{X,Y\}}$ given by

$$m(\{(x, y), (\neg x, y), (\neg x, \neg y)\}) = 1$$

Noncategorical justifications are represented in the ES as noncategorical belief functions. There are several ways in which this can be done. The most appropriate way will depend on the nature of the particular justification.

The first type of belief-function representation of a noncategorical justification is called exceptional. The exceptional representation of a noncategorical justification is implied by McCarthy's [66] formulation. For example, consider the noncategorical justification MOST BIRDS FLY. This can be represented in a TMS by a categorical justification and an assumption as follows:

BIRD = yes $\wedge$ EXCEPTIONAL_BIRD = no $\rightarrow$ FLY = yes
Assume EXCEPTIONAL_BIRD = no

Here EXCEPTIONAL_BIRD = no is a literal that captures all the conditions under which birds fly. Let $B = b$, $E = \neg e$, and $F = f$ denote the ES representation of the literals BIRD = yes, EXCEPTIONAL_BIRD = no, and FLY = yes. Then the justification MOST BIRDS FLY can be represented in an ES by two independent basic probability assignment functions, $m_1$ on $\mathcal{W}_{\{B,E,F\}}$ and $m_2$ on $\mathcal{W}_E$, as follows:

$$m_1(\mathcal{W}_{\{B,E,F\}} - \{(b, \neg e, \neg f)\}) = 1$$
$$m_2(\{\neg e\}) = p, \qquad m_2(\mathcal{W}_E) = 1 - p$$

where $0 < p < 1$.
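The combination of these two bpa functions by Dempster's rule can be computed mechanically. The following Python sketch is illustrative only and rests on assumptions made here: the value $p = 0.8$ is an arbitrary choice, the strings "~e" and "~f" stand for the negated values, and the names `extend` and `combine` are hypothetical. Products whose extended focal sets do not intersect are dropped, and no normalization is applied because this example has no conflict.

```python
from itertools import product

VARS = ("B", "E", "F")
FRAMES = {"B": ("b", "~b"), "E": ("e", "~e"), "F": ("f", "~f")}
JOINT = frozenset(product(*(FRAMES[v] for v in VARS)))

def extend(subset, h):
    """Cylinder-set extension of configurations of the variables h to (B, E, F)."""
    idx = [VARS.index(v) for v in h]
    return frozenset(x for x in JOINT if tuple(x[i] for i in idx) in subset)

def combine(G, g, H, h):
    """Dempster's rule: multiply masses and intersect the extended focal sets."""
    out = {}
    for a, ga in G.items():
        for b, hb in H.items():
            c = extend(a, g) & extend(b, h)
            if c:  # empty intersections carry only conflict and are dropped
                out[c] = out.get(c, 0.0) + ga * hb
    if sum(out.values()) == 0:
        raise ValueError("not combinable")
    return out

p = 0.8  # illustrative degree of belief in the assumption
m1 = {JOINT - {("b", "~e", "~f")}: 1.0}                          # on (B, E, F)
m2 = {frozenset({("~e",)}): p, frozenset({("e",), ("~e",)}): 1 - p}  # on (E,)
m12 = combine(m1, VARS, m2, ("E",))
```

With these inputs `m12` has two focal sets: the three configurations in which the bird is not exceptional and the rule is respected, with mass $p$, and the whole frame minus the single excluded configuration, with mass $1 - p$.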
Note that $m_1 \oplus m_2$ is a basic probability assignment on $\mathcal{W}_{\{B,E,F\}}$ given by

$$(m_1 \oplus m_2)(\{(b, \neg e, f), (\neg b, \neg e, f), (\neg b, \neg e, \neg f)\}) = p$$
$$(m_1 \oplus m_2)(\mathcal{W}_{\{B,E,F\}} - \{(b, \neg e, \neg f)\}) = 1 - p$$

$m_1 \oplus m_2$ is then the exceptional representation in an ES of the justification MOST BIRDS FLY.

The second type of belief-function representation of a noncategorical justification is called associational. Consider again the justification MOST BIRDS FLY. We can interpret this to mean that birds are associated with flying with a certain degree of belief. This association may just go one way, that is, we may not necessarily associate all flying objects with birds. Interpreted in this way, we can represent this justification by a basic probability assignment function $m_3$ on $\mathcal{W}_{\{B,F\}}$ as follows:

$$m_3(\{(b, f), (\neg b, f), (\neg b, \neg f)\}) = p, \qquad m_3(\mathcal{W}_{\{B,F\}}) = 1 - p$$

where $0 < p < 1$.

Obviously, exceptional representations of noncategorical justifications have greater expressive power than associational representations. In the bird example, if the basic probability assignment function $m_1 \oplus m_2$ is marginalized by deleting the $E$ variable, then we obtain precisely the associational representation $m_3$, that is, $(m_1 \oplus m_2)^{\downarrow\{B,F\}} = m_3$. However, this expressive power comes at a computational cost since more variables are required in the exceptional representation than in the associational representation.

Consider a knowledge base represented by a collection of bpa functions $\{m_i \mid i = 1, \ldots, n\}$ representing premises, rules, and assumptions. Suppose $m_j$ is an assumption $X = x$. We shall say that the assumption $m_j$ is retracted by the knowledge base $\{m_i \mid i = 1, \ldots, n\}$ if $\mathrm{Pl}^{\downarrow\{X\}}(\{x\}) = 0$, where $\mathrm{Pl}^{\downarrow\{X\}}$ is the plausibility function corresponding to $(\oplus\{m_i \mid i = 1, \ldots, n\})^{\downarrow\{X\}}$.
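Both the marginalization identity and the retraction test can be checked on the bird example with a short Python sketch. It is a sketch under assumptions made here, not the paper's implementation: $p = 0.8$ is an arbitrary value, "~e" and "~f" stand for negated values, the helper names are hypothetical, and the two premises BIRD = yes and FLY = no are added purely for illustration.

```python
from itertools import product

VARS = ("B", "E", "F")
FRAMES = {"B": ("b", "~b"), "E": ("e", "~e"), "F": ("f", "~f")}
JOINT = frozenset(product(*(FRAMES[v] for v in VARS)))

def cylinder(var, value):
    """All joint configurations in which `var` takes `value`."""
    i = VARS.index(var)
    return frozenset(x for x in JOINT if x[i] == value)

def combine(G, H):
    """Dempster's rule on the joint frame, followed by normalization."""
    out = {}
    for (a, ga), (b, hb) in product(G.items(), H.items()):
        c = a & b
        if c:
            out[c] = out.get(c, 0.0) + ga * hb
    total = sum(out.values())
    if total == 0:
        raise ValueError("not combinable")
    return {c: v / total for c, v in out.items()}

def marginalize(G, keep):
    """Marginal of G for the variables in `keep`: add masses of equal projections."""
    idx = [VARS.index(v) for v in keep]
    out = {}
    for b, mass in G.items():
        a = frozenset(tuple(x[i] for i in idx) for x in b)
        out[a] = out.get(a, 0.0) + mass
    return out

def plausibility(m, a):
    """Pl(a): total mass of the focal sets that intersect a."""
    return sum(mass for b, mass in m.items() if b & a)

p = 0.8
excluded = {("b", "~e", "~f")}
# The exceptional representation m1 (+) m2 as stated in the text.
m12 = {cylinder("E", "~e") - excluded: p, JOINT - excluded: 1 - p}

# Deleting E should give the associational representation m3.
m3 = marginalize(m12, ("B", "F"))

# Illustrative premises: a bird that does not fly.
joint = combine(combine(m12, {cylinder("B", "b"): 1.0}), {cylinder("F", "~f"): 1.0})
pl_not_exceptional = plausibility(marginalize(joint, ("E",)), frozenset({("~e",)}))
```

Under these premises the plausibility of EXCEPTIONAL_BIRD = no comes out as zero, so in the terminology above the assumption is retracted by the knowledge base.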
We shall say that the assumption $m_j$ is confirmed by the knowledge base $\{m_i \mid i = 1, \ldots, n\}$ if $m^{\downarrow\{X\}}(\{x\}) = 1$, where $m = \oplus\{m_i \mid i = 1, \ldots, n\}$.

An Example

Consider the following knowledge base:

Rule 1. Most Republicans (at least 80%) are not pacifists.
Rule 2. Most Quakers (at least 90%) are pacifists.

First we observe that Nixon is a Republican. Then we observe that Nixon is also a Quaker. We would like to consult our knowledge base to find out whether Nixon is a pacifist or not. Next we will add the premise that Nixon is not a pacifist and see how the evidential system reconciles this premise with rule 2.

One representation of this knowledge base is as follows. Let $R = r$, $Q = q$, $P = p$ be three variables and their respective configurations representing the propositions X is a Republican, X is a Quaker, and X is a pacifist, respectively. Furthermore, let $ER = er$ and $EQ = eq$ be two more variables and their respective configurations representing the propositions X is an exceptional Republican and X is an exceptional Quaker, respectively.

We will represent rule 1 with categorical rule 1 and assumption 1 as follows.

CATEGORICAL RULE 1. If X is a Republican and X is not an exceptional Republican, then X is not a pacifist.
ASSUMPTION 1. X is not an exceptional Republican.

The bpa function representation of categorical rule 1 is as follows:

$$m_1(\mathcal{W}_{\{R,ER,P\}} - \{(r, \neg er, p)\}) = 1$$

The bpa function representation of assumption 1 is as follows:

$$m_2(\{\neg er\}) = 0.8, \qquad m_2(\mathcal{W}_{ER}) = 0.2$$

We will represent rule 2 with categorical rule 2 and assumption 2 as follows:

CATEGORICAL RULE 2. If X is a Quaker and X is not an exceptional Quaker, then X is a pacifist.
ASSUMPTION 2. X is not an exceptional Quaker.

The bpa function representation of categorical rule 2 is as follows:

$$m_3(\mathcal{W}_{\{Q,EQ,P\}} - \{(q, \neg eq, \neg p)\}) = 1$$

The bpa function representation of assumption 2 is as follows:

$$m_4(\{\neg eq\}) = 0.9, \qquad m_4(\mathcal{W}_{EQ}) = 0.1$$

If we enter these four bpa functions in the evidential system, the resulting evidential network is as shown in Figure 5. As before, variable vertices are shown as circles and valuation vertices are shown as squares. In addition to displaying the set of all solutions for the marginal of the joint valuation, the marginal bpa function is also displayed. If $\{x, \neg x\}$ is the frame for variable $X$, then the marginal of the joint valuation for $X$ is shown as a vector $(m^{\downarrow\{X\}}(\{x\}),\ m^{\downarrow\{X\}}(\{\neg x\}),\ m^{\downarrow\{X\}}(\{x, \neg x\}))$, where $m = \oplus\{m_i \mid i = 1, \ldots, n\}$.

Figure 5. The evidential network with two categorical rules and two assumptions.

Suppose we now enter the premise that Nixon is a Republican. This is represented as a bpa function as follows:

$$m_5(\{r\}) = 1$$

The evidential system accepts this bpa function with the results as shown in Figure 6. Note that the belief in the proposition that Nixon is not a pacifist has increased from 0 to 0.8 and the belief in the proposition Nixon is not a Quaker has increased from 0 to 0.72.

Figure 6. The evidential network with the premise that Nixon is a Republican.

Suppose we now enter the premise that Nixon is a Quaker. This is represented by a bpa function as follows:

$$m_6(\{q\}) = 1$$

The evidential system accepts this bpa function with the results shown in Figure 7. Note that as per the ES, Nixon could either be a pacifist or not. The plausibility of Nixon being a pacifist (0.71) is higher than the plausibility that Nixon is not a pacifist (0.35). This is because Quakers have higher belief (0.90) of being pacifists than Republicans have of not being pacifists (0.80).

Figure 7. The evidential network with the premise that Nixon is a Quaker.

Figure 8. The evidential network with the premise that Nixon is not a pacifist.

Now suppose we enter the premise that Nixon is not a pacifist.
This is represented by a bpa function as follows:

$$m_7(\{\neg p\}) = 1$$

The ES accepts this bpa function, and the results are displayed in Figure 8. Note that the assumption that Nixon is not an exceptional Quaker has been retracted by the evidence!

SUMMARY AND CONCLUSIONS

The main objective of this article is to introduce a new language for building knowledge-based systems as an alternative to rule-based languages. Whereas rule-based languages use rules as a knowledge representation device and modus ponens as an operation for making inferences, our language uses proper valuations as a knowledge representation device and three operations (combination, marginalization, and solution) for making inferences. Combination corresponds to aggregation of knowledge, marginalization corresponds to crystallization of knowledge, and solution is a decoding mechanism that maps knowledge from the space of valuations to the space of configurations. Conceptually, the language combines all valuations, finds the marginal of the joint valuation for each variable, and then finds the solution for each marginal.

Like rule-based languages, our valuation-based language retains the modularity feature. Each valuation represents a distinct modular chunk of knowledge. If the combination operator is commutative and associative, then, like rule-based languages, valuation-based languages are nonprocedural. These desirable features of rule-based languages are retained. Unlike rule-based languages, our valuation-based language automatically maintains consistency in the knowledge base, caches and displays relevant inferences, reasons nonmonotonically, and permits coherent management of uncertainty.

A natural question is, what is the computational power of valuation-based languages? Anderson [67] has formally shown that it is possible to imagine coding any given Turing machine using a pure production system.
We suspect that valuation-based languages have the same computational power, but we do not have a proof.

ACKNOWLEDGMENT

This work was supported in part by the National Science Foundation under grant IRI-8610293 and a Research Opportunities in Auditing grant 87-135 from the Peat Marwick Foundation. The foundation on which this paper rests is the result of research done jointly with Glenn Shafer over the last three years. I owe a lot to Glenn. I would also like to acknowledge the influence of the AI group in the Business School. Finally, this paper has benefitted from useful comments from Bruce D'Ambrosio and three anonymous referees. This paper is a revision of Shenoy [68].

References

1. Shenoy, P. P., and Shafer, G., An axiomatic framework for Bayesian and belief-function propagation, Proc. 4th Workshop on Uncertainty in AI, Minneapolis, Minn., 307-314, 1988.
2. Shenoy, P. P., and Shafer, G., Axioms for probability and belief-function propagation, Working Paper No. 209, School of Business, University of Kansas, Lawrence, Kan., 1988.
3. Shenoy, P. P., and Shafer, G., Constraint propagation, Working Paper No. 208, School of Business, University of Kansas, Lawrence, Kan., 1988.
4. Shenoy, P. P., and Shafer, G., Axioms for discrete optimization using local computation, School of Business Working Paper No. 207, University of Kansas, Lawrence, Kan., 1988.
5. Brownston, L. S., Farrell, R. G., Kant, E., and Martin, N., Programming Expert Systems in OPS5: An Introduction to Rule-Based Programming, Addison-Wesley, Reading, Mass., 1985.
6. Davis, R., and King, J. J., The origin of rule-based systems in AI, in Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project (B. G. Buchanan and E. H. Shortliffe, Eds.), Addison-Wesley, Reading, Mass., 20-52, 1984.
7. Shafer, G., Shenoy, P. P., and Srivastava, R. P., AUDITOR'S ASSISTANT: a knowledge engineering tool for audit decisions, Working Paper No. 197, School of Business, University of Kansas, Lawrence, Kan., 1988.
8. Pearl, J., Fusion, propagation and structuring in belief networks, AI 29, 241-288, 1986.
9. Pearl, J., Networks of Belief: Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, Palo Alto, Cal., 1988.
10. Lauritzen, S. L., and Spiegelhalter, D. J., Local computations with probabilities on graphical structures and their application to expert systems (with discussion), J. Roy. Stat. Soc. Ser. B 50(2), 157-224, 1988.
11. Heckerman, D. E., and Horvitz, E. J., On the expressiveness of rule-based systems for reasoning with uncertainty, Proc. 6th National Conference on AI (AAAI-87), Seattle, Wash., 1, 121-126, 1987.
12. Shafer, G., and Shenoy, P. P., Probability propagation, Working Paper No. 200, School of Business, University of Kansas, Lawrence, Kan., 1988. To appear in Proc. 2nd International Workshop on AI and Statistics, Fort Lauderdale, Fla., 1989.
13. Shafer, G., and Shenoy, P. P., Local computation in hypertrees, Working Paper No. 201, School of Business, University of Kansas, Lawrence, Kan., 1988.
14. Seidel, R., A new method for solving constraint satisfaction problems, Proc. 7th International Joint Conference on AI (IJCAI-81), Vancouver, B.C., Canada, 1, 338-342, 1981.
15. Dechter, R., and Pearl, J., Tree-clustering schemes for constraint processing, Proc. 7th National Conference on AI (AAAI-88), St. Paul, Minn., 1, 150-154, 1988.
16. Bertele, U., and Brioschi, F., Nonserial Dynamic Programming, Academic, New York, 1972.
17. Rose, D. J., A graph-theoretic study of the numerical solution of sparse positive definite systems of linear equations, in Graph Theory and Computing (R. C. Read, Ed.), Academic, New York, 183-217, 1973.
18. Spohn, W., Ordinal conditional functions: a dynamic theory of epistemic states, in Causation in Decision, Belief Change, and Statistics, Vol. 2 (W. L. Harper and B. Skyrms, Eds.), D. Reidel, Dordrecht, Holland, 105-134, 1988.
19. Hunter, D., Parallel belief revision, Proc. 4th Workshop on Uncertainty in AI, Minneapolis, Minn., 170-176, 1988.
20. Malvestuto, F. M., Decomposing complex contingency tables to reduce storage requirements, Proc. 1986 Conference on Computational Statistics, 66-71, 1986.
21. Beeri, C., Fagin, R., Maier, D., and Yannakakis, M., On the desirability of acyclic database schemes, J. ACM 30(3), 479-513, 1983.
22. Dempster, A. P., Construction and local computation aspects of network belief functions, Research Report S-125, Department of Statistics, Harvard University, Cambridge, Mass., 1988.
23. Meinhold, R. J., and Singpurwalla, N. D., Understanding the Kalman filter, Am. Stat. 37, 241-288, 1982.
24. Adams, E., Probability and the logic of conditionals, in Aspects of Inductive Logic (J. Hintikka and P. Suppes, Eds.), North-Holland, New York, 1986.
25. Suwa, M., Scott, A. C., and Shortliffe, E. H., An approach to verifying completeness and consistency in a rule-based expert system, AI Mag. 3(3), 16-21, 1982.
26. Nguyen, T. A., Perkins, W. A., Laffey, T. J., and Pecora, D., Checking an expert system's knowledge base for consistency and completeness, Proc. 9th International Joint Conference on AI (IJCAI-85), Los Angeles, Cal., 1, 375-378, 1985.
27. Pearl, J., Deciding consistency in inheritance networks, Tech. Report No. 870053 (R96), Cognitive Systems Laboratory, University of California at Los Angeles, Cal., 1987.
28. Touretzky, D. S., The Mathematics of Inheritance Systems, Morgan Kaufmann, Los Altos, Cal., 1986.
29. Ginsberg, A., Knowledge-base reduction: a new approach to checking knowledge bases for inconsistency and redundancy, Proc. 7th National Conference on AI (AAAI-88), St. Paul, Minn., 1, 585-589, 1988.
30. McCarthy, J., Circumscription--a form of non-monotonic reasoning, AI 13, 27-39, 1980.
31. McCarthy, J., and Hayes, P. J., Some philosophical problems from the standpoint of artificial intelligence, in Machine Intelligence, Vol. 4 (B. Meltzer and D. Michie, Eds.), Edinburgh University Press, 463-502, 1969.
32. McDermott, D., and Doyle, J., Non-monotonic logic I, AI 13, 41-72, 1980.
33. Moore, R. C., Semantical consideration on nonmonotonic logic, AI 25, 75-94, 1985.
34. Reiter, R., A logic for default reasoning, AI 13, 81-132, 1980.
35. Shafer, G., Probability judgment in artificial intelligence and expert systems, Stat. Sci. 2(1), 3-44, 1987.
36. Heckerman, D. E., and Horvitz, E. J., The myth of modularity in rule-based systems for reasoning with uncertainty, in Uncertainty in Artificial Intelligence, Vol. 2 (J. F. Lemmer and L. N. Kanal, Eds.), North-Holland, New York, 23-34, 1988.
37. Shortliffe, E., and Buchanan, B. G., A model of inexact reasoning in medicine, Math. Biosci. 23, 351-379, 1975.
38. Duda, R., Hart, P., and Nilsson, N., Subjective Bayesian methods for rule-based inference systems, in Readings in Artificial Intelligence (B. L. Webber and N. J. Nilsson, Eds.), Tioga, Palo Alto, Cal., 192-200, 1981.
39. Doyle, J., A truth maintenance system, AI 12(3), 231-272, 1979.
40. de Kleer, J., An assumption-based TMS, AI 28, 127-162, 1986.
41. Reiter, R., and de Kleer, J., Foundations of assumption-based truth maintenance systems: preliminary report, Proc. 6th National Conference on AI (AAAI-87), Seattle, Wash., 1, 183-188, 1987.
42. Laskey, K. B., and Lehner, P. E., Belief maintenance: an integrated approach to uncertainty management, Proc. 7th National Conference on AI (AAAI-88), St. Paul, Minn., 1, 210-214, 1988.
43. D'Ambrosio, B., A hybrid approach to reasoning under uncertainty, Int. J. Approximate Reasoning 2(1), 29-46, 1988.
44. Dempster, A. P., New methods for reasoning toward posterior distributions based on sample data, Ann. Math. Stat. 37, 355-374, 1966.
45. Dempster, A. P., Upper and lower probabilities induced by a multivalued mapping, Ann. Math. Stat. 38, 325-339, 1967.
46. Pearl, J., Evidential reasoning using stochastic simulation of causal models, AI 32, 245-257, 1987.
47. Kirkpatrick, S., Gelatt, C. D., Jr., and Vecchi, M. P., Optimization by simulated annealing, Science 220, 671-680, 1983.
48. Geman, S., and Geman, D., Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images, IEEE Trans. PAMI 6, 721-741, 1984.
49. Etherington, D. W., More on inheritance hierarchies with exceptions: default theories and inferential distance, Proc. 6th National Conference on AI (AAAI-87), Seattle, Wash., 1, 352-357, 1987.
50. Shafer, G., and Logan, R., Implementing Dempster's rule for hierarchical evidence, AI 33, 271-298, 1987.
51. Shenoy, P. P., and Shafer, G., Propagating belief functions using local computations, IEEE Expert 1(3), 43-52, 1986.
52. Shenoy, P. P., Shafer, G., and Mellouli, K., Propagation of belief functions: a distributed approach, Proc. 2nd Workshop on Uncertainty in AI, Philadelphia, Penn., 249-260, 1986. Also in Uncertainty in Artificial Intelligence, Vol. 2 (J. F. Lemmer and L. N. Kanal, Eds.), North-Holland, New York, 325-335, 1988.
53. Kong, A., Multivariate belief functions and graphical models, Ph.D. Thesis, Department of Statistics, Harvard University, Cambridge, Mass., 1986.
54. Kong, A., A belief function generalization of Gibbs ensembles, Tech. Report No. 239, Department of Statistics, University of Chicago, Chicago, Ill., 1988.
55. Dempster, A. P., and Kong, A., Uncertain evidence and artificial analysis, Research Report S-108, Department of Statistics, Harvard University, Cambridge, Mass., 1986.
56. Shafer, G., Shenoy, P. P., and Mellouli, K., Propagating belief functions in qualitative Markov trees, Int. J. Approximate Reasoning 1(4), 349-400, 1987.
57. Mellouli, K., On the propagation of beliefs in networks using the Dempster-Shafer theory of evidence, Ph.D. Thesis, School of Business, University of Kansas, Lawrence, Kan., 1987.
58. Almond, R., Fusion and propagation in graphical belief models, Research Report S-121, Department of Statistics, Harvard University, Cambridge, Mass., 1988.
59. Zarley, D. K., An evidential reasoning system, Working Paper No. 206, School of Business, University of Kansas, Lawrence, Kan., 1988.
60. Zarley, D. K., Hsia, Y. T., and Shafer, G., Evidential reasoning using DELIEF, Proc. 7th National Conference on AI (AAAI-88), Minneapolis, Minn., 1, 205-209, 1988.
61. Baldwin, J. F., Evidential support logic programming, Fuzzy Sets Syst. 24, 1-26, 1987.
62. Ginsberg, M. L., Non-monotonic reasoning using Dempster's rule, Proc. 4th National Conference on AI (AAAI-84), Austin, Tex., 126-129, 1984.
63. Rich, E., Default reasoning as likelihood reasoning, Proc. 3rd National Conference on AI (AAAI-83), Washington, D.C., 348-351, 1983.
64. Grosof, B. N., Non-monotonicity in probabilistic reasoning, in Uncertainty in Artificial Intelligence, Vol. 2 (J. F. Lemmer and L. N. Kanal, Eds.), North-Holland, New York, 237-249, 1988.
65. Shafer, G., A Mathematical Theory of Evidence, Princeton Univ. Press, Princeton, N.J., 1976.
66. McCarthy, J., Applications of circumscription to formalizing common-sense knowledge, AI 28(1), 89-116, 1986.
67. Anderson, J., Language, Memory and Thought, Erlbaum, Hillsdale, N.J., 1976.
68. Shenoy, P. P., Valuation systems: a language for knowledge-based systems, Working Paper No. 203, School of Business, University of Kansas, Lawrence, Kan., 1988.