PII: 0888-613X(87)90008-9


A Reduction Methodology for a Differential Diagnosis Expert System
Akram Salah 

Department of Computer Science, Cairo University

Kevin D. Reilly 
Departments of Computer and Information Sciences and Biostatistics
and Biomathematics, University of Alabama at Birmingham

ABSTRACT

The simple production rule representation is generalized by adding programs to a
management system that manipulate rules in a rule-based system. By adapting this
methodology, a single generalized rule can represent a group of simple ones. Then
programs are employed to satisfy the general rule in a partial way while recursively
reducing a decision problem into smaller ones of the same nature until a decision is
made. It is shown that the reduction method is more efficient than the simple rule
approach and that it minimizes the number of rules used to express a problem. The
concept of using a management program to manipulate a set of rules is emphasized
through solving a problem in a differential diagnosis expert system. A comparison
of the numbers of rules employed to express a problem is made to show the
advantages of the reduction methodology over the simple rule representation.

KEYWORDS: production systems, expert systems, reduction algorithm,
Prolog, decision tables, relational databases, medical diagnosis

INTRODUCTION

Much recent research focuses on computer systems that facilitate methodologies
simulating experts' knowledge-based decision-making strategies (see, for
example, [1-4]). The most popular current way to create such systems is to 
incorporate large amounts of "domain-dependent" knowledge acquired from 
experts. Because experts often express their decision-making processes in sets of 

Address correspondence to Kevin D. Reilly, Departments of Computer and Information
Sciences and Biostatistics and Biomathematics, University of Alabama at Birmingham,
Birmingham, Alabama 35294.

International Journal of Approximate Reasoning 1987; 1:131-139
© 1987 Elsevier Science Publishing Co., Inc.
52 Vanderbilt Ave., New York, NY 10017 0888-613X/87/$3.50



132 Akram Salah and Kevin D. Reilly 

if-then rules, rule bases are devised (Hayes-Roth [5]). Such systems are usually
called knowledge-intensive rule-based systems.

A knowledge-intensive rule-based system consists mainly of a set of rules
describing different decision situations in the problem under question, together
with actions to be taken in each case. A rule in such a system is represented as a
structure typically in the form

$$A_1 \;\&\; A_2 \;\&\; \cdots \;\&\; A_n \rightarrow C$$

Such a rule is interpreted as follows: if $A_1$ and $A_2$ and ... and $A_n$ are true
simultaneously, then consequently $C$ is true. The left side of a rule contains a
conjunction of atoms called conditions and the right side is called a conclusion
or a consequence (Salah and Yang [4]).

If an expert expresses his or her decision-making process as a collection of
simple if-then rules, each of them can be represented directly as stated above. A
problem arises if an expert expresses rules in a less explicit way or in some form
such as a function over a set of rules. Then it is the responsibility of the rule
acquirer, whether person or machine, to decide upon a representation or to
provide some kind of control on processing such rules.

In this article we show an approach that can facilitate a generalization of
simple rule representation. This article is part of a larger study that has borrowed
concepts from relational database systems (RDBSs), such that rules are stored in
a rule base and then a management system retrieves the rule under question. The
system exploits a number of features studied previously (Reilly et al. [6], Yang
[7], Reilly et al. [8]), where key concerns have been incorporation into a Prolog
framework (Kowalski [9], Clocksin and Mellish [10]) of knowledge representations
and RDBSs (Bruynooghe [11]). It is shown that the approach increases the
efficiency of a rule-based system.

THE PROBLEM 

The problem arises in a differential diagnosis expert system where conditions
are either symptoms, observations, or test results gathered by a physician to be
used to derive a conclusion, which in this case is a disease or a class of diseases.
Rules used to derive such a conclusion would typically be in the form

$$O_1 \;\&\; O_2 \;\&\; \cdots \;\&\; O_n \rightarrow D$$

where each $O_i$ for $1 \le i \le n$ is an observation, and $D$ is a disease or class of
diseases. (From here on, we refer to any atom on the left side of a rule as an
observation (an observation can be a test result or a symptom) and on the right
side as a disease. Thus, this rule is read as follows: if all observations 1 through
n exist simultaneously in a patient, then this case can be diagnosed as D.)

Our problem arises when a group of rules is expressed within a single if-then


Reduction Algorithm in Expert Diagnosis 133 

statement. Our particular concern is this: we are given a set of n observations and a
conclusion D such that if any k-subset of the n observations holds, k ≤ n, then D
can be concluded. That is,

$$\text{any } k \text{ observations of } \{O_1 \;\&\; O_2 \;\&\; \cdots \;\&\; O_n\} \rightarrow D$$

A simple example of such a case is the common cold, where there are about 12
observations and any 3 of them (existing simultaneously) establish the diagnosis.
Examples of a similar nature occur in rheumatic diseases (more discussion is
provided below).

Actually, this is a generalization of a rule application. The special case in
which k = n defines the "normal" rule structure, that is, the case in which all the
conditions have to be satisfied to derive the consequence. A more formal view of
this problem is as follows. In a rule system there is a set of conditions for each
decision situation, each condition having a domain of values. The left side of any
rule represents an element in the Cartesian product of the domains of these
conditions. The case here may be conceptualized as having one condition with
one domain of observations, say, O with length n, such that if any k-subset of O
with k < n occurs simultaneously, then the diagnosis is established. This
expresses a set of rules, each one having a condition part $c \in O^k$, where k < n,
and the same consequence D, which is the disease under consideration.

To represent this situation within an expert system, we examine two
alternatives to set the stage for subsequent comparison.

1. The single if-then statement is re-expressed as a set of simple rules. Each
such simple rule contains k observations on its left side and D on its right
side. Needless to say, the resulting number of rules consumes much
memory space, complicates the search when the system is applied, and
reduces the efficiency of the system.

2. The production system is extended such that if the "any k out of n"
formulation is expressed, it can be handled automatically.
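The cost of the first alternative can be made concrete with a minimal sketch. It expands an "any k out of n" statement into one simple rule per k-subset, using the common cold figures quoted earlier (about 12 observations, any 3). The observation names `O1` ... `O12` and the helpers `expand_k_of_n` and `fires` are hypothetical, introduced only for illustration; they are not part of the system described in this article.

```python
from itertools import combinations

def expand_k_of_n(observations, k, disease):
    """Alternative 1: build one simple rule (conditions, conclusion) per k-subset."""
    return [(frozenset(subset), disease) for subset in combinations(observations, k)]

def fires(rule, findings):
    """A simple rule fires when all of its conditions are among the findings."""
    conditions, _conclusion = rule
    return conditions <= findings

# Placeholder observation names (assumption); the text gives only the counts.
observations = [f"O{i}" for i in range(1, 13)]
rules = expand_k_of_n(observations, 3, "common cold")
print(len(rules))  # number of simple rules needed for the 3/12 statement

findings = frozenset({"O2", "O5", "O11"})
print(any(fires(rule, findings) for rule in rules))
```

Every consultation under this alternative may have to search the whole expanded rule set, which is the overhead the second alternative is meant to avoid.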

Note that this problem differs from those representations of "inexact
reasoning" or uncertainty (Prade [12], Rosenbloom et al. [13]) in which subsets
of conditions are used to derive a consequence, for instance, probabilistically,
fuzzily, or using weighting schemes.

REDUCTION METHOD

The reduction method is based on viewing rule-based systems as a set of rules
together with programs that manage such rules. This view enables us to add
program code to the management system such that it can extend the simple
representation of production rules. Here, we apply this methodology to enable a
direct representation of the generalized form discussed above.

We denote a problem as "a k/n diagnostic problem" when the diagnosis is
dependent on n observations such that if any k of them are found to exist in a
case, then the diagnosis is established. For example, the case discussed by Weiss
and Kulikowski ([14], p. 119 ff.) in diagnosis of rheumatic diseases such as
mixed connective tissue disease (MCTD) involves 10 observations. If any 4 of
these 10 exist in a patient, then he or she definitely has a rheumatic disease.
Using our terminology, we say that this is a 4/10 diagnostic problem.

A Reduction Algorithm 

To solve a k/n diagnostic problem using the reduction methodology, we
perform the following:

1. Pick any k symptoms.
2. Name them temporarily T1, ..., Tk.
3. Check decision table (k) with results for the k symptoms.
4. The output of the decision table is δ.
5. The problem now is δ/R, where R = n - k.
6. For any δ/R diagnostic problem:

(a) if δ = 0 diagnosis is POSITIVE; terminate.
(b) if δ > R diagnosis is NEGATIVE; terminate.
(c) if δ ≤ R go to step 1 (with k = δ, n = R) for further reduction.
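The steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the Prolog implementation used in the system: the input format (a list of 'P'/'N' results in the order the observations are picked) and the function names `decision_table` and `reduce_k_of_n` are hypothetical. The table lookup is replaced by counting the 'N' entries, which is the value each column of a complete decision table records.

```python
def decision_table(results):
    """Steps 3-4: delta is the number of picked tests that came back negative."""
    return sum(1 for r in results if r == "N")

def reduce_k_of_n(k, results):
    """Run the reduction on a k/n problem, where n = len(results).

    results: 'P'/'N' outcomes in the order the observations are picked
    (an assumed input format). Returns 'POSITIVE' or 'NEGATIVE'.
    """
    remaining = list(results)
    while True:
        if k == 0:                      # step 6(a): nothing left to establish
            return "POSITIVE"
        if k > len(remaining):          # step 6(b): more needed than remain
            return "NEGATIVE"
        picked, remaining = remaining[:k], remaining[k:]   # steps 1-2
        k = decision_table(picked)      # steps 3-5: new problem is k/len(remaining)

print(reduce_k_of_n(4, list("PNPPNPN")))
print(reduce_k_of_n(4, list("NNNNPPP")))
```

Each pass through the loop consumes at least one observation, so the loop ends after at most n passes.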

The set of tables in Table 1 depicts the situation in a simplified form to make it
easier to focus on the steps of the reduction method. In realistic cases, actions
may involve reports back to the user on the rules that are fired, auxiliary
calculations (for instance, of a statistical nature), or other options. In such cases,
a table action portion would include additional information along with the
number of remaining tests that are depicted in this set of tables. It should be
noted that the use of tables to describe the algorithm does not necessarily imply
that implementation by tables is mandated. If tables are used in the implementation,
they need not always be stored; that is, there are cases in which they can be
generated.
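As one illustration of generating a table instead of storing it, the following sketch enumerates every combination of P/N results for k tests and attaches δ, the number of negative results, to each column. The function name `generate_decision_table` is hypothetical, and the sketch covers only the simplified tables with a single δ action row.

```python
from itertools import product

def generate_decision_table(k):
    """Return the columns of the k-test table as (results, delta) pairs."""
    return [(column, column.count("N")) for column in product("PN", repeat=k)]

table = generate_decision_table(3)
for row in range(3):
    # One printed line per test T1..T3, one entry per column.
    print(f"T{row + 1}", " ".join(column[row] for column, _ in table))
print("d ", " ".join(str(delta) for _, delta in table))
```

Because `product` varies the last test fastest, the columns come out in the same order as in the printed decision tables.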

An Example 

To illustrate the method, we use a specific example of a 4/7 diagnostic
problem. To solve the diagnostic problem:

1. Pick any 4 observations.
2. Name them temporarily T1, T2, T3, and T4.
3. Check the first table in Table 1 with the results of these 4 observations
(P = positive or N = negative).
4. The possible cases are as follows:
(a) If the results of all 4 are P, then the diagnosis is definitely established.
(b) If only 3 are P, then we need to check 1 more of the remaining 3.
(c) If only 2 are P, then we need to check 2 more of the remaining 3.
(d) If only 1 is P, then we need to check 3 more of the remaining 3.
(e) If all are N, then hypothetically we need to check 4 of the remaining 3,
which is impossible; thus, we reject the diagnosis.

Table 1. Decision Tables Used for 4/n, 3/n, 2/n, and 1/n Problems

For any 4/n problem

T1   P P P P P P P P N N N N N N N N
T2   P P P P N N N N P P P P N N N N
T3   P P N N P P N N P P N N P P N N
T4   P N P N P N P N P N P N P N P N
δ    0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4

For any 3/n problem

T1   P P P P N N N N
T2   P P N N P P N N
T3   P N P N P N P N
δ    0 1 1 2 1 2 2 3

For any 2/n problem

T1   P P N N
T2   P N P N
δ    0 1 1 2

For any 1/n problem

T1   P N
δ    0 1

In case (a) there are 4 observations; all of them hold, and the diagnosis is
established (further checks are 0 of 3). In cases (b), (c), or (d), a diagnosis is not
established because there is insufficient input information. Instead of restarting
the problem, we can define a new reduced problem such that we check only the
remaining observations. Now we need to check either 1/3 in case (b), 2/3 in case
(c), or 3/3 in case (d). In case (e) we can reject the diagnosis because the total
number of observations is 7, 4 of them have already been checked and failed,
and the remaining observations are 3 in number. To establish a diagnosis, 4
observations need to exist; therefore, it is impossible to establish a diagnosis
from this situation (4/3).

Thus, the method either establishes a diagnosis from the information provided
or uses the information to reduce the problem to a smaller problem of the same
nature. The new problem can be solved recursively by the same methodology.
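The 4/7 example can be checked mechanically. The sketch below records each sub-problem the reduction visits and then compares the outcome, for all 2^7 possible result patterns, against the direct statement "at least 4 of the 7 observations are positive". The function name `reduction_trace` and the tuple encoding of results are assumptions made for illustration.

```python
from itertools import product

def reduction_trace(k, results):
    """Return (established, trace), where trace lists each (k, n) sub-problem."""
    remaining = list(results)
    trace = [(k, len(remaining))]
    while 0 < k <= len(remaining):
        picked, remaining = remaining[:k], remaining[k:]
        k = picked.count("N")           # delta from the k-test decision table
        trace.append((k, len(remaining)))
    return k == 0, trace                # k == 0: established; otherwise rejected

# Case (e) of the example: the first four observations are all negative.
print(reduction_trace(4, "NNNNPPP"))

# Exhaustive comparison with the direct "any 4 of 7" reading.
mismatches = [
    outcome
    for outcome in product("PN", repeat=7)
    if reduction_trace(4, outcome)[0] != (outcome.count("P") >= 4)
]
print(len(mismatches))
```

The trace for case (e) ends at the 4/3 sub-problem, which is the impossible situation described in the text.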

Commentary 

We can cite several advantages of the reduction methodology: (1) there is
guaranteed recursive reduction until a solution is reached; (2) the number of
rules to be checked is smaller than with the simple rule approach (see Table 2); (3)
the tables given in Table 1 can be used for any diagnostic problem k/n,
regardless of the value for n; and (4) the growth of the number of rules is
limited, as all the decision tables are complete (Welland [15]), and thus there is
no possibility of adding rules to any of them.

As can be seen, the number of rules in the reduction method depends on the
length of the subset that establishes the diagnosis, k, rather than the length of the
domain of observations, n. This is an important property of the reduction
methodology, as in simple rule approaches the number of rules grows
combinatorially with the length of the set of observations (as n!/(k!(n-k)!) when
simple rule generation uses combinations, or n!/(n-k)! when it uses permutations).

According to Weiss and Kulikowski ([14], pp. 118-119), a problem similar to
what we have been intimating was detected while an expert system for diagnosis
for rheumatic diseases was being implemented. As the expert system evolved,
the number of its (physician) users increased; consequently, the number of
observations known to the system increased. The expert system started with a
4/10 diagnostic problem and was extended to 4/18 and eventually to 4/35. A
simple treatment of the problem (see Table 2) would make the number of rules
increase combinatorially with any increase in the number of observations.

Table 2. Number of Rules Used to Build a Knowledge Base

Problem ID   Using Permutation   Using Combination   Using Reduction
4/10                      5040                 210                31
4/18                     73440                3060                31
4/35                   1256640               52360                31
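The permutation and combination counts for the three problem sizes can be recomputed directly as n!/(n-k)! and n!/(k!(n-k)!). The sketch below does so with Python's standard `math.perm` and `math.comb`; the helper name `simple_rule_counts` is hypothetical. It also counts the columns printed in the four decision tables (16 + 8 + 4 + 2). The figure of 31 reported for the reduction method is taken from the source as given and is not derived here; presumably it includes one rule beyond those columns, but that reading is an assumption.

```python
from math import comb, perm

K = 4  # size of the subset that establishes the diagnosis

def simple_rule_counts(n, k=K):
    """Return (ordered, unordered) counts of simple k-condition rules over n observations."""
    return perm(n, k), comb(n, k)

for n in (10, 18, 35):
    ordered, unordered = simple_rule_counts(n)
    print(f"{K}/{n}: permutations={ordered}, combinations={unordered}")

# Columns in the 1-, 2-, 3-, and 4-test decision tables; independent of n.
table_columns = sum(2 ** j for j in range(1, K + 1))
print("decision table columns:", table_columns)
```

The last value does not change as n grows, which is the property the reduction method relies on.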

A final point to be noted about the reduction method is that it does not depend
on any particular application. The algorithm was developed for an expert system
for differential diagnosis of rheumatic diseases, but it can be used in any other
rule representation of the same nature.

ENVIRONMENT

The system that we employ for representing the methodology of this article is
based on an extension of a previously defined system called EXPRD (EXtended
Prolog Rule Data system), an integration of a Prolog, a relational database, and
a decision table system (Salah [16]). This system is used to store or generate
decision tables such as those appearing here. Prolog programs expressing the
reduction algorithm are added as a part of the management program. Sets of
observations are stored in the system as database relations. An interactive
dialogue prompts the user to provide the proper information for the diagnostic
problem and invokes the reduction algorithm. If a decision is reached, the
program advises the user whether the diagnosis is established or rejected. If the
information is not sufficient to establish the diagnosis, the program prompts the
user to provide more information. An example dialogue in a session is as
follows:

Give me a test you performed: Arthralgia

What is the result of arthralgia (p = positive, n = negative): p

**Diagnosis is POSITIVE**

**Chronic polyarthritis > 6 weeks is a significant factor**

What is the result of synovial fluid inflammatory (p = positive, n = negative): p

These results are not sufficient to establish a diagnosis

What is the result of subcutaneous nodules (p = positive, n = negative): p

. . .

**Diagnosis is NEGATIVE**

**Two positive symptoms are noted**

Do you wish a trace of this dialog? No

. . .
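A dialogue of this kind can be driven by the reduction loop itself. The following is a rough sketch, not the EXPRD dialogue code: it assumes the tests are asked in a fixed order, whereas the session above lets the user name the tests, and the names `run_dialogue`, `ask`, and `say` are hypothetical. The scripted answers are placeholders for illustration and carry no clinical meaning.

```python
def run_dialogue(k, tests, ask, say=print):
    """Ask for test results until a k/n diagnosis is established or rejected.

    tests: observation names in the order they will be asked (assumption).
    ask:   callable taking a test name and returning 'p' or 'n'.
    say:   callable used to report messages to the user.
    """
    remaining = list(tests)
    while 0 < k <= len(remaining):
        picked, remaining = remaining[:k], remaining[k:]
        k = sum(1 for name in picked if ask(name) == "n")   # delta
        if 0 < k <= len(remaining):
            say("These results are not sufficient to establish a diagnosis")
    verdict = "POSITIVE" if k == 0 else "NEGATIVE"
    say(f"**Diagnosis is {verdict}**")
    return verdict

# Scripted answers stand in for an interactive user (illustrative values only).
answers = {"arthralgia": "p", "synovial fluid inflammatory": "n", "subcutaneous nodules": "p"}
messages = []
verdict = run_dialogue(2, list(answers), answers.get, messages.append)
print(verdict, messages)
```

Passing `ask` and `say` as parameters keeps the reduction logic separate from the user interface, in line with the management-system view taken in this article.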

A general philosophy in dealing with rule systems emerges from our
methodology: a management system is employed in which rules are dealt with as
one of the components. Such a management system can be viewed as a meta-rule
program that helps a user (or an expert-system administrator) to build,
manipulate, query, and analyze a rule system.

CONCLUSION

Although the reduction methodology for differential diagnosis expert systems
is self-contained in the sense that it solves a well-defined problem, if we take a
broader view of the situation, we see this method as part of the overall
rule-management environment. The environment conceptualization emphasizes use
of meta-level processing to manipulate rule-like representations. Given a k/n
diagnostic problem, an extended form of rule, the management program is
designed to generate a set of simple rules or employ the reduction methodology
to reduce the problem to a smaller problem of the same nature. Employing
management programs on the meta-level facilitates a global view for expert
systems, allowing operations such as generation, reduction, or analysis of rules.

References

1. CODASYL Decision Table Task Group, A Modern Appraisal of Decision Tables,
Association for Computing Machinery, New York, 1982.

2. Dahl, V., Logic programming as a representation of knowledge, IEEE Computer, 
16(10), 106-111, 1983. 

3. Fikes, R., and Kehler, T., The role of frame-based representation in reasoning,
Comm. of the Assn. for Computing Machinery 28, 904-920, 1985.



4. Salah, A., and Yang, C. C., Rule-based systems: A set-theoretic approach,
Proceedings of the 3rd Annual Computer Science Symposium on Knowledge-Based
Systems: Theory and Application, Columbia, SC, 1986.

5. Hayes-Roth, F., Rule-based systems, Comm. of the Assn. for Computing
Machinery 28, 921-932, 1985.

6. Reilly, K., Salah, A., Morgan, P., and Rowe, P., Multiple representations in a
language-driven memory model, in Papers on Computational and Cognitive
Science (E. Battistella, Ed.), Indiana Univ. Linguistic Club, Bloomington, Ind.,
87-94, 1984.

7. Yang, C. C., Relational Databases, Prentice-Hall, Englewood Cliffs, N.J., 1986. 

8. Reilly, K. D., Salah, A., and Yang, C. C., A Logic Programming Perspective on 
Decision Table Theory and Practice, University of Alabama at Birmingham Tech. 
Report, 1986. 

9. Kowalski, R., Logic for Problem Solving, Elsevier-North Holland, New York,
1979.

10. Clocksin, W., and Mellish, C., Programming in Prolog, Springer-Verlag, New 
York, 1981. 

11. Bruynooghe, M., Prolog in C for Unix Version 7: A Reference Manual,
Katholieke Univ., Leuven, Belgium, 1980.

12. Prade, H., A computational approach to approximate and plausible reasoning with 
applications to expert systems, IEEE Transactions on Pattern Analysis and 
Machine Intelligence PAMI-7, 260-283, 1985. 

13. Rosenbloom, P., Laird, J., McDermott, J., Newell, A., and Orciuch, E., R1-Soar:
An experiment in knowledge-intensive programming in a problem-solving
architecture, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-7,
561-569, 1985.

14. Weiss, S., and Kulikowski, C., A Practical Guide to Designing Expert Systems, 
Rowman & Allanheld, Philadelphia, 1984. 

15. Welland, R., Decision Tables and Computer Programming, Heyden & Son, 
London, 1981. 

16. Salah, A., An Integration of Decision Tables and a Relational Database System
into a Prolog Environment, PhD Thesis, Univ. of Alabama at Birmingham,
Birmingham, Ala., 1986.