PII: 0888-613X(87)90012-0


Architecture of an Expert 
System for Composite 
Document Analysis, 

Representation, and Retrieval 
E d w a r d  A .  F o x  a n d  R o b e r t  K. France 

Department o f  Computer Science, Virginia Tech 

ABSTRACT 

The C O D E R  (COmposite D o c u m e n t  Expert~extended~effective Retrieval) pro- 
j e c t  is a multi-year effort to investigate h o w  best to apply artificial intelligence 
methods to increase the effectiveness o f  information retrieval systems handling 
collections o f  composite documents. To ensure system adaptability and to allow 
controlled experimentation, C O D E R  has been designed as a distributed expert 
system. The use o f  individually tailored specialist experts, coupled with standardized 
blackboard modules f o r  communication and control and external kno wledge bases 
f o r  maintenance o f  f a c t u a l  world knowledge, allows f o r  quick prototyping, 
incremental development, and flexibility under change. The system as a whole is 
being implemented under U N I X  as a set o f  MU-Prolog and C modules communicat- 
ing through pipes and T C P / I P  sockets. 

KEYWORDS" information retrieval, artificial intelligence, distributed ex- 
pert system, knowledge bases, blackboard architecture, lexicon con- 
struction 

I N T R O D U C T I O N  

As the world's pool o f  information, particularly o f  machine-readable text, 
rapidly expands, it becomes increasingly necessary to engage the help o f  

Project funded in part by grants from the National Science Foundation (IST-8418877), the Virginia 
Center for Innovative Technology (INF-85-016), and by an AT&T equipment contribution. An 
earlier version o f  this paper was presented at the Third Annual USC Computer Science Symposium; 
Knowledge-Based Systems: Theory and Applications, Columbia, South Carolina, March 31-April 
1, 1986. 

Address correspondence to Edward A. Fox, Department o f  Computer Science, Virginia Tech, 
Blacksburg, Virginia 24061. 

International Journal of Approximate Reasoning 1987; 1 : 151-175 
© 1987 Elsevier Science Publishing Co., Inc. 
52 Vanderbilt Ave., New York, NY 10017 0888-613X/87/$3.50 151 


152 Edward A. Fox and Robert K. France 

computers to control and manipulate it. Initial attempts at computer-aided 
information storage and retrieval (ISR) made centralized databases accessible to 
a large community of users [1] but focused principally on performance and have 
achieved only moderate levels o f  effectiveness [2]. Today, end users often prefer 
to search for themselves [3], using gateways, front ends, intermediaries, and 
interfaces [4], or aided by powerful microcomputers attached to optical disk 
stores. These end users need more effective and adaptable tools such as have 
been investigated by the research community [5]. CODER (COmposite 
Document Expert~extended~effective Retrieval) is a research system intended to 
address these needs through the mechanisms of knowledge-based and goal- 
directed artificial intelligence (AI) techniques. 

Problem Description 

The CODER system is aimed at investigating issues of meaning representa- 
tion and the effective matching of user needs with relevant (passages of) 
documents. Although the SMART system has been evolving for more than 25 
years with similar objectives [6], recent experience with reimplementing a 
modem version [7] and with using its latest form [8] suggests than an AI-based 
architecture would make further development and experimentation much easier. 
Questions in key subject areas that could then be studied (along with references 
to related work) include: 

COMPOSITE DOCUMENTS 
1. Can composite documents that include text, factual information, and 

references to other documents [9] be effectively analyzed [10] so that 
entire documents or appropriately sized passages [11] can be retrieved? 

2. Can document analysis and modeling improve with findings about 
abstract document structure [12], message composition [13], office 
modeling of documents and other objects [14, 151, document formatting 
[16], and related standards [17]? 

EFFECTIVE RETRIEVAL 
3. Can effective retrieval methods suitable for the growing number of full 

text databases [18] be developed [2] using automatic techniques [19]? 
4. Because the overlap between results of different retrieval methods is 

small [20], can overall effectiveness increase by use of several? Can 
retrieval systems be tailored to different users' understandings of 
relevance? 

5. Can rule-based processing allow more effective combination of biblio- 
graphic information about documents [21] with other factual and content 
components than has been achieved with statistically based approaches 
[2217 


Expert System/Composite Document Analysis 153 

. Can a heuristic approach allow selection for each query of the most 
appropriate search strategy (e.g., choice between clustered versus 
inverted file searching, according to findings in [23]), the best retrieval 
approach (e.g., extended Boolean [24] versus vector space [25] versus 
probabilistic [26, 27]), and the fastest method for identifying good 
documents [28]? 

AI METHODS 
7. Is the logic programming paradigm in general [29, 30] and the Prolog 

language in particular [31] mature enough to use for natural language 
analysis [32], the rule-based processing commonly used in expert system 
development [33], and general AI programming [34] in a large, complex 
system? 

8. Is the blackboard model [35] of a distributed expert information- 
providing mechanism [36] suitable for an ISR system? 

KNOWLEDGE REPRESENTATION 
9. In light o f  the many knowledge representation schemes suggested for 

information retrieval [37], can computationally tractable ones [38] be 
developed [39]? 

10. In particular, are frames [40] useful in representation and reasoning [41] 
about document content in a way that can aid information retrieval [42]? 

11. Can temporal data be suitably represented and used [43] in retrieval? 

COMPUTATIONAL LINGUISTICS 
12. Can linguistic analysis aid information retrieval [44], not only through 

improved analysis o f  queries [45] but also in document analysis [46] 
through skimming [47, 48] or far more robust and detailed analysis [49, 
50] o f  more than a constrained sublanguage [51]? 

13. Can machine-readable dictionaries [52] support expansion of the small 
lexicons used to date in text analysis systems [53]? 

HUMAN-COMPUTER INTERFACE 
14. Does current knowledge about human-computer interaction [54] and 

information retrieval [55] allow development of interfaces that can adapt 
to individual needs and preferences? 

15. Can information retrieval systems satisfy some of the needs for tutoring 
systems by making books [56], encyclopedias [57], and other reference 
works more accessible? 

16. Does a graphics-based interface [58] where problem formulation, query 
construction, term expansion, feedback, browsing, and profile-based 
filtering are all interwoven in a highly interactive human-computer 
dialogue [59] lead to more effective and pleasant retrieval? 


154 Edward A. Fox and Robert K. France 

Related W o r k  

Several research investigations are related to the CODER project. The earliest 
use o f  expert system methods in the retrieval area was probably in the CONIT 
system [60]. The closest contemporary effort is development o f  I3R by 
Thompson and Croft [61]. I3R differs by being coded as a monolithic system in 
Lisp, interfaced with a database system, aimed to explore retrieval methods that 
access a statistically analyzed document collection, and implemented using 
fewer but more complex experts. Yet like CODER, I3R is built around a 
blackboard that coordinates retrieval processing. The RUBRIC system, which 
uses a rule-based approach whereby queries become small knowledge bases 
[62], incorporates a variety o f  techniques for combining evidence [63] that have 
been included in CODER. TOPIC is also o f  interest, as it attempts to parse 
documents to condense their content and identify important concepts [46]. The 
more ambitious Project Minstrel, applying retrieval and AI methods o f  office 
modeling, is based on a comprehensive knowledge representation facility [15]. 

Although CODER incorporates many ideas from other research efforts, it is 
unique in its aim and scope. CODER provides a unified paradigm for the entire 
process of information storage, representation, and retrieval based on a tailor- 
made encoding o f  knowledge (see [64] and the section on knowledge 
administration below) and a flexible architecture designed to support the storage 
and manipulation o f  that knowledge. This article concentrates on the architec- 
tural issue; the interested reader is referred to France and Fox [64] for more 
details on how knowledge is used. 

A R C H I T E C T U R E  

CODER is organized as an integrated system for document entry, analysis, 
storage, retrieval, and display. It should be adaptable as a standalone system, as 
a server for interactive or batch entry o f  documents or queries, or as an 
intelligent intermediary to another database system. The following discussion 
relates to the most comprehensive case: standalone implementation. 

In keeping with design principles o f  modularity and object-oriented program- 
ming, CODER is made up o f  four different types o f  objects differentiated by 
their use o f  knowledge (see Figure 1). Experts are specialists in particular 
restricted domains pertinent to the tasks at hand. They communicate with each 
other only through blackboards, which serve as holding areas for session-specific 
knowledge. Each blackboard is the external knowledge source for a strategist that 
also has a local knowledge base of planning rules to coordinate the activities o f 
experts in the community. External knowledge bases store information o f  
common interest to several experts. Because they deal only with factual world 
knowledge, they require only limited inference abilities. Finally, there are 


Expert System/Composite Document Analysis 155 

Expert 
Internal 

Knowledge 
Base 

BIackboard / 
Strategist 
Complex 

BLACKBOARD 
( ) 

) 
) 
) 
) 

Strategist 

Internal 
Knowledge 

Base 

External 
Knowledge Base 

Resource 
Manager [ 

Figure 1. CODER Object Classes 

r e s o u r c e  m a n a g e r s  mediating between low-level machine structures and the 
abstract representations used by the rest o f  the system. These may be 
implemented in a procedural language, as they do not require special knowledge 
or inference capabilities., 

The internal structure o f  C O D E R  is shown in Figure 2. The central region or 
" s p i n e "  includes external databases for documents and terms, along with the 
knowledge administration complex. The resources o f  the spine are shared by 
two expert communities, one for document analysis and one for retrieval. F r o m  
an external perspective the system wraps so that users (shown at either end o f  the 
figure) are inside the system; they can both enter documents and retrieve them, 
possibly in an integrated fashion. Each user communicates with a resource 
manager specialized to his or her preferred interface, which in turn communicates 
with a group o f  translation specialists to effect a two-way dialogue between the 
user and the rest o f  the system. The interaction o f  these specialists with each 


156 Edward A. F o x  and Robert K. France 

" ii i 

/ / 

\ \ 

a ~  
*.-.. 

E 

r , o  

go 

© 

e -  

~e 

© 

e4 

° 4  


Expert System/Composite Document Analysis 157 

other and with other experts is mediated by a blackboard. Each expert 
community m a y  also reference additional external knowledge bases, such as the 
user model base. Attached to each blackboard and coordinating all the activities 
o f  the subsystem is a strategist. 

The overall operation o f  C O D E R  is shown in Figure 3. Because one or more 
parts o f  C O D E R  can be assigned to separate processors, it is logical to view the 
system as made up o f  groupings o f  modules needed for c o m m o n  functions. F o r  
example, one user might be entering new documents so the system can analyze 
and store them, while other users are searching and retrieving documents. In 
both these cases, state information about the progress o f  the s y s t e m ' s  services 
for a user is maintained entirely on the blackboard involved. Finally, at the 

DOCUMENT ENTRY & ~1 \ \  
ANALYSTS SESSION 

Figure 3. 

DOCUMENT RETRI EYAL 
SESSION 

OO¢~.tt, tEN'r 

COMMON EXPERTS & " FACT BASES" 

DOCUMENT RETRI EYAL 
SESSION 

Overview of System Operation in a Distributed Environment 


158 Edward A. Fox and Robert K. France 

center of the figure are the shared experts and external knowledge bases 
involved in supporting these tasks. 

The sections that follow provide more detailed information about the various 
CODER components. 

Knowledge Administration 

Knowledge in CODER is partitioned both horizontally between the two 
subsystems and among the modules of each subsystem, and vertically along 
what Sterling [65] refers to as the "logical levels of problem solving." The top 
level of this second division is the goal-oriented planning knowledge that guides 
the session strategists. In the current CODER implementation this strategic 
knowledge is encoded in rules for recognizing and reacting to stages in the 
problem tasks. Actual steps in the problem solution are carried out by the experts 
in each community using tactical knowledge of how to accomplish their 
designated tasks. Finally, the characteristics of the problem universe are 
represented as world knowledge in the external knowledge bases. 

Strategic and tactical knowledge are stored locally in the modules that use 
them. The same is not true for world knowledge. Facts about the world provide 
the premises from which the experts reason about their tasks, and facts and 
hypotheses about the world make up the problem-state descriptions that inform 
the strategists' decisions. Thus, the factual representation language used in 
CODER to encode world knowledge also serves as a lingua franca for 
communication among the modules that make up each subsystem community. 
This language, defined in Figure 4, is itself made up of three levels. Elementary 
data types include distinct sets of names for entities of different sorts, as well as 
such familiar primitives as character, integer, and atom. Frames provide a 
facility for building definite descriptions of entities according to prototypical 
descriptions drawn from a tangled hierarchy of classes. And relations are 
predications on those entities, either ascribing accidental properties to them or 
describing relations among them. 

Relations are familiar to AI programmers by analogy to Prolog predicates. 
CODER relations differ from Prolog predicates in having specified type 
signatures (arity and types on arguments) and algebraic attributes (whether they 
are transitive, symmetric, and so forth). Elementary data types are also familiar; 
restricted data types are defined through a type of restriction polymorphism [66]. 
The semantics of frames, however, may require some explanation. The 
subsumption relation given in Figure 5a defines the inheritance hierarchy that 
can be specified for frame types. The frame with no slots subsumes all other 
frames, and two frames are equivalent if they subsume one another. 

Figure 5b defines the matching of frame objects, in terms of the types of the 
various slots and their values. Frame object A matches frame object B if every 
filled slot of A matches a filled slot of B, where elementary objects match 


Expert System/Composite Document Analysis 159 

relation ::= relation_name (argument) + 

argument ::= relation 
I frame 
I elementary..object 
I quantifier argument 

frame ::= frame_name (slot_name s l o t _ f i l l e r )  * 

s l o t _ f i l l e r  ::= frame 
I elementary_object 
I quantifier s l o t _ f i l l e r  

elementary_object ::= (quantifier) (primitive_object r e s t r i c t i o n )  

quantifier ::= I | = r t J f  
I l e t  J r  
I integer 

I n o n _ e m p t y _ l i s t _ o f  
I n o n _ . e m p t g . . s e t . . o f  

Figure 4. Definition of Factual Knowledge Representation Formalism 

s u b s u m e s ( a n c e s t o r _ f r a m e ,  d e s c e n d e n t _ f r a m e )  - 
s l o t _ l i s t ( a n c e s t o r _ f r a m e ,  a n c _ l i s t ) ,  
s l o t _ l i s t ( d e s c e n d e n t _ f r a m e ,  d e s c _ l i s t ) ,  
V x ( x ~  a n c _ l i s t  D 3 y ( y e  d e s c _ l i s t  A n a m e ( x ) = n a m e ( y ) A  

s u b s u m e s ( t y p e ( x ) ,  t y p e ( y ) )  )). 

Figure 5a. Semantics of Frames: Frame Subsumption 

m a t c h ( f r a m e l ,  frame2) - 
s l o t _ l i s t ( t y p e ( f r a m e  1 ), l i s t l  ) A 
s l o t _ l i s t ( t y p e ( f r a m e 2 ) ,  l i s t 2 )  A 
V x  (x • l i s t l  A h a s _ v a l u e ( f r a m e l ,  x, v) D 3y (y • l i s t 2  A 

name(x)=name(y) A has_value(frame2, y, r) A match(v, r) )). 

m a t c h ( e l t l ,  e l t 2 )  - 
e l t l  = elt2. 

Figure 5b. Semantics of Frames: Frame Matching 


160 Edward A. Fox and Robert K. France 

if and only if they are equal. This asymmetric matching is based only on filled 
slots, so objects o f  differing types can still match. Matching o f  frames is 
computationaUy inexpensive and mirrors whether the two objects, or two 
descriptions o f  the same object, are indeed similar. By defining suitable frame 
types, the knowledge administrator can describe the various entities to be 
handled in a particular CODER system. Experts can then instantiate objects o f 
these types and store them on the blackboard or in external knowledge bases. 

Consistent use o f  this formalism is maintained system-wide by the modules o f  
the knowledge administration complex (see Figure 6), which include type 
managers for each component o f  the language. For example, the frame type 
manager keeps track o f  the classes suitable for describing entities and the 

Constructor 
Plan~ler 

set_of 
nov-set 
ore. 

/ /  
Ralatton Type Haneger 

Is-relation 
arity 
signature 
reflexive 
transitive 
sgmmotrio 
antisgmmotrio 

Relation Object Hana-'~er / ~  
nov..relation J I 
argument 
arguments 
isomorphio 

list_el 
nov_list 
at©. 

t Fra 

is-eli 
densriptton 
subtypes 
snpertgpos 
reeker 

T 
I ElernentartJ 

Data Object 
Manager 

Frame Tupe Manager 

is_frame 
slot-list 
subframos 
super frames 
SUbsumes 

Frame Object Manager 
nov--frame 
has._slot_value 
set._slot_value 
removo__slot_value 
matohtng_frames 
equal_frames 

Figure 6. Internal Structure of the CODER Knowledge Administration Complex 
(arrows indicate dependencies) 


Expert System/Composite Document Analysis 161 

attributes appropriate to each class; the relation type manager keeps track o f  the 
relations existing among entities and the characteristics o f  each relation. New 
types are added only by an external system administrator rather than by the 
modules of the system: deciding what sorts o f  types should be recognized by the 
system is part o f  the knowledge engineering involved in constructing CODER. 
Knowledge objects, by contrast--concrete expressions in the representation 
language--are created, modified, and destroyed dynamically during the opera- 
tion o f  the system. 

External Knowledge Bases 

Whereas the knowledge administration complex aids with control o f  the types 
o f  knowledge representations employed in a particular system, the external 
knowledge bases (EKBs) provide storage and access to large numbers o f  facts. 
The document knowledge base, the lexicon, and the user model base are all 
EKBs that each maintain knowledge about a particular class o f  objects as specific 
statements o f  fact. 

The functionality o f  external knowledge bases is specified by the operations 
shown in Figure 7. Formally, propositions entered into a fact base are required 
to be ground instances o f  logical relations known to the system; that is, to 
involve neither unbound variables nor meta-terms. These propositions are added 
to an external knowledge base as single statements but may be retrieved in either 
of two ways. The knowledge base may be queried with a skeletal fact, that is, a 
fact containing one or more variables, and will return the set o f  all facts in the 
knowledge base that match the skeleton. Alternately, the knowledge base may be 
queried with an object (an elementary datum, a frame, or a relation) and will 
return the set o f  all facts involving that object. In addition, a knowledge base 
may be queried about the number o f  facts that match an object or a skeletal fact: 

e a t e r  

Figure 7. The Functionality of an External 
implementation independent internal structure. 

I t , ,  
f a c t s _ v i t h _ v a l u e  
f a o t s _ v i t h _ f r a m e  
I r a o t s _ v i t h _ r e l  

f a o t s _ m a t o h i n g  
f r a m e s _ m a t c h i n g  

n u m _ v i t h _ v a l u e  
n u m _ v i t h _ f r a m e  
n u m _ v i t h . . r e l  

--7-1/ 
Knowledge Base. An EKB has no 


162 Edward A. Fox and Robert K. France 

this information can be used by the querying entity for statistical purposes or 
simply to avoid receiving excessively large sets o f  facts. 

The lexicon maintains knowledge about terms in the language. It can be 
conceptually divided into two parts, one o f  general linguistic knowledge and the 
other o f  specialized world knowledge particular to the collection o f  documents 
employed. Although knowledge from both conceptual halves may be recalled for 
a given request, tagging the knowledge in this way promotes portability by 
allowing knowledge o f  general use to be decoupled from the pragmatics of a 
given document collection and reused in other applications. Construction o f  the 
current CODER lexicon following these principles is highlighted in Figure 8. 
The initial loading o f  facts portrayed at the top of the figure is from one large 
machine-readable dictionary [67]. Table 1 describes the various relations 
initially derived from the more than 8 0 , 0 0 0  headwords present. Further analysis 
such as of the definitions (see c _ D E F )  should lead to additional relations that 
would be more directly usable for parsing. 

The document knowledge base maintains facts about the documents as 
assertions relating a document (passage) and a knowledge structure. A simple 
attached resource manager provides storage and retrieval for raw document text. 
Together these modules constitute the document database. Finally, there is a 

File of Entr~s 

Lexieal 
md 

• Semanfl©s 
Relations 

P.O.S.'s Jtnd 
Category 

Information 

• .. ( Deflnlt~ 
... Tex~ ) 

Oenera] 
Lingutst~ 
Knovledge 

Plorphologj©al 
Variant / Ilia--Specialized 

Information Vm'ld 
Knowledge 

. •  Protot~pteal use 
Information 

[ of ArHfle|al 

Figure 8. 

PIV'~RSe, 
Co-ool~lrrehce, 

H t e r ~ ,  
Information 

Construction of the CODER Lexicon 

I~owledge 
obta~ed dur~g 
Do©ument Inpu~ 


Expert System/Composite Document Analysis 163 

T a b l e  1. Relations abstracted f r o m  the Collins Dictionary of the English 
Language [70] tapes 

c_ABBREV 
c_ALSO_CALLED 
c_CATEGORY 
c_COMPARE 
c _ D E F  
c _ D E F _ N U M  
c_HEADWORD 
c_MORPH 

c_NLAST 
c_PAST 
c_PLURAL 
c_POS 
c_RELADJ 
c_SAMP 
c_SINGULAR 
c_SYLL 
c_USAGE 
c _ U S A G E _ N U M  
c _ V A R _ S P E L L  
c _ V A R _ S Y L L  

Abbreviation of headword 
Headword also commonly called this 
Category (semantic label) of headword 
Compare to another headword and sense(s) 
Definition of headword 
Number of (up to) 80-character blocks of definition 
Headword entry 
Morphological variant of headword (including part of 
speech) 
Rest (e.g., first/middle name) of proper noun headword 
Past form of headword 
Plural of headword (sometimes just the ending) 
Part of speech 
Related adjective to headword 
Example of headword in context 
Singular form of headword (sometimes just the ending) 
Syllabification of headword 
Usage notes providing guidance on usage of headword 
Number of (up to) 80-character blocks in usage note 
Variant spelling(s) (if any) 
Syllabification of variant spelling(s) 

user model base o f  facts about individual users. These include reports o f  
occurrences during a single session and general statements about the user, such 
as the type o f  information that has proven relevant in the past, background 
knowledge particular to or supplied by the user, and c o m m o n  characteristics o f  
relevant documents. This body o f  knowledge about users, and the bodies o f  
knowledge discussed above, inform the s y s t e m ' s  response in intelligently 
analyzing and retrieving documents for a particular individual. 

Expels 

Conceptually, an expert is a specialist in a certain restricted domain pertinent 
to the task at hand. Experts are designed to be implemented in relative isolation 
f r o m  one another: no expert has knowledge o f  the internal behavior o f  the other 
experts in the community, and all experts communicate with the community 
strictly through the given blackboard. Part o f  the specification o f  an individual 
expert is the set o f  predicates that it m a y  view in a given blackboard area and the 
(possibly overlapping) set o f  predicates that it m a y  post back. Obviously, there 
must be agreement a m o n g  the expert implementors on the structure and bounds 
o f  those predicates i f  the experts are to w o r k  together. What each expert does 


164 Edward A. Fox and Robert K. France 

with those predicates, however, and what internal knowledge and processes it 
uses to produce new hypotheses are left to the implementor of the individual 
expert. Each expert can therefore be built in the way that best takes advantage of 
the characteristics of its particular domain of expertise. 

An expert has only two requirements for its operation: it should be knowledge 
driven, and it must recognize the appropriate commands from the strategist 
scheduler. The first is philosophical in nature: it is part of the CODER design 
that the complexities of the system tasks be realized in the knowledge required 
for their execution rather than the process of execution itself. For experts this 
implies that expertise be represented as explicit knowledge, separate from 
whatever engine manipulates it. The knowledge in the expert, moreover, is 
constrained by system design to be on a higher level than factual knowledge: 
either rules for finding and manipulating factual knowledge in an external 
knowledge base or facts that relate to classes of objects in the problem universe. 
The second requirement is pragmatic: for the strategist to schedule their activity 
properly, experts must go through a canonical cycle of operations. 

The typical CODER expert consists of a communications interface, a local 
knowledge base, and an inference engine (see Figure 9). The interface provides 
for communication with the blackboard and optionally with external knowledge 
sources such as resource managers or EKBs. The local knowledge base contains 
the particular expertise necessary for the proper execution of the expert's tasks; 
the inference engine is chosen to best execute those tasks. Possible engines 
include both forward-chaining and backward-chaining rule interpreters, frame- 
based classification engines, and pattern-matching engines. These engines could 
then be associated with different rule bases, classification trees, and similarity 
measures, respectively, to produce specialized experts in a variety of disciplines. 

External Call Manager I 

Inference I 
Engine 

post 
viev Ir"r'  

abort 
a D s v o r s  

attompt-hyp 
attend_to_area 
attend_to_quest 

Figure 9. Canonical CODER Expert Showing Internal Structure and Functionality of 
Interface with Blackboard/Strategist Complex 


Expert System/Composite Document Analysis 165 

Some examples o f  expert k n o w l e d g e  bases and inference techniques are given in 
Table 2. R e c e n t  research supports the view that it is possible to build engines that 
c o v e r  a b r o a d  range o f  p r o b l e m s  without falling into the computational trap o f  
general inference (see [38] f o r  a formal analysis o f  this effect). 

This m e t h o d  o f  p r o b l e m  d e c o m p o s i t i o n  is particularly well suited to the tasks 
o f  d o c u m e n t  analysis and retrieval, w h e r e  the relevance o f  a given d o c u m e n t  to a 
given information need is influenced b y  m a n y  factors. Assigning an expert to a 

T a b l e  2. K n o w l e d g e  bases and inference types f o r  some sample experts 

Expert Local Knowledge Inference 
Name Base Engine 

Date Mappings from different natural Forward chaining 
representations for dates into a (rule based) 
canonical internal representation 

Bibio. Ref. Different types o f  biblographic entities Classificational 
and clues to recognize them; lexical 
conventions for representations of biblio. 
entities in text and in bibliographies 

Doe. Type Different types o f  documents (both hard Classificational 
and soft types) and clues to recognize 
them and their component fields 

Declension, conjugation and case- 
changing rules for English; irregular 
morphological variants (or how to find 
them in the lexicon) 

Related Term Methods and metrics for navigating Relational 
relation networks in the lexicon 

Cluster Heuristics for finding document Special purpose 

Morphology Hard coded 

passages in the database that are 
similar to those identified by the 
user as close to current needs 

Methods to transform a fact-based 
representation o f  a search request to 
a p-norm representation and conduct 
a search 

Methods and metrics for identifying 
documents in the database that share 
linguistic substructures with the 
retrieval request 

P-Norm Hard coded 

Linguistic Relational 


166 Edward A. Fox and Robert K. France 

small area o f  specialization and decoupling it from the remainder o f  the system, 
however, has additional advantages. First, the development o f  the expert is 
separated from that o f  the surrounding system. Interaction problems, normally a 
plague o f  AI systems, are thereby kept to a minimum. In addition, the experts 
are kept small, so problems o f  rule interaction within the expert are minimized. 
Furthermore, tasks that are found to require too much complexity can be further 
subdivided according to the areas o f  expertise required to solve them, until they 
are reduced to manageable size. 

B lackboard/ Strategist C o m p l e x  

A blackboard is an area for communication among experts [35]. This 
communication takes place through posting and reading hypotheses in special- 
ized subject areas. In CODER blackboards (see Figure 10), a specialization of 
this process provides a means for asking and answering questions, which are 

Specialist A [ 
- competent to perform I 

certain tasks in (or [ 
between) certain [ 

- competent to perform / 
certain tasks in (or I / ~  - ~  
between) certain ~ / /  

• t Y  
- competent to per f o r m [ ~  

oertain tasks in (or r 
between) oortain [ 

Blackboard Posting Areas 

Priorit W Posting Areas: 

~ Ouestion and Answer Area 

Pending H~Jpothesis Area ~)~ 

Subject Posting Areas: 

~ Subject Area I 

Subject Area 2 

Subject Area N 

Blackboard Strategist (Planner) 

- -  maintains a model of each area 
specialist. 

- -  schedules specialist activit W. 

- -  maintains eonsistenc~ of 
blackboard postin 9 areas. 

- -  selects consistent set of 
hwpothosos for pendtng area. 

Translation 
Expert 1 

Translation 
Expert M 

Figure 10. Blackboard/Strategist Complex Showing Mapping of Experts in the 
Immediate Community to Blackboard Areas 


Expert System/Composite Document Analysis 167 

contained in a separate area o f  the blackboard. The importance o f  this type o f  
communication was noted convincingly by Belkin and colleagues [36]. In 
addition, the CODER blackboards provide a special area, maintained by the 
blackboard strategist, containing a small set o f  consistent hypotheses o f  high 
certainty. This pending hypothesis area is available for read access by all experts 
and thus, indirectly, by the outside world. It provides an instantaneous picture o f  
the " c o n s e n s u s "  o f  the blackboard: what the system as a whole hypothesizes 
about the problem under consideration at any moment. 

A hypothesis is a higher-order knowledge structure built on the factual 
knowledge forms supported by the knowledge administration complex. It 
consists o f  five parts: 

1. The fact hypothesized. 
2. The identifier o f  the expert hypothesizing it. 
3. The confidence that the expert has in it (which can be assigned by different 

methods for different types o f  hypotheses, according to whatever 
knowledge aggregation scheme is appropriate for the set o f  constraints and 
knowledge sources at hand). 

4. The hypothesis identifier. 
5. The dependencies on other hypotheses. 

This latter information, apart from aiding selection o f  the set o f  pending 
hypotheses, allows truth maintenance functions to be performed within the 
blackboard subject areas. If an expert withdraws a hypothesis, for instance, or 
radically changes the associated confidence level, this information makes it 
possible to schedule tasks to reconsider the dependent hypotheses. 

Monitoring the blackboard for this class o f  event is one function o f  the 
blackboard strategist, shown in the lower part o f  Figure 11. Because the logic 
task scheduler's rules governing truth maintenance are independent o f  the 
particular predicates involved in the facts hypothesized, this function is 
independent o f  the application domain o f  the blackboard community. The 
strategist also monitors the blackboard for domain-specific events and conditions 
that trigger new processing. These categories o f  function are kept separate in the 
strategist, so the truth maintenance function can be transported to other tasks. 
Both task schedulers in the strategist have been designed as rule interpreters, as 
neither the strategies involved in truth maintenance nor those involved in 
information analysis or retrieval are yet well understood. 

The final component o f  the strategist is a dispatcher o f  the tasks identified by 
the other components. On the basis o f  the mix o f  tasks scheduled by the truth 
maintenance and task oriented components, it attempts to make optimal use o f  
all available machine resources by issuing appropriately timed commands to the 
experts in the blackboard community. This allows different groups o f  experts to 
be active at different phases in the community task, but also allows experts 
outside the currently active group to be called up to answer a question or to 
reconsider a hypothesis. In an ideal environment with one processor per expert, 


168 Edward A. Fox and Robert K. France 

C Question / Answer Area 

C Pending Hupothes|s Ar.a ) 

C Subject Posting Area 

Subject Posting Area 

Post & 
Retract 
Commands 

Posting Area Manager 

Maintains Blackboard Areas 

| 
Logic Task 
Scheduler 

Interprets 
Generic Truth- 
Maint. Rules 

o -  i L °°-*'*k Handler Scheduler Associates Interprets 
Questions with Application- 

Answer Sources Specific Rules 

Done & 
Cheokpo 
Reports 

.ules for I I Wh|oh Exports I I Rules eor 
• c a n  Ansver Conslsten¢~ I I I i Evaluating and 

~ ~ ~  Reaoting to 
Problem Phases 

Wake, 
A b o r t ,  
Attend etc. 
Commands 

Task 
Posting Area 

History 

Figure 11. Internal Structure of Blackboard/Strategist Complex 

N 

B 
L 
A 
C 
K 
B 
0 
A 
R 
D 

5 
T 
R 
A 
T 
F 
G 
I 
$ 
T 

such dispatching duties would be kept at a minimum, and the normal cycle o f  
activities o f  each expert would ensure highly parallel processing. 

I M P L E M E N T A T I O N  S T A T U S  A N D  F U T U R E  W O R K  

Early work on the CODER system concentrated on design and on preparation 
o f  the knowledge to be loaded into the external knowledge bases. A test 
collection was needed that would allow investigation o f  the many questions o f  
interest, so the decision was made to collect all issues o f  A I L i s t  Digest, an 
electronic mail publication distributed from the D A R P A  Internet to many 


Expert System/Composite Document Analysis 169 

networks, beginning with the first one edited by Kenneth Laws in 1984. As o f 
May 1987, roughly 13 megabytes o f  data, including over 6500 messages by 
many different authors in widely differing formats, have been collected. To  
provide domain knowledge relevant to this collection as a supplement to the 
general English lexicon, The Handbook o f  Artificial Intelligence has also 
been obtained in machine-readable form. Queries and relevance judgments on 
this test collection have been captured using the SMART system. 

Experimentation in natural language processing is best supported by a large, 
comprehensive lexicon. The most efficient construction approach was to 
reformat machine-readable dictionaries and to parse entries into suitable 
structures. Because the G.&C. Merriam Company and the Longman Group 
Limited both refused to provide their dictionaries, it was decided to use four 
separate dictionaries obtained from the Oxford Text Archive [68-71] so that the 
resulting lexicon could be made freely available to other researchers. Initial 
efforts focused on the largest o f  these, the Collins Dictionary o f  the English 
Language. 

Development o f  CODER is taking place in the UNIX ~ environment. Pipes and 
TCP/IP sockets [72] allow intermodule and intermachine communication. Thus, 
procedural modules like interface managers can be coded in C or C+ + ,  a 
dialect supporting the class-object paradigm [73]. MU-Prolog was selected as 
the AI implementation language, as it includes a clause indexing facility for 
medium-size collections o f  facts or rules, and two types o f  database support 
[74]. The first scheme uses hashing, and the second employs a two-level 
superimposed coding scheme [75] that performs well for partial matches [76] 
and can easily support large Prolog databases [77]. MU-Prolog also has tools for 
information hiding, interfacing with the UNIX operating system, and reducing 
dependence on rule ordering and extra-logical operations. 

Implementation o f  the modules o f  the CODER system began early in 1986. At 
the end o f  the summer o f  1986, the knowledge administration and blackboard/ 
strategist complexes were nearly complete, the communication routines were 
well underway, interface managers using CURSES and SUN-Windows pack- 
ages had been prototyped (see Figure 12), a p-norm search routine had been 
tested, and a first version o f  the document-type specialist developed. Further 
coding according to the detailed specifications given in France [78] should lead 
to a working prototype in 1987. 

Subsequent efforts will aim first at demonstrating the feasibility o f  using 
CODER for document analysis and retrieval and at comparing different 
approaches to see if CODER will indeed simplify experimentation regarding the 
application o f  AI methods to ISR problems. Much o f  this work, however, will 
require development o f  a more refined lexicon, specification o f  heuristics 
regarding appropriate retrieval methods for particular types o f  queries, and 

Trademark of AT&T Bell Laboratories. 


170 Edward A. Fox and Robert K. France 

Blackboard T r a n s l a t i o n  I n t e r f a c e  I'lanager 
E x p e r t s  

Retrieval 

Black- 
board 

\ 

_F 

Feed@sck Expert 

Quer~ Parsing 
Expert 

Figure 12. Retrieval Subsystem User Interface 

thorough integration o f  user models into a truly interactive system for satisfying 
information needs. It is hoped that initial success with CODER will be followed 
by a long period o f  productive research and experimentation that will contribute 
to the human knowledge base about ISR. 

A C K N O W L E D G M E N T S  

Numerous students enrolled during the last two years in Virginia Tech's two- 
quarter graduate sequence on information storage and retrieval have played a 
role in the development o f  CODER. For an MS project, Robert Wohlwend made 
an initial version o f  the Prolog lexicon from the typesetting tapes o f  one major 
dictionary. Research assistants Mary Beth Weaver and Qi Fan Chen are 
currently involved in system implementation. Pat Cooper and Joy Weiss have 
provided secretarial support. 

The Oxford Text Archive and the Melbourne University Department o f  
Computer Science have supplied tapes with dictionaries and Prolog interpreters, 
respectively. The staff o f  the SUMEX Computer Project at the Stanford 
University Medical Center, with permission o f  the authors and William 
Kaufman, provided the machine-readable version o f  the Handbook o f  
Artificial Intelligence. Kenneth Laws has helped assure the project o f  a 
complete backfile o f  AIList Digest issues. 


Expert System/Composite Document Analysis 171 

References 

1. Neufeld, M. L., and Cornog, M., Database history: From dinosaurs to compact 
discs, J. A m .  Soc. Inf. Sci. 37(4), 183-190, 1986. 

2. Blair, D. C., and Maron, M. E., An evaluation of retrieval effectiveness for a full- 
text document-retrieval system, Commun. A C M  28(3), 289-299, 1985. 

3. Ojala, M., Views on end-user searching, J. A m .  Soc. Inf. Sci. 37(4), 197-203, 
1986. 

4. Williams, M. E., Transparent information systems through gateways, front ends, 
intermediaries, and interfaces, J. A m .  Soc. Inf. Sci. 37(4), 204-214, 1986. 

5. Fox, E. A., Information retrieval: Research into new capabilities, in CD-ROM: The 
N e w  Papyrus (S. Lambert and S. Ropiequet, Eds.), Microsoft Press, Redmond, 
WA, 143-174, 1986. 

6. Salton, G., The SMART system 1961-1976: Experiments in dynamic document 
processing, Encyclopedia o f  Library and Information Science 28, 1-36, 1980. 

7. Fox, E. A., Some considerations for implementing the SMART Information 
Retrieval System under UNIX, TR 83-560, Dept. of Computer Science, Cornell 
Univ., 1983. 

8. Buckley, C., Implementation of the SMART Information Retrieval System. TR 85- 
686, Dept. of Computer Science, Cornell Univ., 1985. 

9. Fox, E. A., Composite document extended retrieval: An overview, Research and 
Development in Information Retrieval, Eighth Annual Int. A C M  SIGIR 
Conference., Montreal, 42-53, 1985. 

10. Fox, E. A., Analysis and retrieval of composite documents, A S I S  "85, Proceedings 
o f  the 48th A S I S  Annual Meeting, 54-58, 1985. 

11. O'Connor, J., Answer-passage retrieval by text searching, J. A m .  Soc. Inf. Sci. 
31(4), 227-239, 1980. 

12. Kimura, G. D., A Structure Editor and Model for Abstract Document Objects, 
Dissertation, Tech. Report No. 84-07-04, Univ. of Washington, Dept. of Computer 
Science, 1984. 

13. Babatz, R., and Bogen, M., Semantic relations in message handling systems: 
Referable documents, Paper presented at IFIP Working Group 6.5 Symposium, 
1985. 

14. Horak W., and Kronert, G., An object-oriented office document architecture model 
for processing and interchange of documents, Proceedings o f  the Second A CM- 
SIGOA Conference on Office Information Systems, 152-160, 1984. 

15. Harper, D. J., Dunnion, J., Sherwood-Smith, M., and van Rijsbergen, C. J., 
Minstrel-ODM: A basic office data model, Inf. Proc. & Mgmt. 22(2), 83-107, 
1986. 


172 Edward A. Fox and Robert K. France 

16. Peels, A., Janssen, N., and Nawijn, W., Document architecture and text formatting, 
A C M  Trans. Office Inf. Sys. 3(4), 347-369, 1985. 

17. Rauch-Hindin, W., Upper level OSI protocols near completion, Mini-Micro 
Systems (18)9, 53-66, 1986. 

18. Tenopir, C., Full-text databases, A R I S T  19, 215-246, 1984. 

19. Salton, G., Another look at automatic text-retrieval systems, Commun. A C M  
29(7), 648-656, 1986. 

20. Katzer, J., et al., A study of the overlap among document representations, Inf. 
Tech.: Res. & Dev. 1(4), 261-274, 1982. 

21. Bichteler J., and Eaton HI, E. A., The combined use of bibliographic coupling and 
cocitation for document retrieval, J. A m .  Soc. Inf. Sci., 31(4), 278-282, 1980. 

22. Fox, E. A., Extending the Boolean and Vector Space Models of Information 
Retrieval and P-Norm Queries and Multiple Concept Types, Dissertation, Cornell 
Univ., University Microfilms Int., Ann Arbor, Mich., 1983. 

23. Voorhees, E. M., The Effectiveness and Efficiency of Agglomerative Hierarchic 
Clustering in Document Retrieval, Dissertation, TR 85-705, Dept. of Computer 
Science, Cornell Univ., 1985. 

24. Salton, G., Fox, E. A., and Wu, H., Extended boolean information retrieval, 
Commun. A C M  26(I1), 1022-1036, 1983. 

25. Salton, G., Wong, A., and Yang, C. S., A vector space model for automatic 
indexing, Commun. A C M  18(11), 613-620, 1975. 

26. Robertson, S. E., and Sparck Jones, K., Relevance weighting of search terms, J. 
A m .  Soc. Inf. Sci. 27(3), 129-146, 1976. 

27. Van Rijsbergen, C. J., Information Retrieval, 2nd ed., Butterworths, London, 
1979. 

28. Buckley, C., and Lewit, A. F., Optimization of inverted vector searches, Research 
and Development in Information Retrieval, Eighth Annual International A C M  
SIGIR Conference, Montreal, 97-110, 1985. 

29. Kowalski, R. A., Logic f o r  Problem Solving, Elsevier North-Holland, New York, 
1979. 

30. Genesereth, M. R., and Ginsberg, M. L., Logic programming, Commun. A C M  
28(9), 933-941, 1985. 

31. Clocksin, W. F., and Mellish, C. S., Programming in Prolog, 2nd ed., Springer- 
Verlag, New York, 1984. 

32. Pereira, F., Logic for Natural Language Analysis, Tech. Note 275, SRI Interna- 
tional, 1983. 

33. Helm, A. R., Marriott, K., and Lassez, C., Prolog for expert systems: An 


Expert System/Composite Document Analysis 173 

evaluation, Proceedings o f  the Expert Systems in Government Symposium, 284- 
293, 1985. 

34. Bobrow, D. G., If Prolog is the answer, what is the question? Or what it takes to 
support AI programming paradigms, I E E E  Trans. Software Eng. 11(11), 1401- 
1408, 1985. 

35. Nii, H. P., Blackboard systems: The blackboard model o f  problem solving and the 
evolution o f  blackboard architectures, A I  Mag. 7(2), 38-53, 1986. 

36. Belkin, N. J., Hennings, R. D., and Seeger, T., Simulation o f  a distributed expert- 
based information provision mechanism, Inf. Tech.: Res. Dev. Applications 3(3), 
122-141, 1984. 

37. Smith, L. C., and Warner, A. J., A taxonomy o f  representations in information 
retrieval system design, in Representation and Exchange o f  Knowledge as a Basis 
o f  Information Processes (H. J. Dietschmann, Ed.), North-Holland, New York, 
31-49, 1984. 

38. Levesque, H. J., A fundamental tradeoff in knowledge representation and reasoning, 
Proceedings o f  the Fifth CSCSI National Conference, London, Ontario, 141- 
152, 1984. 

39. Patel-Schneider, P. F., Brachman, R. J., and Levesque, H. J., ARGON: Knowledge 
Representation Meets Information Retrieval, Proceedings o f  the First Conference on 
Artificial Intelligence Applications, December 1984, Denver, CO. Washington, 
D.C.: IEEE Computer Society Press; 280-286, 1984. 

40. Minsky, M., A framework for representing knowledge, The Psychology o f  
Computer Vision, (P. Winston, Ed.), McGraw-Hill, New York, 1975. 

41. Fikes, R., and Kehler, T., The role o f  frame-based representation in reasoning, 
Commun. A C M  28(9), 904-920, 1985. 

42. Patel-Schneider, P. F., Small can be Beautiful in Knowledge Representation, 
Proceedings o f  the IEEE workshop on Principles o f  Knowledge-Based Systems. 
Denver, CO, 11-16, 1984. 

43. Zarri, G. P., An outline o f  the representation and use o f  temporal data in the 
RESEDA system, Inf. Tech.: Res. Dev. Applications 2(2/3), 89-108, 1983. 

44. Sparck Jones, K., and Kay, M., Linguistics and Information Science, Academic 
Press, New York, 1973. 

45. Sparck Jones, K., and Tait, J. I., Automatic search term variant generation, J. Doc. 
40(1), 50-66, 1984. 

46. Hahn, U., and Reimer, U., Heuristic text parsing in " T o p i c " :  Methodological 
issues in a knowledge-based text condensation system, in Representation and 
Exchange o f  Knowledge as a Basis o f  Information Processes (H. J. Dietschmann, 
Ed.), North-Holland, New York, 143-163, 1984. 

47. DeJong, G., An overview o f  the FRUMP system, in Strategies f o r  Natural 


174 Edward A. Fox and Robert K. France 

Language Processing, (W. G. Lehnert and M. H. Ringle, Eds.), Lawrence 
Erlbaum, Hillsdale, N.J., 149-176, 1982. 

48. Mauldin, M. L., Thesis proposal: Information retrieval by text skimming, 
unpublished manuscript, Dept. of Computer Science, Carnegie-Mellon Univ., 
Pittsburgh, Penn., 1986. 

49. Riesbeck, C. K., Realistic language comprehension, in Strategies f o r  Natural 
Language Processing, (W. G. Lehnert and M. H. Ringle, Eds.), Lawrence 
Edbaum, Hillsdale, N.J., 435-454, 1982. 

50. Selfridge, M., Integrated processing produces robust understanding, Comp. Ling. 
12(2), 89-106, 1986. 

51. Sager, N., Sublanguage grammars in science information processing, J. A m .  Soc. 
Inf. Sci. 26(1), 10-16, 1975. 

52. Amsler, R. A., Machine-readable dictionaries, A R I S T  19, 161-209, 1984. 

53. White, C., The linguistic string project dictionary for automatic text analysis, 
Proceedings o f  the Workshop on Machine-Readable Dictionaries, SRI, Menlo 
Park, Calif., 1983. 

54. Borgman, C. L., Psychological research in human-computer interaction, A R I S T  19, 
33-64, 1984. 

55. Sewell, W., and Teitelbaum, S., Observations of end-user online searching behavior 
over eleven years, J. A m .  Soc. Inf. Sci. 37(4), 234-245, 1986. 

56. Weyer, S. A., The design of a dynamic book for information search, Int. J. Man- 
Machine Stud. 17(1), 87-107, 1982. 

57. Weyer, S. A., and Borning, A. H., A prototype electronic encyclopedia, A C M  
Trans. on Office Info. Syst. 3(I), 63-68, 1984. 

58. Frei, H. P., and Jauslin, J. F., Graphical presentation of information and services: A 
user oriented interface, Inf. Tech.: Res. Dev. 2(1), 23-42, 1983. 

59. Oddy, R. N., Information retrieval through man-machine dialogue, J. Doc. 33(1), 
1-14, 1977. 

60. Yip, M.-K., An Expert System for Document Retrieval, MS Thesis, M.I.T., 1979. 

61. Thompson, R. H., and Croft, W. B., An expert system for document retrieval, 
Proceedings o f  the Expert Systems in Government Symposium, IEEE, 448-456, 
1985. 

62. McCune, B. P., et al., RUBRIC: A system for rule-based information retrieval, 
IEEE Trans. Software Eng. SE-11(9), 939-945, 1985. 

63. Tong, R. M., et al., A rule-based approach to information retrieval: Some results 
and comments, AAAI-83: Proceedings of the National Conference on Artificial 
Intelligence. Washington, DC, 411-415, 1983. 

64. France, R. K., and Fox, E. A., Knowledge structures for information retrieval: 


Expert System/Composite Document Analysis 175 

Representation in the CODER project, Proceedings of the Second IEEE Expert 
Systems in Government Conference, McLean, Va., 135-141, 1986. 

65. Sterling, L., Logical levels of problem solving, Proceedings o f  the Second 
International Logic Programming Conference, Uppsala Univ., Uppsala, Sweden, 
231-242, 1984. 

66. Cardelli, L., and Wegner, P., On understanding types, data abstraction, and 
polymorphism, A C M  Computing Surveys 17(4), 471-522, 1985. 

67. Wohlwend, R. C., Creation of a Prolog Fact Base from the Collins English 
Dictionary, MS Report, Dept. of Computer Science, Virginia Tech, Blacksburg, 
Va., 1986. 

68. Cowie, A. P., and Mackin, R., Oxford Dictionary of Current Idiomatic English. 
Volume 1: Verbs with Prepositions & Particles, Oxford UP, Oxford, England, 
1975. 

69. Cowie, A. P., Mackin, R., and McCaig, I. R., Oxford Dictionary of Current 
Idiomatic English. Volume 2: Phrase, Clause & Sentence Idioms. Oxford U.P., 
Oxford, England, 1983. 

70. Hanks, P. (Ed.), Collins Dictionary of the English Language, William Collins, 
London, 1979. 

71. Homby, A. S., (Ed.), Oxford Advanced Dictionary of Current English, Oxford, 
U.P., Oxford, England, 1974. 

72. Leffler, S. J., Fabry, R. S., and Joy, W. N., A 4.2BSD interprocess communication 
primer, in ULTRIX-32, Supplementary Documents, vol. III, 3-5-3-28, 1984. 

73. Stroustrup, B., The C+ + Programming Language, Addison-Wesley, Reading, 
Mass., 1985. 

74. Naish, L., MU-Prolog 3.2db Reference Manual, Dept. of Computer Science, 
Univ. of Melbourne, Melbourne, Australia, 1985. 

75. Sacks-Davis, R., and Ramamohanarao, K., A two level superimposed coding 
scheme for partial match retrieval, lnfo, Syst. 8(4), 273-280, 1983. 

76. Sacks-Davis, R., Performance of a multi-key access method based on descriptors 
and superimposed coding techniques, Info. Syst. 10(4), 391--403, 1985. 

77. Ramamohanaro, K., and Shepherd, J., A Superimposed Codeword Indexing Scheme 
for Very Large Prolog Databases, Tech. Report 85/17, Dept. of Computer Science, 
Univ. of Melbourne, Melbourne, Australia, 1985. 

78. France, R. K., An Artificial Intelligence Environment for Information Retrieval 
Research, MS Thesis, Dept. of Computer Science, Virginia Tech, Blacksburg, Va., 
1986.