Subject Headings in Full-Text 
Environments: The ECCO Experiment 

Jeffrey Garrett 

Bibliographic records regularly combine two incommensurable types of 
description: one that captures the physical and textual facts of a work, the 
other that seeks to encompass succinctly the work’s intellectual content. 
This article deals with the second type of bibliographic description: subject 
headings and their contribution to resource discovery. The article reports 
on an experiment at Northwestern University Library to add subject 
headings to online records for the Eighteenth Century Collections Online 
(ECCO). The author assesses the benefits of this enhancement by using 
a representative research topic: a search for contemporary material on 
the East India Company (1600–1873). This article extends arguments 
recently presented by Gross and Taylor (2005) in two directions: first, by 
considering the importance of subject headings for access to historical 
materials; and, second, by examining the value added by subject head-
ings even when the full text of a work is available online. 

If the focus of bibliographic 
description is the artifact (the 
precise capture of its physical 
and textual facts), the focus 
of subject headings work is the library 
user and his or her content-related needs 
and expectations.1 Ideally, subject head-
ings enable users not familiar with the 
literature of a field to identify and gather 
together relevant works, regardless of 
their authors, titles, disparate physical 
locations, and what other major topics 
they may treat. In light of this focus on 
user needs, subject heading assignment 
is pragmatic and heuristic rather than an 
exercise in truth and accuracy. Subject 
headings can indeed be simply wrong,2 
but normally the question to be asked 
is whether they are good or bad, helpful 
or irrelevant, rather than true (accurate) 
or false (incorrect). The assignment of 
subject headings cannot and need not con-
form to the same standards that apply to 
descriptive bibliography in the tradition 
of W. W. Greg and Fredson Bowers.3 

What descriptive accuracy is to cataloging and bibliography, consistency is 
to the assignment of subject headings. The use of carefully controlled 
vocabulary consistently applied and combined allows the creation of 
mechanical linkages between records, forming the basis for subject catalogs 
in both paper and electronic form. In this way, subject headings create 
multidimensional networks between works, complexities of relationship going 
far beyond the classificatory power of linear arrangements of books on 
shelves. These complex, multidimensional relationships ideally mirror in the 
aggregate what in 1945 Vannevar Bush called the “associative trails” between 
separate sources of information that human memory creates over a lifetime of 
learning and reading.4 Consistency requires further that controlled 
vocabulary be constantly updated and improved, since classificatory words 
and concepts, unlike the physical facts of a publication, change over time 
and gain or lose acceptance. Maintenance of useful subject catalogs and 
thesauri was a daunting if not impossible task in a pre-computer 
environment, but is perhaps more within our grasp today. 

Jeffrey Garrett is Assistant University Librarian for Collection Management at Northwestern University 
Library; e-mail: jgarrett@northwestern.edu. This paper originated as a presentation to the International 
Committee for the English Short Title Catalogue (IESTC) at their meeting on September 12, 2005, hosted by the 
American Antiquarian Society in Worcester, Massachusetts. The author especially thanks Gary L. Strawn, 
Authorities Librarian at Northwestern University Library, for his many contributions to this article. 

College & Research Libraries, January 2007 

Subject access to materials published 
in Great Britain between 1701 and 1800 
has always posed a particular problem for 
scholars and students. While the English 
Short Title Catalogue (ESTC) contains 
Library of Congress Subject Headings 
(hereafter: LCSH) for books from the English-speaking 
world printed before 1701, 
entries for material dating from 1701 to 
1800 are not subject indexed. The British 
Library and other major research libraries 
have historically acknowledged this prob-
lem, directing onsite users to Averley’s 
encyclopedic four-volume Eighteenth-Cen-
tury British Books5 or to the myriad print 
subject bibliographies that exist on topics 
both prominent and arcane, from English 
Maritime Books Printed before 1801 to Eng-
lish Cookery Books to the Year 1850.6 Users 
of the British Library’s online Integrated 
Catalogue, which derives its eighteenth-
century records from the ESTC and 
therefore lacks subject headings for works 
published in that century, are directed to 
scour book titles for subject information. 
Users “should find early books with a 
title (or series title) which encapsulates 
the work’s subject matter,” presumably 
through keyword searches in the title 
field. Yet there are problems, since: “Such 
a search will inevitably find a fair number 
of irrelevant items.”7 A far greater problem 
with the suggested search strategy 
than the generated bibliographic noise, 
however, is that title keyword searches 
miss an enormous amount of relevant 
material, since words we might use to 
describe a historical topic today are often 
entirely different from words populating 
actual eighteenth-century book titles. For 
a number of reasons, some having to do 
with changes in the lexicon, some with 
a century-specific perceived need for 
circumlocution, words such as “hygiene” 
and “prostitution” occurred far less fre-
quently in the eighteenth century than 
they do today, not to mention the often 
disastrous effects of pre-1800 orthography 
on modern-day keyword searches.8 

An already difficult situation has been 
aggravated recently by the sudden (and 
otherwise hugely welcome) availabil-
ity of a new and mammoth library of 
searchable eighteenth-century text online 
through Thomson Gale’s Eighteenth 
Century Collections Online (ECCO).9 
This new resource has meant that tens 
and even hundreds of thousands of stu-
dents and nonspecialist faculty at over a 
hundred universities across the world, 
from Germany, Great Britain, and the 
United States to Japan and Hong Kong, 
now have convenient full-text access 
to 150,000 eighteenth-century mono-
graphs with 26,000,000 pages of content. 
Bibliographic access has already been 
significantly enhanced by loading MARC 
records derived from the ESTC (i.e., 
without subject headings) into online 
catalogs, allowing users to easily locate 
known items, but also to encounter titles 
that appear relevant to their needs when 
search terms happen to match up with 
eighteenth-century title words. In effect, 
the British eighteenth century, which in-
cludes most of the American eighteenth 
century as well, has been transformed 
for the nonspecialist user from a biblio-
graphic terra incognita to what is almost 
literally an open book. In theory at least, 
students of the history of slavery, of the 
Age of Exploration, of disease and sanitation 
in both British and American cities, 
or of our own War of Independence now 
have direct access to the primary texts in 
which these topics were discussed and 
debated. 

But can our user communities take 
advantage of this enormous amount of 
material and truly find eighteenth-cen-
tury texts appropriate to their needs? 

Adding Subject Headings to ECCO 
In the summer of 2005, an experiment was 
conducted on a representative sample 
of bibliographic material contained in 
the OPAC of Northwestern University 
Library to help determine the benefits of 
adding subject headings to bibliographic 
records for pre-1800 monographs. Be-
fore describing this experiment, some 
background is necessary. Northwestern, 
like many other large research libraries, 
has acquired both Chadwyck-Healey’s 
Early English Books Online (EEBO)10 and 
ECCO, containing close to 125,000 and 
150,000 monographs, respectively. Both of 
these huge collections were acquired with 
MARC record sets, the former including 
LCSH, the latter, ECCO, almost completely 
without. In an effort to increase use of 
ECCO by its community and to realize 
to the fullest possible extent the benefit 
of this quarter-million-dollar investment, 
Northwestern harvested records from 
WorldCat containing subject headings for 
eighteenth-century titles in ECCO.11 In the 
harvest corpus of approximately 52,000 
records, a total of 30,948 different subject 
headings were applied 107,477 times—an 
average of slightly more than two subject 
headings per title. The subject headings 
thus harvested were then mapped onto 
ECCO records in Northwestern’s catalog 
(NUcat), leaving all other descriptive data 
untouched. About 80,000 ECCO records 
(of the total 132,000 searched) were left 
at the end of this project without subject 
headings (60.1%). The harvesting and 
mapping operations were planned and 
carried out in June 2005 by Gary Strawn, 
Authorities Librarian at Northwestern 
University Library. 
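
The match-and-copy logic of the harvesting operation (described more fully in note 11) can be pictured with a minimal Python sketch. It is not the Passport macro itself. The dictionary record layout, the field names `author`, `title`, `date`, and `subjects`, the helper names `normalize_title` and `harvest_subjects`, and the sample records are all assumptions made for illustration, and an in-memory list stands in for a WorldCat search.

```python
import re


def normalize_title(title):
    """Lowercase a title and reduce punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def harvest_subjects(ecco_records, source_records):
    """Copy subject headings from the first matching source record.

    ECCO records with no main entry (author) are skipped, and only the
    "subjects" key is written; other descriptive data stay untouched.
    Returns the number of ECCO records that gained headings.
    """
    enhanced = 0
    for ecco in ecco_records:
        if not ecco.get("author"):
            continue  # no main entry field: skip, as described in note 11
        wanted_title = normalize_title(ecco.get("title", ""))
        for candidate in source_records:
            if (
                candidate.get("author") == ecco["author"]
                and candidate.get("date") == ecco.get("date")
                and normalize_title(candidate.get("title", "")) == wanted_title
                and candidate.get("subjects")
            ):
                ecco["subjects"] = list(candidate["subjects"])
                enhanced += 1
                break  # use only the first matching record
    return enhanced


# Hypothetical records, invented for this sketch.
ecco_records = [
    {"author": "Example, Author", "title": "A Treatise on Trade.", "date": "1762"},
    {"author": "", "title": "An Anonymous Pamphlet", "date": "1770"},
]
source_records = [
    {
        "author": "Example, Author",
        "title": "A treatise on trade",
        "date": "1762",
        "subjects": ["East India Company--Early works to 1800"],
    },
]
enhanced = harvest_subjects(ecco_records, source_records)
print(enhanced, ecco_records[0]["subjects"])
```

Comparing normalized titles is one simple way to "make sure the titles corresponded"; the actual macro may well have applied a different test.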

An examination of the 30,000-plus 
individual subject headings retrieved 
and mapped yielded a highly interesting 
distribution. (See Appendix.) The most 
frequently occurring LCSH were for 
genres of religious literature (e.g., ser-
mons, hymns, works on “Christian life”). 
The heading “Sermons—English—18th 
Century,” for example, was assigned 
1776 times. The other subject headings 
cover—no surprise—a very broad range; 
for example: 

Walpole, Robert, Earl of Orford— 
1676–1745 (149 times) 
Seven Years War—1756–1763 
(113 times) 
Astronomy—Early works to 1800 
(89 times) 
Soul (51 times) 
Tobacco Industry—Great Britain 
(6 times) 
Slave Trade—Great Britain— 
Colonies (5 times) 

Actual corpus numbers for these 
subjects are usually quite a bit higher, 
of course, which we discover if relevant 
LCSH are grouped not by frequency, 
but alphabetically. We find, for example, 
that not one, but sixteen subject headings 
are devoted to Robert Walpole, the early 
eighteenth-century British prime minister, 
yielding a total of 194 occurrences—45 
more than the main heading alone.12 By 
locating keyword occurrences within sub-
ject headings, retrieval numbers go even 
higher: “Slavery” or “Slave Trade,” for 
example, are not always the initial words 
in assigned subject headings in which 
they occur (e.g., “Society of Friends—Slav-
ery”).13 This, of course, offers yet another 
reason why keyword searching across 
the entire bibliographic record is so im-
portant. It is supported by virtually all 
modern OPACs and vastly enhances the 
usefulness of subject headings beyond that 
of the stolidly alphabetized traditional 
subject catalog in paper form.14 
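
The difference between an alphabetical (left-anchored) lookup and a keyword lookup within subject headings can be illustrated with a short Python sketch. The two Walpole counts are those reported in this article; the three slavery-related rows, their counts, and the function names `browse_total` and `keyword_total` are hypothetical and serve only to show the mechanics.

```python
import re
from collections import Counter

# Heading -> number of assignments. The two Walpole counts come from the
# article; the three slavery-related rows are hypothetical.
# "--" and "-" stand in for the dashes used in printed headings.
heading_counts = Counter({
    "Walpole, Robert, Earl of Orford, 1676-1745": 149,
    "Walpole, Robert, Earl of Orford, 1676-1745--Poetry": 13,
    "Slavery--West Indies": 3,
    "Society of Friends--Slavery": 2,
    "Antislavery movements--Great Britain": 4,
})


def browse_total(counts, lead):
    """Left-anchored lookup, as in an alphabetized subject catalog."""
    lead = lead.lower()
    return sum(n for heading, n in counts.items() if heading.lower().startswith(lead))


def keyword_total(counts, word):
    """Whole-word lookup anywhere in the heading string."""
    word = word.lower()
    return sum(
        n for heading, n in counts.items()
        if word in re.findall(r"[a-z0-9]+", heading.lower())
    )


print(browse_total(heading_counts, "walpole"))   # both Walpole headings together
print(browse_total(heading_counts, "slavery"))   # only headings that begin with the word
print(keyword_total(heading_counts, "slavery"))  # also finds it as a subdivision
```

Because the keyword lookup matches whole words, "slavery" does not retrieve "Antislavery movements," which mirrors the left-truncation limitation mentioned in note 14.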

The extent and quality of the harvested 
subject headings vary markedly. This 
should also come as no surprise, since the 
records come from numerous different 
institutions and have been created over 
a span of many decades. There are, for 
example, occasional misspellings (e.g., 
one occurrence each of “Aantomy—Early 
works to 1800” and “Yokrtown (Va.)— 
History—Siege, 1781”). Additionally, we 
find various naïve mistakes (e.g., a subject 
heading identifying a certain “John Pre-
ster”—while the valid subject heading 
should read “Prester John [Legendary 
character]”). Far more frequently, we 
encounter vague and therefore virtually 
meaningless subject headings (e.g., “Great 
Britain—Politics and Government”: 155 
occurrences). Others, meaningless per-
haps in a print or card catalog environ-
ment (e.g., “Biography”: 17 occurrences), 
take on new significance and utility when 
used in Boolean searches. 

Inconsistencies in subject heading 
assignment are, as is well known, ram-
pant—but given new search methodolo-
gies in an online environment, they may 
not really be all that critical. Here, too, it 
helps that users today find what they are 
looking for by using subject headings not 
as verbatim search expressions, but as 
sources for frequently unique keyword 
material. Boolean searches across subject 
headings, combining, say, “bengal” with 
“east india company,” represent discov-
ery possibilities that are extraordinarily 
quick and efficient to execute online, very 
difficult to replicate in card catalogs, and 
far more precise (i.e., likely to be success-
ful) than searches using only title page 
information. 

By contrast to the many cases where 
too few or too general subject headings 
have been created for individual works, 
we also record, gratefully, frequent 
examples of the opposite, where rich 
and diverse subject headings expose an 
otherwise invisible work to discovery 
from numerous different angles. Take, for 
example, one of the more detailed records 
acquired through the ECCO WorldCat 
harvesting operation, for a work entitled 
An authentic account of the proceedings of 

their High Mightinesses, the states of Hol-
land and West-Friezeland, on the complaint 
laid before them by His Excellency Sir Joseph 
Yorke, His Britannic Majesty’s Ambassador at 
the Hague . . . (1762), for which the follow-
ing subject headings were harvested from 
WorldCat and mapped onto the existing 
ECCO record: 

East India Company—Political activity—Early works to 1800.
Nederlandsche Oost-Indische Compagnie—Political activity—Early works to 1800.
Anglo-French War, 1755–1763—Influence—Early works to 1800.
India—Commerce—Europe—Early works to 1800.
Europe—Commerce—India—Early works to 1800.
Bengal (India)—History—18th century.
Netherlands—Relations—Great Britain—Early works to 1800.
Great Britain—Relations—Netherlands—Early works to 1800.
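
Using this record, the Boolean search described earlier (combining "bengal" with "east india company") can be sketched in a few lines of Python. The dictionary layout and the function name `matches_all` are assumptions for illustration; the title is quoted only as far as it is given above, and three of the eight headings are reproduced, with "--" standing in for the printed dashes.

```python
def matches_all(record, phrases, fields):
    """Boolean AND: every phrase must occur somewhere in the chosen fields."""
    parts = []
    for field in fields:
        value = record.get(field, "")
        parts.extend(value if isinstance(value, list) else [value])
    # Join with a separator so a phrase cannot match across two fields.
    text = " | ".join(parts).lower()
    return all(phrase.lower() in text for phrase in phrases)


record = {
    "title": "An authentic account of the proceedings of their High "
             "Mightinesses, the states of Holland and West-Friezeland . . .",
    "subjects": [
        "East India Company--Political activity--Early works to 1800",
        "Bengal (India)--History--18th century",
        "Netherlands--Relations--Great Britain--Early works to 1800",
    ],
}
query = ["bengal", "east india company"]
found_with_subjects = matches_all(record, query, fields=("title", "subjects"))
found_title_only = matches_all(record, query, fields=("title",))
print(found_with_subjects, found_title_only)
```

In this sketch the record is retrieved only when the subject headings are searched, since neither query term appears in the portion of the title quoted here.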

Against this background we can ask: 
Will the addition of this considerable 
mass of subject heading material signifi-
cantly improve user access to ECCO and, 
in this way, access to eighteenth-century 
materials more generally? And: What 
would be the hoped-for benefit of con-
tinuing this harvesting operation? 

Adequacy and Impact of the New 
ECCO Subject Headings 
To make a preliminary determination of 
the value of the subject heading harvest-
ing and mapping operation, we decided 
to conduct an initial demonstration study 
on a single, representative research topic. 
The topic we chose was the one just en-
countered, namely the British East India 
Company.15 The selection of the East India 
Company made sense as it was a corpo-
rate entity that existed for much of the 
early modern period, beginning in 1600, 
and is still referred to by the same name 
today, which allowed us to set aside 
questions of lexical or semantic change. 
There were many other possible topics 
(e.g., witchcraft, slavery, urban sanitation, 
and any number of prominent and less 
prominent historical, literary, and legend-
ary persons), all of which have in common 
that they figure as research topics chosen 
by members of a typical academic com-
munity. We felt that if the results of the 
present study appeared promising, a 
more extensive sample could show the 
impact of LCSH record enhancement for 
study across the spectrum of student and 
faculty research. 

The sample upon which the experi-
ment was based was taken by North-
western library staff on August 31, 2005, 
and the results reported here reflect the 
contents of the OPAC on that date.16 Here 
were the instructions: 

• Search Northwestern’s OPAC, 
NUcat, to find as many relevant pre-1800 
monographs as possible touching on the 
history of the East India Company.17 

• Focus on the discovery of electroni-
cally available materials in Early English 
Books Online and Eighteenth Century 
Collections Online. 

• Attempt to determine how much 
material is now discoverable through 
the addition of subject headings to 52,000 
ECCO records—but also seek to quantify 
how much has been left undiscovered by 
the absence of a complete subject head-
ings file for ECCO. 

• Throughout, use a search method-
ology typical of normal use by normal 
users (i.e., without consulting print bibli-
ographies, reference librarians, or subject 
specialists). Also, as everyone knows, 
normal users do not perform subject 
searches, so these were not used here.18 

Here are the results of the study: 
In Northwestern’s online catalog NUcat, 
the keyword phrase “east india company” 
occurred in 1,101 records overall. 
Of these records, 787 were for pre-1800 
imprints. Of these 787, 699 (89%) were 
available electronically—testimony to the 
enormous impact that acquisitions such 
as EEBO and ECCO have had on early 
modern resources that libraries can now 
make available to their communities. 

Of the stated 699 records for elec-
tronically available items containing the 
keyword phrase “east india company,” 
430 (61.5%) had the phrase “east india 
company” in the subject headings field, 
while 269 did not. Of the 430 that did, 134 
(31.2%) were in EEBO, 296 (68.8%) were 
in ECCO, the latter, of course, all being 
subject headings added from WorldCat 
records in the harvesting operation de-
scribed above. Of the 269 that did not have 
“east india company” in the subject head-
ings, but instead somewhere else in the 
record, 28 (10.4%) were in EEBO,19 while 
the great majority, 245 (91.1%), were—not 
surprisingly—in ECCO.20 

Now, how many of the 430 records 
with “east india company” in the subject 
headings field would have been dis-
covered by the keyword phrase search 
“east india company” without the subject 
headings? Or phrased another way: how 
many of these 430 records had “east india 
company” both in the subject headings 
and somewhere else in the record? There 
was no Boolean search capable of providing 
a satisfactory answer to this question. 
Therefore, two random samples21 of 50 
records each were prepared from the 
larger EEBO and ECCO retrieval sets (134 
and 296 records, respectively), with the 
following results: 

• In EEBO, 31 of the 50 sample re-
cords (62%) had “east india company” 
in the subject headings field but nowhere 
else in the record. Some creative searching 
would have been necessary on the part 
of library users to turn up a number of 
alternative spellings in the actual biblio-
graphic records, among these: East-Indie 
Companie - East-Indye Company - East-
India Companies - East India Co. - East 
India House. But many searchers would 
simply not have found these records 
using a simple keyword phrase search. 
Most importantly, however: a clear major-
ity of works identified by catalogers as 
pertaining significantly to the East India 

http:Company.17


 

      
   

      

      
       
   

    

    
     

     
       

      
       

     

     

       
     

     

   

      
       

     

    

    
     

     

       

       

      

      
      

     
    

      
 

      

     

    
       

     
     

74 College & Research Libraries January 2007 

Company lack any direct reference to this 
corporate body in the record—other than 
in the subject field. 

• In ECCO: 30 of 50 sample records 
(60%) had “east india company” only in 
the subject headings field. Of the 20 that 
had the phrase elsewhere in the record as 
well, four occurrences were in the 
Notes field (MARC 500) and one was in 
the Alternative Titles field (MARC 246) as 
a docket title. 
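
The check applied to each sampled record, namely whether the phrase occurs in the subject headings and nowhere else, can be expressed as a small Python sketch. The record layout, the function names `phrase_only_in_subjects` and `count_subject_only`, and the three sample records (including their titles, notes, and heading subdivisions) are hypothetical; the actual samples were examined in NUcat.

```python
def phrase_only_in_subjects(record, phrase):
    """True if the phrase occurs in the subject headings and nowhere else."""
    phrase = phrase.lower()
    in_subjects = any(phrase in s.lower() for s in record.get("subjects", []))
    elsewhere = any(
        phrase in str(value).lower()
        for field, value in record.items()
        if field != "subjects"
    )
    return in_subjects and not elsewhere


def count_subject_only(sample, phrase):
    """Number of sampled records that only the subject headings would retrieve."""
    return sum(phrase_only_in_subjects(record, phrase) for record in sample)


# Hypothetical three-record sample; titles, notes, and subdivisions are invented.
sample = [
    {"title": "A true relation of the East-Indie Companie",
     "subjects": ["East India Company--History"]},
    {"title": "Considerations on the East India Company",
     "subjects": ["East India Company--Finance"]},
    {"title": "A letter to a proprietor",
     "notes": ["Concerns the East India Company."],
     "subjects": ["East India Company--Management"]},
]
subject_only = count_subject_only(sample, "east india company")
print(subject_only, "of", len(sample))
```

The first record shows why variant spellings matter: its title names the company, but not in a form that a simple phrase search would match, so only the subject heading retrieves it.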

To assess the number of works of pos-
sible relevance to this search that remain 
undiscovered in ECCO due to lack of 
subject access, it is necessary to project the 
results above into the corpus of electroni-
cally available material currently lacking 
subject headings in our catalog. 

Recall that as of the date of this experi-
ment, Northwestern had loaded 52,000 
MARC records for ECCO titles enhanced 
with subject headings harvested from 
WorldCat, leaving 80,000 for which no 
subject headings had yet been harvested 
and mapped onto catalog records for 
ECCO titles. Many of these will, of course, 
be belletristic in nature (i.e., would not 
under policies in effect at the time of 
cataloging have had subject headings as-
signed). For this and other good reasons, 
it would be questionable to just extend 
the figures that apply to the 52,000 to the 
balance of the corpus. 

A more promising but very speculative 
method would be to take the 245 ECCO 
records with “east india company” some-
where in the record but not in the subject 
heading as “the 40%” of records in the 
as-yet unharvested part of ECCO and 
add to them “the 60%,” or 367 estimated 
records with “east india company” in 
subject headings awaiting harvesting or 
creation. 

These investigations (and specula-
tions) yield the following final figures for 
Northwestern’s library: 

Findings 
The WorldCat harvesting project added 
177 titles to the 610 pre-1800 imprints 
having to do with the East India Company 

discoverable before the project. The new 
total is 787, an increase of 29%. 

An estimated 367 pre-1800 imprints having to 
do with the East India Company are currently 
bibliographically undiscoverable in ECCO. Through 
further harvesting operations (for example, 
from other library catalogs and, at some point, 
through original cataloging of the remaining 
ESTC records), we could increase the 
total retrieval to 1,154 titles in our OPAC, 
a projected 89.2% increase over the 610 titles 
discoverable before the project, or 46.6% over 
the current 787. 
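
The arithmetic behind these figures can be retraced in a few lines of Python. The input numbers are those reported in this article; the variable names and the helper `percent_increase` are illustrative, and truncating 367.5 to 367 is an assumption made to match the reported estimate.

```python
from fractions import Fraction

# Figures reported in the article.
before_project = 610               # pre-1800 titles discoverable before harvesting
added_by_harvest = 177             # titles made discoverable by harvested headings
ecco_hits_outside_subjects = 245   # ECCO records with the phrase, but not in a heading
subject_only_share = Fraction(30, 50)  # ECCO sample: phrase only in the headings

current_total = before_project + added_by_harvest

# If the 245 records are "the 40%", the matching "60%" is 245 * 0.6 / 0.4.
estimated_hidden = (
    ecco_hits_outside_subjects * subject_only_share / (1 - subject_only_share)
)
estimated_hidden_titles = int(estimated_hidden)  # 367.5 truncated to 367

projected_total = current_total + estimated_hidden_titles


def percent_increase(new, old):
    """Percentage growth from old to new, rounded to one decimal place."""
    return round(float(Fraction(new - old, old) * 100), 1)


print(current_total, percent_increase(current_total, before_project))
print(projected_total, percent_increase(projected_total, before_project))
print(percent_increase(projected_total, current_total))
```

The 89.2% figure results when the projected total is compared with the 610 titles discoverable before the project; measured against the current 787, the same projection is an increase of 46.6%.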

We can only guess at the significance of 
this vastly improved bibliographic trans-
parency of eighteenth-century materials 
for students and for scholars. 

It is interesting to compare the results 
reported here with those of a recent study 
by Tina Gross and Arlene G. Taylor pre-
sented in College & Research Libraries, May 
2005. Their study asked: 

. . . what proportion of records re-
trieved by a keyword search has a 
keyword only in a subject heading 
field and thus would not be retrieved 
if there were no subject headings? It 
was found that more than one-third 
of records retrieved by successful 
keyword searches would be lost if 
subject headings were not present, 
and many individual cases exist in 
which 80, 90, and even 100 percent 
of the retrieved records would not 
be retrieved in the absence of subject 
headings.22 

The fact is that the assignment of 
descriptive language in the subject heading 
fields frequently attaches important 
terms and concepts to a bibliographic 
record that the record will not otherwise 
contain. 

Once again, the “experiment” reported 
here should not be regarded as an exhaus-
tive scientific study—in part because the 
author openly confesses to lacking the 
statistical skills necessary to conduct one. 
Further research is necessary, as will be 

http:headings.22


      

      

       

      
    

      
     

    

     

    
       
      

        
     

       
      

     
      

       
      

      
      

        
       

       
      
       

         
      

      

 
       
      

      
       
    

      
      

       
      

        
      

       
 

     
      

        
        

        
     

    
     

     
     

     
        

   

      
      
      

     
     

      
    

      

         
    

  

    

    

      

Subject Headings in Full-Text Environments: The ECCO Experiment 75 

described in a moment. Still, the study 
does appear to point to the potential im-
provements in discovery and access that 
the addition of subject headings to ECCO 
records has already brought at one university 
and would continue to bring if this 
work were continued. Inasmuch as ESTC 
records are also intended to improve ac-
cess and facilitate discovery in addition 
to being a faithful reproduction of the 
bibliographic facts of publication, they, 
too, would benefit from this addition.23 

What About Keywords in Full-text? 
Let us take this discussion one step fur-
ther by considering the role of subject 
headings in an environment in which 
users can also search the full text of an 
eighteenth-century work. In the article 
by Gross and Taylor already mentioned, 
this important question (i.e., the relevance 
of subject headings in an era of full-text 
searching) was not considered. Yet pre-
cisely this question is what is troubling 
administrators of large American librar-
ies who must choose to fund classifica-
tion work or not—among them, Deanna 
Marcum of the Library of Congress: “[I]n 
the age of digital information, of Internet 
access, of electronic key-word search-
ing,” she asks, “just how much do we 
need to continue to spend on carefully 
constructed catalogs?”24 

In response, it can be readily shown that 
keyword searching in full-text databases 
is no substitute for searches run against 
OPACs or other bibliographic files with 
ample descriptors and subject headings. 
This observation applies a fortiori to 
historical text files such as ECCO. The demonstrable 
fact is that full-text searching 
of eighteenth-century texts often does not 
retrieve examples of terms that describe 
the work as a whole or even important 
topics or aspects of the work, especially 
as we might describe them today. Indeed, 
those researching the topic of urban sani-
tation in the eighteenth century might be 
surprised to learn that there is not a single 
valid occurrence of the word “sanitation” 
in the entire 26,000,000-page ECCO corpus. 

Even the word “hygiene” occurs in the 
full text of only 50 ECCO titles—while 
just the subject headings already mapped 
to ECCO MARC records include the 
word “hygiene” in records for 60 ECCO 
titles. Northwestern’s catalog NUcat, 
now “souped up” through the addition 
of the 52,000 imported records, currently 
retrieves 67 ECCO titles that include the 
word “hygiene” somewhere in the record 
(i.e., 34% more than a full-text search of 
the entire ECCO database would yield). 
Presumably, the further back in time we 
go, the greater the disjunction becomes 
between accurate descriptive terms and 
the words actually occurring in a work— 
which is why the selection of “east india 
company” as a search string was, for the 
purposes of this study, such a benign one. 
With foreign-language works, of course, 
the disjunction approaches 100%. Meta-vo-
cabulary, therefore, performs an important 
hermeneutic and heuristic function in bib-
liographic searching and discovery, across 
centuries and across languages—and, as 
we have seen, even across states of the 
same language over time. 

Future studies will need to explore the 
impact of subject headings for searches fo-
cusing not on proper names and corporate 
entities (such as the East India Company), 
but on more arcane phenomena (e.g., 
“beaver hats”) and, above all, concepts 
used as subject headings (e.g., “conduct of 
life,” “the sublime,” “imaginary conversa-
tions”), since these may not map verbatim 
to eighteenth-century discourses and are 
also more susceptible than proper nouns 
and identifiers for material objects to go 
into and out of use in a matter of decades. 
Another direction for further research 
could be to investigate whether “smart” 
relevance-determining algorithms run 
against full text can produce distillations 
of content, replacing the need for manu-
ally assigned subject headings. These 
descriptions may not even be represent-
able as LCSH-like verbal strings—and, 
in terms of aiding users, may not need to 
be. If, however, visible-readable subject 
headings are still felt to be necessary, 

http:addition.23


 

 

 
              

 

 

           

 

             
          

 

             

 
 
             

 

 

 

            

 

                

     
     

    
     

76 College & Research Libraries 

couldn’t they be derived and applied 
by these same automated text analysis 
procedures? The personal prejudice of 
this author, I confess, is that good subject 
headings, like good content analysis and 
text interpretation, will likely always remain 
a matter of “intelligent design,” the 
“intelligence” here being that which human 
agents bring to bear on interpreting 
the human record. 

Notes 

1. Kathryn Luther Henderson, “Subject Headings,” in Encyclopedia of Library History, ed. 
Wayne A. Wiegand and Donald G. Davis, Jr. (New York & London: Garland, 1994), 605. 

2. Flat-out mistakes in the assignment of subject headings are unfortunately not uncom-
mon—their consequences will be considered later in this article. For now, let one amusing example 
suffice, a children’s book the author found once in a catalog entitled The Travels of Magnus Pole, 
about a fictional Viking bored with his life on the Shetland Islands who sets out in a dinghy for 
the Levant. Its one subject heading reads: “Polo, Marco, 1254–1323?—Juvenile literature.” 

3. Fredson Bowers, Principles of Bibliographical Description (Princeton, N.J.: Princeton Uni-
versity Press, 1949); W. W. Greg, Collected Papers, ed. J. C. Maxwell (Oxford: Clarendon Press, 
1966). 

4. Vannevar Bush, “As We May Think,” Atlantic Monthly (July 1945): 106. Available online 
at http://www.ps.uni-sb.de/~duchier/pub/vbush/vbush-all.shtml. 

5. G. Averley, Eighteenth-century British Books: A Subject Catalogue Extracted from the 
British Museum General Catalogue of Printed Books, 4 vols. (Folkestone: Dawson, 1979). 

6. Thomas Randolph Adams and David Watkin Waters, English Maritime Books Printed 
Before 1801: Relating to Ships, Their Construction and their Operation at Sea; Including Articles 
in the Philosophical Transactions of the Royal Society and the Transactions of the American 
Philosophical Society (Providence, R.I.; Greenwich, England: John Carter Brown Library; National 
Maritime Museum, 1995); Arnold Whitaker Oxford, English Cookery Books to the Year 1850 
(London; New York [etc.]: Oxford University Press; H. Frowde, 1913). 

7. See The British Library’s “Subject Access to Early Printed Materials in the British Library.” 
Available online at http://www.bl.uk/collections/early/subject.html. 

8. Cf. Jeffrey Garrett, “KWIC and Dirty? Human Cognition and the Claims of Full-Text
Searching,” Journal of Electronic Publishing 8, no. 1 (2006). Available online at
http://hdl.handle.net/2027/spo.3336451.0009.106. 

9. More information available online at http://www.gale.com/EighteenthCentury/. 

10. For further information, see http://hdl.handle.net/2027/spo.3336451.0009.106. 

11. As a rule, these were catalog records created for print versions of eighteenth-century
monographs and contributed by OCLC member libraries. The actual harvesting work involved
first dividing the 132,000 ECCO MARC records into files of about 10,000 records each. A macro
written to run under OCLC’s Passport program serially read the records in each of these files.
For each record, the macro performed a single search in OCLC that combined author, title, and
date criteria, skipping over any ECCO record with no main entry field. Records retrieved by this
search were examined by the macro to make sure the titles corresponded. In a final macro step,
the LCSH subject headings were copied bodily from the first such matching OCLC record into
the corresponding ECCO record. 

12. The subject heading “Walpole, Robert, Earl of Orford, 1676–1745—Poetry,” for example, 
was assigned thirteen times. 

13. The fact that “Society of Friends—Slavery” is not a valid LCSH string has not prevented 
it from being used. Thanks to Gary Strawn for this observation. 

14. Of course, not even the most sophisticated OPACs currently support left truncation,
meaning, at least for now, that “slavery” will be found by user-initiated keyword searches, but
not “antislavery.” 

15. As historical background, the East India Company was incorporated by royal charter on 
December 31, 1600, and, after merging with a rival in 1708, was renamed the United Company
of Merchants of England trading to the East Indies, or the United Company for short. It lost its 
trading monopolies beginning in the late 18th century and ceased to exist as a legal entity in 1873. 
(Source: Encyclopaedia Britannica.) 

16. http://nucat.library.northwestern.edu/ 

17. To be as complete and as circumspect as possible, even with the addition of subject headings
containing “East India Company,” some information will be overlooked due to inconsistencies
in heading assignment. Consider, for example, subject headings beginning with “East Indies…”
(e.g., “East Indies—Commerce—Great Britain—Early Works to 1800”). The total number of occurrences
of subject headings beginning with “East Indies…” is around 20 in the existing database
of harvested ECCO records. 

18. Interesting in this context and on the topic of subject headings as a whole: Thomas Mann, 
“Research at Risk,” Library Journal 130, no. 12 (2005). 

19. The search key used was (“early english books online”)[in Keyword Anywhere] AND 
(“east india company”)[in Keyword Anywhere] NOT (“east india company”)[in Subject]. 

20. I have not yet been able to explain the minor discrepancy of four records (28+245=273, not 
269), but I don’t believe it invalidates the general thrust of this argument. 

21. Titles retrieved were sorted in alphabetical order, yielding for the purposes of this
investigation a fairly random sample. 

22. Tina Gross and Arlene G. Taylor, “What Have We Got to Lose? The Effect of Controlled 
Vocabulary on Keyword Searching Results,” College & Research Libraries 66, no. 3 (2005): 212. 

23. Shortly after completion of this manuscript, The British Library announced that it intended
to add LCSH to all 18th-century records in the ESTC by autumn 2007. The subject headings 
extracted by Northwestern from the WorldCat database will constitute an important part of this 
enhancement project. 

24. Deanna B. Marcum, “The Future of Cataloging: Address to the Ebsco Leadership Seminar,” 
in ALA Midwinter Meeting (Boston, Mass.: 2005), 1. 
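
The harvesting procedure described in note 11 can be sketched roughly as follows. This is a minimal Python illustration, not the Passport macro itself. The record layout (plain dictionaries with `author`, `title`, `date`, and `subjects` keys), the `search` callable standing in for an OCLC query, and the title-comparison rule are all assumptions made for the example; the article says only that the macro checked that "the titles corresponded."

```python
import re
from typing import Callable, Iterator, List

Record = dict  # hypothetical simplified record, not a real MARC structure


def batches(records: List[Record], size: int = 10_000) -> Iterator[List[Record]]:
    """Split the full record set into files of about `size` records each."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def normalize_title(title: str) -> str:
    """Assumed comparison rule: lowercase, drop punctuation, collapse spaces."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def titles_correspond(a: str, b: str) -> bool:
    return normalize_title(a) == normalize_title(b)


def harvest_subject_headings(
    records: List[Record],
    search: Callable[[str, str, str], List[Record]],
) -> int:
    """Copy subject headings from the first matching catalog record.

    Returns the number of records that received headings.
    """
    enriched = 0
    for batch in batches(records):
        for rec in batch:
            # Skip records with no main entry (modeled here as `author`).
            if not rec.get("author"):
                continue
            # One search combining author, title, and date criteria.
            hits = search(rec["author"], rec["title"], rec["date"])
            for hit in hits:
                if titles_correspond(rec["title"], hit["title"]):
                    # Headings are copied wholesale from the first match.
                    rec["subjects"] = list(hit.get("subjects", []))
                    enriched += 1
                    break
    return enriched
```

Because headings are taken from the first matching record without review, any error or invalid string in that record is carried over unchanged, which is consistent with the inconsistencies noted in notes 13 and 17.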




 

 

 

 

 

78 College & Research Libraries January 2007 

APPENDIX 
Selections from the Subject Headings List 

Group I: Most frequently assigned subject headings 
1776 SERMONS ENGLISH 18TH CENTURY 
1201 CHURCH OF ENGLAND SERMONS 
915 SERMONS ENGLISH 
844 ENGLISH DRAMA 
504 GREAT BRITAIN POLITICS AND GOVERNMENT 1760–1789 
461 CHURCH OF ENGLAND SERMONS 18TH CENTURY 
392 UNITED STATES POLITICS AND GOVERNMENT 1775–1783 
380 GREAT BRITAIN POLITICS AND GOVERNMENT 1727–1760 
367 HYMNS ENGLISH 
364 FUNERAL SERMONS 
350 GREAT BRITAIN POLITICS AND GOVERNMENT 1702–1714 
296 CHRISTIAN LIFE 
293 CONDUCT OF LIFE 
290 OPERAS LIBRETTOS 
284 GREAT BRITAIN POLITICS AND GOVERNMENT 1789–1820 
265 DISSENTERS RELIGIOUS ENGLAND 
253 APOLOGETICS EARLY WORKS TO 1800 
232 GREAT BRITAIN FOREIGN RELATIONS FRANCE 
223 SOUTH SEA COMPANY 
211 FRANCE FOREIGN RELATIONS GREAT BRITAIN 
210 FRANCE HISTORY REVOLUTION 1789–1799 
207 SPANISH SUCCESSION WAR OF 1701–1714 
201 IRELAND HISTORY THE UNION 1800 
200 DEBTS PUBLIC GREAT BRITAIN 

Group II: A selection of relatively frequently applied subject headings 
21 ADULTERY 
21 ANGLICAN COMMUNION ENGLAND LITURGY TEXTS 
21 APOLOGETICS HISTORY 17TH CENTURY 
21 BEDFORD FRANCIS RUSSELL DUKE OF 1765–1802 
21 BEES 
21 BRIEFS GREAT BRITAIN 
21 BURNET GILBERT 1643–1715 
21 CALVINISM 
21 CHRISTIAN BIOGRAPHY 
21 CHURCH OF ENGLAND DOCTRINES EARLY WORKS TO 1800 



 

 


21 CHURCH OF ENGLAND FINANCE 
21 CLASSICAL BIOGRAPHY 
21 CONVERSION EARLY WORKS TO 1800 
21 COURT RULES GREAT BRITAIN 
21 EDUCATION IRELAND 
21 ELECTIONS GREAT BRITAIN 
21 ELECTIONS IRELAND DUBLIN 
21 ENGLAND AND WALES CORPORATION ACT 1661 
21 EUROPE POLITICS AND GOVERNMENT 1789–1815 
21 FREDERICK II KING OF PRUSSIA 1712–1786 
21 FRENCH LANGUAGE DICTIONARIES ENGLISH 
21 GREAT BRITAIN FOREIGN RELATIONS 
21 GREAT BRITAIN FOREIGN RELATIONS 1702–1714 
21 GREAT BRITAIN FOREIGN RELATIONS TREATIES 
21 GREAT BRITAIN HISTORY COMIC SATIRICAL ETC 
21 IMAGINARY CONVERSATIONS 
21 JESUITS CONTROVERSIAL LITERATURE 
21 LOVE RELIGIOUS ASPECTS CHRISTIANITY 
21 MATHEMATICS 
21 NEWTON ISAAC SIR 1642–1727 PRINCIPIA 
21 NUMISMATICS GREAT BRITAIN 
21 ROCHEFORT EXPEDITION 1757 
21 SERMONS ENGLISH SCOTLAND 18TH CENTURY 
21 SLAVE TRADE AFRICA 
21 SOCIETY FOR PROMOTING CHRISTIAN KNOWLEDGE GREAT BRITAIN 
21 SOCIETY OF FRIENDS ENGLAND 
21 STEELE RICHARD SIR 1672–1729 
21 SUBLIME THE 
21 SUGAR TRADE GREAT BRITAIN 
21 TITHES 

Group III: A selection of relatively infrequently applied subject headings 
5 ABELARD PETER 1079–1142 CORRESPONDENCE 
5 ABSENTEE LANDLORDISM 
5 ACHILLES GREEK MYTHOLOGY POETRY 
5 ACTORS GREAT BRITAIN BIOGRAPHY 
5 ACTRESSES CORRESPONDENCE REMINISCENCES ETC 



 

 

 

  

 


5 ADAMS JOHN 1735–1826 
5 ADAMS WILLIAM 1706–1789 TEST OF TRUE AND FALSE DOCTRINES 
5 AGIS II KING OF SPARTA DRAMA 
5 AGRICULTURE ECONOMIC ASPECTS GREAT BRITAIN 
5 AGRICULTURE WALES 
5 AGRICULTURE WALES NORTH 
5 AIR EARLY WORKS TO 1800 
5 ALCOHOLISM GREAT BRITAIN 
5 ALFRED KING OF ENGLAND 849–899 DRAMA 
5 ALIENS GREAT BRITAIN 
5 AMHURST N NICHOLAS 1697–1742 
5 ANALOGY RELIGION 
5 ANGLICAN COMMUNION ENGLAND SERMONS 
5 ANIMAL BEHAVIOR 

Group IV: A selection of unique subject headings 
1 MATHEMATICS TO 1800 
1 MATHEMATICS UNITED STATES EARLY WORKS TO 1800 
1 MATHER ALEXANDER 1733–1800 DEFENCE OF THE CONDUCT OF THE CONFERENCE IN THE EXPULSION OF ALEXANDER KILHAM 
1 MATHER ALEXANDER APPEAL WITH A WORD OF ADVICE TO THE METHODIST SOCIETIES 
1 MATHER COTTON 1663–1728 BIBLIOGRAPHY 
1 MATHER COTTON 1663–1728 MAGNALIA CHRISTI AMERICANA 
1 MATHER INCREASE 1639–1723 BRIEF DISCOURSE CONCERNING THE UNLAWFULNESS OF THE COMMON PRAYER WORSHIP 
1 MATHER NATHANAEL 1631–1697 
1 MATHER SAMUEL 1626–1671 
1 MATHER SAMUEL 1706–1785 ALL MEN WILL NOT BE SAVED FOREVER 
1 MATHEWS RICHARD 1676–1751 
1 MATHEWS THOMAS 1676–1751 ACCOUNT OF WHAT PASSD IN THE ENGAGEMENTS NEAR TOULON 
1 MATHEWS THOMAS 1676–1751 ADMIRAL MATHEWSS REMARKS ON THE EVIDENCE GIVEN AND THE PROCEEDINGS HAD ON HIS TRIAL 
1 MATHIAS THOMAS JAMES 1754–1835 SHADE OF ALEXANDER POPE ON THE BANKS OF THE THAMES 
1 MATHIAS THOMAS JAMES 1754–1835 THE PURSUITS OF LITERATURE 
1 MATTER EARLY WORKS TO 1800 



 

 

 


1 MATTER PROPERTIES EARLY WORKS TO 1800 
1 MATTIOLI ERCOLE ANTONIO CONTE 1640–1703 
1 MATY MATTHEW 1718–1776 
1 MAUBERT DE GOUVEST JEAN HENRI 1721–1767 
1 MAUDIT ISRAEL 1708–1767 CONSIDERATIONS ON THE PRESENT GERMAN WAR 
1 MAUDUIT ISRAEL 1708–1787 OCCASIONAL THOUGHTS ON THE PRESENT GERMAN WAR 
1 MAUPEOU RENE NICOLAS CHARLES AUGUSTIN DE 1714–1792 
1 MAUPERTUIS 1698–1759 
1 MAWSON MATTHAIS 1683–1770