College and Research Libraries

Online Searching and the Research Process

Connie Miller and Patricia Tegler

Despite evidence that researchers seek information in ways which are quite different from a logical, linear search strategy model, librarians persist in relating to the information-seeking process as if it were static and product oriented. Even online searching, which is generally considered to be flexible and interactive, is viewed merely as an alternate method of compiling a bibliography. The bibliography is considered a fixed and final product to be measured exclusively according to the limited variables of recall and precision. By having such a restricted view of online searching and its potential benefits to researchers, librarians fail to take full advantage of their role in the academic research community. If librarians wish to be relevant to researchers and to offer valuable services to an important constituency, they must fully understand the organic nature of research and the ways that scholars seek information. They must further understand and facilitate the significant way that online searching can contribute to and enhance the research process.

The view of research as a linear, highly structured, logical process has been challenged by studies which indicate that scholars work in ways which can best be described as cyclical, organic, and intuitive.1 These illogical and intuitive approaches to research mirror themselves in seemingly random, haphazard approaches to locating pertinent information. Rather than following systematic library search strategies, scholars generally employ less structured methods, such as browsing, consulting with colleagues, or tracing footnotes and bibliographies. Printed indexes, by their very nature, tend to limit creative, cyclical interaction between researchers and information.
Online systems, on the other hand, have the potential to facilitate highly interactive seeker-information dialogues, just the type of interchange which is integral and essential to the trial-and-error2 process involved in research. To date, however, this potential for interaction and its corresponding benefits to the scholar have not been fully realized. Further, the limited ways in which online searches have been evaluated have hindered a full understanding of the organic nature of the online process.

Connie Miller is science librarian at the University of Illinois, Chicago, Illinois 60680. Patricia Tegler is systems librarian at Kirkland and Ellis Law Library, Chicago, Illinois 60601. This paper was presented at the ACRL Fourth National Conference in Baltimore, April 9-12, 1986.

A key to exploiting the potential of the online process for researchers lies in understanding the distinction between seeking information on "topics" and seeking information on "problems." Swanson claims that "creative scientific research does not begin with a 'topic' but with a problem -- a researcher must be puzzled, curious, in a sense 'bothered' about something."3 As scholars research a "problem," the questions they ask and the information they seek shift and change. Each new finding alters what follows. "Research," as Maurice Line describes it, "is a process that does not allow for too formal organization."4 Integral to the loose structure of research is the information-seeking behavior of researchers. Stoan states that "scholars ... follow no mechanical procedure of thinking up a topic, doing background reading on it ... going through the card catalog ..., consulting indexes for articles," etc.5 Instead they most frequently locate additional information through the bibliographies of previously identified material. This bibliographic tracking technique closely approximates the actual research process.
It is organic and cyclical and manifests what Swanson describes as the trial-and-error, problem-oriented process of information retrieval.6

This trial-and-error process of information seeking is quite different from a highly structured search strategy approach and, unlike it, does not result in a complete, final bibliography. In trial-and-error searching both the process of the search and its products can lead the researcher to alter the original understanding of the problem and may lead to additional sources of information. The traditional methods of evaluating information searches, recall and precision, have completely overlooked this generative, creative aspect of a search. By evaluating the product and not the process, recall and precision limit our understanding of information searches and fail to measure them effectively.

Recall and precision measure a specific retrieved bibliography within the context of a particular database. Recall is the percentage of relevant documents retrieved out of all relevant documents in the database. If all possible relevant items are retrieved from the database or databases, the bibliography has achieved 100 percent recall. Precision is the percentage of relevant documents retrieved of all the documents retrieved. If half of the documents in the bibliography are relevant, its precision rate is 50 percent. Let us assume that for a particular topic there are 200 relevant citations in a database. A search for that topic results in a bibliography of 160 citations. Eighty of the citations are relevant. The search has achieved a recall rate of 40 percent, or 80 of 200, and a precision rate of 50 percent, or 80 of 160.

Both recall and precision depend on relevance, an extremely difficult concept to measure.
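
The recall and precision figures in the worked example above can be checked with a few lines of code. What follows is only a minimal sketch in Python: the function and constant names are illustrative assumptions, not part of any retrieval system, and the count of relevant citations in the database is simply taken as given, as it is in the example.

```python
def recall(relevant_retrieved: int, relevant_in_database: int) -> float:
    """Fraction of all relevant documents in the database that were retrieved."""
    if relevant_in_database <= 0:
        raise ValueError("relevant_in_database must be positive")
    return relevant_retrieved / relevant_in_database


def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Fraction of the retrieved documents that are relevant."""
    if total_retrieved <= 0:
        raise ValueError("total_retrieved must be positive")
    return relevant_retrieved / total_retrieved


# Figures from the worked example in the text (hypothetical constant names).
RELEVANT_IN_DATABASE = 200  # relevant citations assumed to exist in the database
TOTAL_RETRIEVED = 160       # citations in the retrieved bibliography
RELEVANT_RETRIEVED = 80     # retrieved citations judged relevant

example_recall = recall(RELEVANT_RETRIEVED, RELEVANT_IN_DATABASE)
example_precision = precision(RELEVANT_RETRIEVED, TOTAL_RETRIEVED)

print(f"recall = {example_recall:.0%}")        # recall = 40%
print(f"precision = {example_precision:.0%}")  # precision = 50%
```

Both measures share the same numerator, the relevant citations actually retrieved, and differ only in the denominator: everything relevant in the database for recall, everything retrieved for precision.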
Harter summarizes two different types of relevance which have been identified:

The first type of relevance is "on the topic," which is the kind of relevance that would apply in subject searches. A document is relevant to a topical query if it is on the subject named by the requester. Relevance in this sense can be judged by an individual or by a community of experts; it is objective and involves public knowledge. The second type of relevance is similar to what Kemp and others have referred to as "pertinence" -- it is a subjective, private "creation of new knowledge" by the requester in the context of a personal information need. In this sense, relevance is not a property of a document and a request, but is the property of a document and a requester.7

Measurement of recall and precision is generally based on the identification of relevance according to the first, more objective meaning of the term. Search results get high marks for precision when a large percentage of the citations retrieved appear to be "on the topic." The specific information needs of the requester may or may not be met by these highly precise results. Recall and precision, therefore, measure the performance of the database or system. They do not and cannot measure the value of a search to a requester.

The concept of recall itself is a highly questionable one. In the first place, recall cannot be accurately estimated. Using either definition of relevance, it is impossible to determine the total number of documents in a database which are relevant to a given request or requester. Secondly, total recall is rarely desirable or valuable. The retrieval of all relevant documents would frequently constitute too much information, and a surplus can be as problematic as a deficiency.
Furthermore, recall can only measure a specific bibliography against a hypothetically relevant portion of a database. It measures the value of the search process itself, therefore, only in terms of a quantifiable product.

To be an important part of research, however, the process of an online search and the products of that search must move beyond the restrictions of recall and precision. Hawkins, Bates, Vigil, and others have described heuristic techniques, like title and descriptor scanning, citation pearl growing, the "notting" out of previous sets, and "interactive scanning," which help a searcher make more effective use of online file capabilities.8 But the best techniques can still be limited by a topical, recall-and-precision dominated approach.

Consider, for example, a search of the ERIC database for citations on end-user searching. Since no thesaurus term end-user currently exists, a searcher could develop a group of synonymous terms using the thesaurus or begin with a free-text approach and locate synonyms by printing several citations in the title-descriptor format. Terms like online-systems, information-retrieval, or information-seeking could be combined with training, user-satisfaction, or surveys to result in high recall. Limiting the combinations to precise user groups, e.g., college-faculty or health-personnel, could increase precision. A systematic process of notting out already examined sets could eliminate needless duplication. Logically, this technique of framing an information request in terms of statically defined synonyms could not be faulted.

Titles and descriptors function, however, as more than static synonyms. Words which name or describe books or articles act as signposts, embodying conceptual approaches to research or indicating directions taken.
In this fuller sense, they function as powerful disseminators of information whose importance lies less with their potential for becoming part of online search logic and more with their potential to reshape a research question. A logical combination of topical synonyms for the term end-users is directed toward the development of a final product: a high recall and/or high precision bibliography. Interacting with words as signposts begins the generative process of reformulating an original information request. This generative process depends upon a willing suspension of logic, a leap, that is, into the illogical or intuitive world of pertinence.9 Harter calls an illogically interactive search a "problem-oriented" online inquiry which is "an integral part of science itself."10

College & Research Libraries, July 1986

How could the search for end-user searching citations have been different? "Full Service Document Delivery: Our Likely Future" is one title that a free-text search of ERIC would retrieve. This citation appears minimally relevant at best; none of the logical synonym combinations listed above would retrieve it. By suspending logic, however, and responding to the words in the title, a researcher intent on developing an end-user searching program in an academic library might alter direction considerably, including in the development of the program an online document ordering option.

Whether logically or illogically conducted, an online search results in a bibliographic product. Like the words in titles or descriptors, the importance of this product lies less with its potential to supply information on a topic and more with its potential to alter the direction of research. Each document from the bibliography, while potentially of direct use for its content, also performs the "indirect function ... of stimulating a reformulation of [a] request."11 In addition, each document is a primary source, the bibliography of which provides an entry for the researcher into the literature of the field. Because "the primary literature indexes itself, and does so with greater comprehensiveness, better analytics, and greater precision than does the secondary literature,"12 whether the bibliography achieves high recall or consists of citations precisely relevant to the researcher's topic is immaterial. The bibliographic product of an online search functions as a gateway into a citation network and, thus, participates in the cyclical, organic process of research.

Viewed statically or as ends in themselves, neither the process nor the product of an online search can be anything but ancillary to research. A logical search based on heuristic techniques with a high recall or high precision bibliographic product will, almost by default, result in articles or books through which a researcher can gain entry into a citation network. The best bibliographic product, however, the one which includes illogically relevant (pertinent) as well as logically relevant citations, will only result from a search in which information disseminated (through words in titles and descriptors) during the process contributes to a reformulation of the researcher's request. Illogically relevant or pertinent sources will open up citation networks of their own.

This cyclical, organic type of search operates according to what Abraham Kaplan calls its own "logic-in-use, [an] internal logic ... [which,] as it germinates and develops ... dictates the sources sought out at each stage along the way."13 By responding to the document delivery concept, the researcher developing the end-user searching program opened a whole new avenue of literature to explore. Because "logic-in-use" is essential to an organic, research-related online search, Harter predicts the inevitability of end-user searching.14 While his prediction is undeniably correct, the idea that online searches integral to the research process must operate according to a "logic-in-use" seems related less to who performs a search than to how well a search is performed. If librarians who perform online searches for researchers or who offer online training to end-users fail to understand and to communicate the vital and essential implications of the cyclical, organic, illogical nature of research for the process and products of an online search, they risk becoming irrelevant.

REFERENCES

1. Stephen K. Stoan discusses, in some detail, research on the research process in his article, "Research and Library Skills: An Analysis and Interpretation," College & Research Libraries 45:99-109 (1984).
2. Don R. Swanson, "Information Retrieval as a Trial-and-Error Process," Library Quarterly 47:128-48 (1977); and "Libraries and the Growth of Knowledge," Library Quarterly 50:112-34 (1980).
3. Swanson, "Information Retrieval," p.138.
4. Stoan, "Research and Library Skills," p.102.
5. Ibid., p.102.
6. Swanson, "Information Retrieval," p.138-39.
7. Stephen P. Harter, "Scientific Inquiry: A Model for Online Searching," Journal of the American Society for Information Science 35:114 (1984).
8. Marcia J. Bates, "The Fallacy of the Perfect Thirty-Item Online Search," RQ 24:13-20 (1984); Peter J. Vigil, "The Psychology of Online Searching," Journal of the American Society for Information Science 34:281-87 (1983); and Donald T. Hawkins and Robert Wagers, "Online Bibliographic Search Strategy Development," Online 6:12-19 (1982).
9. D. A. Kemp, "Relevance, Pertinence and Information System Development," Information Storage and Retrieval 10:37-47 (1974). Kemp's article includes an excellent discussion and definition of pertinence.
10. Harter, "Scientific Inquiry," p.114.
11. Swanson, "Information Retrieval," p.138.
12. Stoan, "Research and Library Skills," p.103.
13. Ibid., p.102, as derived from Abraham Kaplan, The Conduct of Inquiry: Methodology for Behavioral Science (San Francisco: Chandler, 1964), p.3-11.
14. Harter, "Scientific Inquiry," p.114.