Unobtrusive Studies and the Quality of Academic Library Reference Services

Jo Bell Whitlatch

Jo Bell Whitlatch is Associate Library Director, Access and Bibliographic Services, at San Jose State University, San Jose, California 95192-0028.

This article uses empirical data from a recent obtrusive study of reference performance to explore content validity and assumptions regarding unobtrusive studies. Data collected by the author support the contention that improvements are highly desirable before conducting more unobtrusive studies of reference service. The two most important changes concern the development of test questions representing all types of queries and supplementing the correct answer fill rate with other measures of reference performance.

During the past two decades, the most notable advance in reference services evaluation has been the increased use of unobtrusive observation methodology. Researchers have come to accept unobtrusive studies as a valuable tool for the evaluation of reference services. Reference librarians and library managers, however, do not appear to have integrated the findings from these unobtrusive studies into reference services practice.

Unobtrusive studies of reference services were developed to provide an alternative to user satisfaction surveys. The early, obtrusive, and generally global surveys of user satisfaction provided little information useful for improving services. Unfortunately, unobtrusive studies as they are presently employed in the evaluation of reference services also have serious limitations that prevent an adequate assessment of reference service performance.

This paper begins with a comparison of performance measures and methodologies typically employed in unobtrusive and obtrusive studies and then discusses the assumptions underlying unobtrusive studies. A recently published text, Unobtrusive Testing and Library Reference Services by Peter Hernon and Charles McClure, provides an excellent review of unobtrusive methodology and practice.1 To illustrate differences between these two types of studies, I use material from Hernon and McClure and selected findings from an obtrusive study that I have just completed.2 My findings support the contention that unobtrusive studies, as they are currently conducted, are extremely limited as instruments for the evaluation of reference effectiveness in academic libraries. The paper concludes with a discussion of the changes that are needed in order to develop an improved system of reference evaluation.

SUMMARY OF FINDINGS AND METHODOLOGY

Unobtrusive Studies

In unobtrusive studies of reference services, predetermined test questions are administered to reference librarians. These librarians are unaware that their responses are being assessed. Thus, the effect of being tested should not influence the normal behavior of the librarian. Those administering the test questions pose as library users and receive training in how to administer the questions to the unsuspecting librarians. The results indicate that reference staff members answer correctly 50 to 60 percent of the questions posed in this manner.3

Thomas Childers remarks that he and Crowley unintentionally initiated a tradition of research and a particular way of conceiving the reference process and applying the unobtrusive method to the evaluation of reference services.
Today, those performing unobtrusive studies continue to conceive the reference process and employ the unobtrusive method in roughly the same way as the original investigators.4

Typically, unobtrusive studies use the correct answer fill rate, that is to say, the proportion of correct answers to questions, as the measure of reference performance. In their recent study of government documents reference service, Hernon and McClure employed the two types of test questions most commonly used in unobtrusive studies: factual, e.g., requests for the name of an individual or for specific statistical or descriptive information; and bibliographic, e.g., requests for bibliographic citations, information on the availability of a publication in the library or through the Government Publications Sales program, or information on obtaining a Superintendent of Documents classification number.5 They report that the most frequent reasons for incorrect answers are that the library staff member gave wrong data (96 cases or 64.4 percent), responded "don't know" without referral (30 cases or 20.1 percent), or incorrectly claimed that the library did not own a source (23 cases or 15.4 percent).6

Obtrusive Study

This obtrusive study includes 397 reference transactions in five academic libraries in Northern California. Librarians asked users to complete a questionnaire for every fifth reference transaction; librarians also completed a companion questionnaire for every sampled transaction. Matching questionnaires were returned for 257 transactions. Prior to collecting the sample data, librarians from each of the participating libraries met with the researcher to discuss survey procedures. They were made aware of the importance of not biasing the survey results by selecting preferred questions or treating surveyed users differently. Individual librarian confidentiality was guaranteed. Reference departments participated voluntarily in the survey because they wished to obtain an accurate picture of the quality of their services.

The study tests a model of the major variables influencing academic library reference service outcomes. Three measures of reference service performance outcomes were employed: librarian judgments of the value of reference service, user judgments of the value of reference service, and user success or failure in locating needed material. Independent variables used in the study included measures of task uncertainty, time constraints of users and librarians, feedback, and type of reference assistance. Only findings useful in evaluating the role of unobtrusive studies in reference performance are reported in this paper.

Questions from the study were classified into three categories: (1) bibliographic citation questions for which a correct answer could have been predetermined; (2) questions of fact for which a correct answer could have been predetermined; and (3) other questions, including narrow and broad subject questions; questions concerning evaluations of books, movies, and plays; and questions on how to use reference sources. A small proportion (11.3 percent) of the requests were for specific factual information, and 18.0 percent were related to locating specific citations. The majority (70.7 percent) of the queries were requests for locating references on a subject and/or assistance in how to use library reference sources (see table 1).
Results of SPSS cross-tabulations for factual, bibliographic, and subject/instructional questions by user success in locating materials are presented in table 2. The chi-square statistic is significant, indicating that there is a difference between user success in finding material related to factual and bibliographic queries versus that for other types of queries. For factual and bibliographic queries, greater proportions of users either found all that they needed or nothing that they needed. For subject and instructional queries, a much greater proportion of users found some but not all needed material.

TABLE 1
TYPE OF QUESTION

Type                        Number    Percent
Factual                         29       11.3
Bibliographic                   46       18.0
Subject/Instructional          181       70.7
Total                          256      100.0
Missing (1)

TABLE 2
TYPE OF QUESTION AND USER SUCCESS IN FINDING NEEDED MATERIAL

                            Question Type
Materials Available      Fact.      Bibl.      Subj./Instr.
Yes                      78.6%      70.5%      62.6%
Some                     10.7       13.6       33.3
None                     10.7       15.9        4.1
                        100.0%     100.0%     100.0%
                          (28)       (44)      (171)
Missing (14)
χ² = 16.87, df = 4, p = .0021

These results are not always directly comparable to unobtrusive findings because in this study some of the material needed to satisfy factual and bibliographic queries was not located because of circulation and collection development problems. In many unobtrusive studies, problems with collection development and circulation failures are fairly well controlled through preselection of standard reference works that are likely to be in the library at all times.

ASSUMPTIONS

Accurate Fact Provision as a Key Indicator of Reference Performance

The first assumption is that correct answer fill rate is a key measure of reference service effectiveness.7 Hernon and McClure have carefully considered some important aspects of the validity of the test questions and measures of accuracy, such as face, internal, external, and construct validity. Their questions were judged by librarians and researchers as representative of typical questions encountered at the documents and general reference desk.8

However, there has been little discussion of content validity. Content validity is related to the adequacy with which important content has been sampled and the adequacy with which the content has been cast in the form of test items.9 Therefore, if we are really interested in measuring the performance of reference desk service, we must ask how well a test represents the main body of reference questions. Childers roughly estimates that the kind of query that has been addressed through unobtrusive testing to date may represent about one-eighth of the range of reference questions asked.10 Childers suggests that research findings on part of the process are being taken to represent the whole. The query with a short, factual, unambiguous answer has attracted almost all of the field's attention. The problem with investigating such queries is that, in the minds of many of those interested in evaluating reference performance, findings from unobtrusive studies assume unrealistic proportions and come to represent the whole of the reference function. However, there is no empirical literature that links performance of one kind of reference service to performance of another kind.11

Evelyn Daniel observes that traditionally fact provision has not been a major service of the library. It became a convenient afterthought to the referral and provision of bibliographic information.12
Duane Webster suggests that accuracy may not be a key indicator of the overall quality of reference services; users often seem to value convenience and timeliness of information more than accuracy.13 The findings of the current study support these observations and provide evidence that requests for specific factual information represent a minority of reference queries in academic libraries. This study indicates that the majority of queries are related to broad and narrow subjects or involve requests for instruction in the use of library reference materials (see table 1). With such a relatively small percentage of factual queries, librarians get little opportunity to develop on-the-job expertise using a broad range of tools to answer requests for specific factual information.

Some factual queries may represent more difficult problems for librarians than subject queries in locating useful information. Success rates for factual queries appear to fall more frequently into the categories of total success or total failure (see table 2). Total failure rates for factual queries may be higher because of the design of bibliographic access systems. Library bibliographic access systems tend to be designed to locate materials by broad subject topic rather than by precise fact. The public catalog still provides the primary access to a library's resources and is normally useful only for locating books. With rare exceptions, the catalog does not provide access to tables of contents or individual chapters in books; neither does it provide access to book indexes, which are most useful for locating factual information.

User demand for factual answering services appears to be relatively low compared to other types of requests for reference assistance. This is particularly true for queries related to academic course work and research. Results of SPSS cross-tabulations for factual, bibliographic, and subject/instructional questions by purpose of the user are provided in table 3. The chi-square is significant, indicating that there is a difference in proportion among types of queries made for course work, research, and other reasons. The proportion of factual questions asked to meet course work and research needs is much lower than the proportion asked for other reasons, i.e., other job-related, personal, and miscellaneous purposes. Of the total number of factual questions in the sample, 42.9 percent are primarily related to course work, 17.9 percent to research, and 39.3 percent to other reasons. Therefore, of the 237 questions for which both purpose and type of question are identified, only 17 (7.2 percent) are both factual and closely related to the primary mission of the academic library, support of course work and research (see table 4).

TABLE 3
TYPE OF QUESTION AND PURPOSE OF QUESTION

Question Type              Course Work    Research    Other
Factual                        7.5%          10.0%      42.3%
Bibliographic                 16.8           26.0       15.4
Subject/Instructional         75.8           64.0       42.3
                             100.1%         100.0%     100.0%
                              (161)           (50)       (26)
Missing (20)
χ² = 28.99, df = 4, p < .0005

TABLE 4
QUESTIONS RELATED AND UNRELATED TO INSTITUTIONAL MISSION

Type of Question                Number    Percent of Sample
Course Work and Research
  Factual                           17            7.2
  Bibliographic                     40           16.9
  Subject/Instructional            154           65.0
Other Reasons
  Factual                           11            4.6
  Bibliographic                      4            1.7
  Subject/Instructional             11            4.6
Total                              237          100.0
Missing (20)

One must also be realistic by asking whether accuracy is a key indicator of reference performance. While unobtrusive test questions have documented, authoritatively correct answers, most real-life questions are not so conveniently documented. Professional education stresses the identification of appropriate sources containing an answer, but not the ability of the librarian to judge the accuracy of that answer. A performance measure that evaluates whether the librarian identifies a proper source might be a more reasonable test of reference librarian effectiveness. Patrick Wilson raises serious questions about the ability of reference librarians to determine accuracy in all subject areas.
He notes that reference works do not collectively give a single standard answer for the same question; they are, in varying degrees, full of inaccuracies. Further, a standard reference work quickly becomes dated and incomplete.14

Wilson concludes that librarians generally work in a world of texts that they take as simply given and cannot claim to evaluate independently.15 The evaluation of the content of texts requires expertise in the subject matter of the text, which the librarian cannot be expected to have. Librarians are not generally in a position to evaluate the contents of a reference book or to make independent judgments on the correct or incorrect status of answers. It seems, therefore, that our key performance measures ought to be designed to acknowledge more thoroughly the limited judgments librarians are able to make.

Data from this study support Wilson's observations. Librarians in the study reported good subject expertise (1 or 2 on a scale of 7) for only 51.8 percent of the transactions, and users reported the same level of librarian subject expertise only 17.3 percent of the time. This is the nature of general reference desk service, where librarians cannot hope to have in-depth subject competence in all areas for which they are expected to answer questions. One user noted, "The people are helpful, and try to do their best, but some of them are not qualified enough." Significant subject familiarity by the librarian was positively associated with user success for subject and instructional questions (r = .303, p < .0005) and for factual and bibliographic queries (r = .208, p = .040). User reports of shorter lengths of time spent with the librarian were more strongly associated with user success in locating materials for factual and bibliographic citation queries (r = .560, p < .0005) than for subject and instructional queries (r = .149, p = .026).

Hernon and McClure note that when conducting evaluation studies, issues related to the quality of the service must be separated from the value of that service. Ultimately, the value of the service is based upon the degree to which the service meets the information needs of library clientele and facilitates the accomplishment of service objectives.16 Libraries and librarians should be judged primarily on whether they provide added value to users. When partial and full success are both considered, users report greater success in finding some or all materials for subject and instructional queries (95.9 percent) than for factual (89.3 percent) and bibliographic (84.1 percent) queries. Only when judged by the more stringent criterion of locating all needed materials did users report the greatest success rate for factual queries (see table 2). Thus, the correct answer fill rate appears to be a useful, but extremely limited, measure of reference performance.
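The chi-square test reported in table 2 can be illustrated, and approximately reproduced, from the published figures alone. The following is a minimal sketch in Python, assuming scipy is available; the cell counts are reconstructed from the reported percentages and column totals rather than drawn from the study's raw data file.

```python
# Minimal sketch: checking the table 2 chi-square result from the published
# percentages. Cell counts are reconstructed from 78.6/10.7/10.7%,
# 70.5/13.6/15.9%, and 62.6/33.3/4.1% of 28, 44, and 171 users respectively;
# they are an approximation of the study's data, not the data themselves.
from scipy.stats import chi2_contingency

# Rows: found all / found some / found none.
# Columns: factual, bibliographic, subject/instructional.
observed = [
    [22, 31, 107],  # located all needed material
    [3,   6,  57],  # located some but not all
    [3,   7,   7],  # located none
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
# With these reconstructed counts the result is very close to the reported
# chi-square of 16.87 with 4 degrees of freedom, p = .0021.
```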
Easier-than-Average Queries

Another assumption underlying many unobtrusive studies of reference service is that the questions used are not difficult to answer. McClure and Hernon suggest that the 55 percent correct answer fill rate is typically computed on questions with an "easier-than-average" difficulty level.17 Also, in response to a reviewer's query as to whether the degree of difficulty should be used to judge the quality of reference service, Hernon and McClure state that "factual and bibliographic questions are generally recognized as two of the easier types of reference questions."18 However, the degree of difficulty of average unobtrusive test questions versus average questions actually asked in academic libraries, for factual, bibliographic, and subject/instructional questions alike, has not been carefully studied.

For service providers, task difficulty was found to be positively related to task uncertainty.19 Thus, data on task uncertainty collected for this study provide an opportunity to explore the differences among types of questions. Uncertainty can be defined as a situation in which one cannot control or reliably predict all of the variables and relations.20 Uncertainty is also thought to have an important influence upon delivery of services.21 Fundamental differences exist between production processes in service and manufacturing industries. These important differences include work flow and task uncertainties peculiar to service operations. Work flow uncertainties are created by unpredictable client arrival, service, and exit patterns. Task uncertainty occurs when there is incomplete knowledge about how to produce a desired outcome. Because the production of service outcomes depends upon interaction between clients and service providers, workers cannot totally rely upon past procedures when providing service to individual clients.

Thus, this obtrusive study includes five uncertainty measures for each question. The five measures are librarian ratings of: (1) the frequency of use of the sources used to answer a question, (2) the question as a new type of problem, (3) the similarity of a question to other questions, (4) familiarity with the subject of the question, and (5) familiarity with the information source used to answer a question.

Librarian ratings are compared for factual, bibliographic, and subject/instructional types of queries. Factual and bibliographic citation questions are separated for this analysis because the majority of questions used in unobtrusive studies appear to concern factual information rather than bibliographic citations. Thus, this study compares mean ratings of task uncertainty for the factual, bibliographic citation, and subject/instructional types of questions (table 5) and also compares mean ratings of task uncertainty for factual versus all other types of questions (table 6).

Means for frequency of use of sources are significantly different for factual and subject/instructional questions, with sources for factual queries used less frequently (table 5). For three measures of task uncertainty (similarity of questions, librarian subject familiarity, and librarian information source familiarity), means for factual and bibliographic questions are significantly different.
For three measures (frequency of use of sources, similarity of questions, and librarian subject familiarity), there are significant differences between the means for factual and other types of reference queries (table 6). On the average, these librarians regard factual questions as somewhat less routine, because they involve the use of somewhat less frequently used sources and are slightly less similar to other types of questions. Librarians also report somewhat less subject familiarity when responding to the factual queries included in this sample.

Mean ratings of the task uncertainty involved with factual queries indicate that, on the average, librarians in this study view the uncertainty in the task of answering factual queries as somewhat greater than the uncertainty involved in answering other queries. Therefore, findings in this study suggest that librarians judge factual questions to be more difficult because answering these questions involves the use of less familiar, less frequently used sources.

TABLE 5
MEAN RATINGS OF TASK UNCERTAINTY FOR FACTUAL, BIBLIOGRAPHIC, AND SUBJECT/INSTRUCTIONAL QUESTIONS
(Rating Scale: 1 = very great or completely; 7 = very little, very seldom, or not at all)

Task Uncertainty                            Fact.     Bibl.     Subj./Instr.   df       F
Frequency of use of sources                 2.79a     2.07      2.03a          2,249    3.34*
New type of problem                         6.00      6.04      5.73           2,252     .40
Similarity of questions                     3.38b     2.15b     2.69           2,253    5.20†
Librarian subject familiarity               3.59b     2.48b     2.96           2,253    3.52*
Librarian information source familiarity    2.59c     1.5[?]c   2.25c          2,253    5.53†

*p < .05   †p < .01
a. Means for Factual and Subject/Instructional queries are significantly different at the .05 level (Scheffé test).
b. Means for Factual and Bibliographic queries are significantly different at the .05 level (Scheffé test).
c. Means for Factual and Subject/Instructional queries are significantly different from Bibliographic queries (Scheffé test).

TABLE 6
MEAN RATINGS OF TASK UNCERTAINTY FOR FACTUAL VERSUS OTHER TYPES OF QUESTIONS
(Rating Scale: 1 = very great or completely; 7 = very little, very seldom, or not at all)

                                                 Type of Question
Task Uncertainty                            Factual    All Others    df       F
Frequency of use of sources                 2.79       2.04          1,250    6.67*
New type of problem                         6.00       5.79          1,253     .44
Similarity of questions                     3.38       2.58          1,254    6.22*
Librarian subject familiarity               3.59       2.86          1,254    4.27*
Librarian information source familiarity    2.59       2.11          1,254    2.83

*p < .05
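The F statistics in tables 5 and 6 come from one-way analyses of variance comparing mean task-uncertainty ratings across question types. A minimal sketch of the form of that computation follows, assuming Python with scipy; the rating vectors are invented placeholders on the seven-point scale, not the actual questionnaire responses.

```python
# Minimal sketch of the analysis behind tables 5 and 6: a one-way ANOVA on one
# task-uncertainty measure. The ratings below are hypothetical values on the
# study's 1-7 scale, so the printed F values will not match the tables.
from scipy.stats import f_oneway

factual = [3, 4, 2, 5, 4, 3, 4, 2, 5, 3]                 # hypothetical librarian ratings
bibliographic = [2, 3, 2, 1, 3, 2, 2, 3, 1, 2]
subject_instructional = [3, 2, 3, 2, 4, 3, 2, 3, 2, 3]

# Table 5 form: three question-type groups compared on one measure.
f_stat, p_value = f_oneway(factual, bibliographic, subject_instructional)
print(f"three-group comparison: F = {f_stat:.2f}, p = {p_value:.4f}")

# Table 6 form: factual versus all other question types combined.
f_stat2, p_value2 = f_oneway(factual, bibliographic + subject_instructional)
print(f"factual vs. others: F = {f_stat2:.2f}, p = {p_value2:.4f}")
# The pairwise Scheffe contrasts flagged by the superscripts in table 5 would
# follow the omnibus test; they are omitted here.
```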
Represents Real-Life Patrons

Hernon and McClure argue that increased use of unobtrusive testing of reference services is necessary because it provides an excellent means to see the library and its services from the viewpoint of the patron.22 However, unobtrusive testing takes the view that information is a commodity. Catherine Sheldrick Ross notes that if we think of information as a commodity, as existing apart from people or their contexts, then questions and answers can be held to exist independently.23 Library schools have typically taught and evaluated basic reference courses this way. They give students questions to answer. This approach strengthens the tendency of these future librarians to conceptualize information as a commodity with no reference to its context in the life of the inquirer. In reality, people ask questions to fill gaps in their understanding so that they can use the information as a means to an end. Users are helped to the extent that the answers to their questions help them accomplish something. Librarians who answer questions without knowing anything about the context may provide an answer that is technically correct but is otherwise unhelpful in filling the user's need.24

Fred Oser summarizes trends on the basis of a survey of the literature and concludes that there is a large area of situational content that can be of great use to the librarian in conducting an efficient and rapid interview. The type of library in which one is working can lead to highly predictable expectations toward the purpose, scope, and level of reference queries.25 Helen Gothberg also notes the variability in levels of service, which is based to a considerable extent on the type of library. For example, a special library with its limited clientele may find it most expedient to provide the answer. On the other hand, libraries located in educational institutions place a greater emphasis on educating the user to answer his/her own questions at the basic level of information need; yet there need be nothing minimal about the type of instruction provided.26

In this study, a substantial proportion (23.9 percent) of the bibliographic citation questions are identified in the context of the library. That is, the library user had an index or other bibliographic material in hand and wanted the librarian's assistance with locating one or more references and/or understanding the meaning of different parts of the citation. This relatively straightforward, but nonetheless important, class of bibliographic citation questions does not appear to be represented in unobtrusive studies.

The unobtrusive model of testing reference service assumes that the librarian is responsible for finding the correct answer. A proxy is hired to conduct the test. Hernon and McClure note that research has shown that many users of academic and public libraries are not aggressive in pressing staff for an answer.27 They also indicate that, ultimately, the responsibility for ensuring that the patrons' information needs are met belongs to library management.28 Proxies neither suggested sources or places where the answer might be obtained nor encouraged referrals. The methodology also makes librarians, not users, responsible for locating and verifying the exact information. Hernon and McClure note that when library personnel referred a proxy to a source, but did not offer to look for the answer, the proxy would pretend to examine the source for a short time and return to the same person for further guidance in use of the source.29

But in reality, user behavior may significantly affect reference performance. Wilson notes that in the delivery of reference service, limits are set by the preferences, habits, abilities, and resources of the user. The library cannot supply the user with time or ability. It can supply documents to study, but not the inclination to do so.30 In this study, librarian judgments of the value of service are significantly related to librarian judgments of user participation in the reference process for factual and bibliographic citation questions (r = .236, p = .022) but not significantly related for subject and instructional queries (r = .111, p = .073).
User success is positively associated with librarian perceptions of feedback quality both for factual and bibliographic questions (r = .262, p = .014) and for subject and instructional questions (r = .155, p = .024). When users let the librarian know how well the question is answered, this feedback is significantly and positively associated with all three reference service outcomes, irrespective of the type of question, again suggesting that the controls placed on proxy behavior in unobtrusive studies may lower the success rate (see table 7).

TABLE 7
CORRELATIONS BETWEEN USER FEEDBACK AND REFERENCE SERVICE OUTCOMES BY TYPE OF QUESTION

                                          Feedback on         Feedback on
Reference Outcome                         Factual/Bibl.       Subject/Instruct.
Librarian judgment of service value       .160                .200†
User judgment of service value            .593‡               .387‡
User success                              .247                .306‡

*p < .05   †p < .01   ‡p < .001

For factual questions, approximately one-third of the users report receiving a direct answer, while other users report receiving assistance in locating the answer for themselves (table 8). Therefore, in these five academic libraries, users report that the librarian accompanied them (but did not refer) or provided a direct answer for 65.5 percent of all factual queries. Librarians did not have the opportunity to verify the complete, precise information for the remaining 34.5 percent of factual queries. Consequently, in the real-life provision of factual reference service, academic librarians often do not assume full responsibility for direct answer provision and verification of the accuracy of information.

TABLE 8
TYPE OF REFERENCE ASSISTANCE FOR FACTUAL/BIBLIOGRAPHIC QUESTIONS

Assistance              Factual
Accompany               31.0%
Refer                   10.3
Accompany & refer       24.1
Direct answer           34.5
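The coefficients in table 7 are Pearson correlations between the user-feedback rating and each reference service outcome, computed separately for the two question groupings. A minimal sketch of that computation, assuming Python with scipy; the rating vectors are hypothetical and serve only to show the form of the analysis.

```python
# Minimal sketch of the correlation analysis summarized in table 7: a Pearson
# correlation between the user-feedback rating and one outcome measure. Both
# vectors are hypothetical seven-point responses (1 = completely informed the
# librarian / outstanding service value), not the study's data.
from scipy.stats import pearsonr

feedback = [1, 2, 1, 3, 2, 4, 1, 2, 5, 2, 3, 1]
outcome_value = [1, 2, 2, 3, 1, 5, 1, 3, 4, 2, 3, 2]

r, p = pearsonr(feedback, outcome_value)
print(f"r = {r:.3f}, p = {p:.3f}")
# In the study this was run separately for factual/bibliographic and for
# subject/instructional questions, yielding the coefficients in table 7.
```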
Instructions to proxies do not encourage them to push the librarian to provide as much assistance as needed. Should so much emphasis be placed on an evaluation method that eliminates many of the contextual clues provided by users seeking to fill a gap in their understanding and to use the information for their own purposes?

An alternative model to use in studying the reference process is one in which the reference transaction and the resulting service outcome are joint products of the effort, skill, and knowledge of both librarian and user. In Gordon Whitaker's model of coproduction of service, service delivery is accomplished through a process of mutual adjustment, whereby both client and service provider consider the client's problem and what each of them can do about it. Citizens influence the content of many public services through direct participation in service delivery. This is particularly the case in services designed to change or empower people directly, such as education and health care. The agent can encourage, suggest options, illustrate techniques, and provide guidance and advice, but cannot bring about the change alone.31 For libraries functioning as part of the educational process, this may be an appropriate model.

CONCLUSION

John Campbell suggests that, given a specific research question, we can ask what specific research method(s) possess the most validity for the independent and dependent variables. It should not be assumed that hard measures are always more valid than soft measures of dependent variables. Campbell comments that the term "hard" seems to refer to dependent variables that consist of countable outcomes. They are objective in the sense that they can be counted by an independent party. Soft measures seem to refer to those based on human judgment or scaling considerations.32

The unobtrusive studies use a modified laboratory method and have the strengths and weaknesses commonly found in such methods. Their strengths are that reference librarians cannot introduce bias into the study because they are unaware of it, and a clear standard for correct and incorrect answers to queries can be established. Their weakness is that they do not necessarily represent reality. Figure 1 provides a summary of the differences in treatment of users and questions for unobtrusive versus obtrusive studies of reference performance.

FIGURE 1
Differences between Unobtrusive and Obtrusive Reference Studies

Users in unobtrusive studies:
- are passive receivers of service
- have little investment in answers
- operate without a context in which to place answers
- provide limited or no clues to librarians
- operate without specific time pressures
- expect librarians to supply the answers

Users in obtrusive studies:
- may be passive or active
- have a definite interest in results
- have a knowledge gap to fill
- provide a wide range of contextual clues to librarians
- operate under a variety of time pressures
- normally expect librarians to provide a source for answers

Questions in unobtrusive studies:
- are quantifiable requests for specific information
- are relevant to a limited task

Questions in obtrusive studies:
- are general requests for subject information
- are relevant to a broad set of tasks

Studies employing unobtrusive methods are often viewed by researchers as more desirable for reference evaluation because of their objective qualities. Unobtrusive studies seem more scientific. Hernon and McClure express this view in the following statement: "Basing managerial decisions regarding reference services on perceptions rather than realistic appraisal is a disservice to library clientele and a myopic stance that continues to impede the development of quality reference services."33

The popularity of unobtrusive studies appears to be a reaction to early, uncritical, global surveys of user satisfaction with libraries and the growing awareness that although users appear to be highly satisfied with library service, they do not represent the best critical judgment about the provision of information. Perhaps we have now gone too far in the opposite direction by studying an unrepresentative minority of reference queries. The method has been allowed to dictate the evaluation criteria and scope by limiting the test to only that portion of reference work amenable to hard, objective measures. The scientific method was originally developed to study the physical world around us more effectively. Is this method really the primary one to use for the study of how people behave in organizational settings? Unobtrusive studies tell us only a little about the quality of reference services.

Given the weaknesses in unobtrusive studies and the problems with content validity, can more representative studies be designed using the unobtrusive methodology? Unobtrusive methods remain a valuable alternative for countering some of the known weaknesses in field studies using obtrusive methodologies.
But unobtrusive studies would tell us much more about the quality of reference services if the assumptions and the methodology were modified substantially. Test questions need to represent all facets of reference; unobtrusive studies must develop more representative reference questions. Research findings from the present obtrusive study demonstrate that it is feasible to involve reference departments in the evaluation process and to collect summaries of questions and answers. Collecting and compiling such information also enhances the ability to analyze the entire spectrum of queries people bring to libraries.

Questions representing the entire body of reference could be selected from questions sampled in the field, and expert peer review could be used to supply appropriate sources. Unobtrusive observations should also be used more extensively to collect information on referrals and service orientation (the helpfulness, responsiveness, and interest of the librarian in the user's problem). Wilson observes that librarians can claim to be adept at locating texts and knowing what these texts say about each other and the external world.34 Therefore, correct referral to appropriate sources would be a more appropriate measure than accuracy. Even correct referral to sources is not without its problems as a performance measure. Sandra Naiman notes that other professions can and do agree that there is a basic core of information and/or skills that members must possess. Yet she has never met a group of librarians who were willing to reach a consensus on the indispensable reference sources.35

Reference performance measures used in obtrusive studies could also be modified for use in unobtrusive studies. Charles Bunge and Marjorie Murfin have established a stringent criterion for patron-perceived fill rate.36 In order to count as a totally successful question, users must report that they located just what was wanted and that they were completely satisfied with the information or materials found or suggested. The average success rate for the thirty-one libraries participating in their study was 55.81 percent for all types of reference questions and 46.7 percent for factual reference questions. Murfin and Gary Gugelchuk note that the unrealistically high ratings found in previous studies of reference performance may be due in great part to the use of inadequate instruments and methods to study a complex phenomenon.37

In the present study of five academic libraries, 66 percent of users report finding what they needed. This exceeds the average success rates generally found in unobtrusive studies. The majority (75 percent) also report that they are very highly satisfied (1 or 2 on a scale of 7). For this study, adopting a more stringent criterion similar to that used by Bunge and Murfin would result in a total success rate of 57 percent for users who found what they needed and also indicated that they were very highly satisfied. Therefore, use of these or similar measures in unobtrusive studies would permit researchers to include the more common subject and instructional types of questions with more precise measures than global indications of satisfaction.
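The stringent criterion can be stated operationally: a transaction counts as a full success only when the user reports locating just what was wanted and also rates satisfaction 1 or 2 on the seven-point scale. A minimal sketch of that computation follows; the record structure and sample values are hypothetical, not the study's data file.

```python
# Minimal sketch of the stringent patron-perceived fill rate described above.
# A transaction is a full success only if the user located exactly what was
# wanted AND rated satisfaction 1 or 2 on the seven-point scale. Field names
# and the sample records are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Transaction:
    located: str        # "yes", "some", or "none" (user success question)
    satisfaction: int   # 1 = satisfactory ... 7 = unsatisfactory

transactions = [
    Transaction("yes", 1), Transaction("yes", 2), Transaction("yes", 4),
    Transaction("some", 2), Transaction("none", 6), Transaction("yes", 1),
]

strict_successes = sum(
    1 for t in transactions if t.located == "yes" and t.satisfaction <= 2
)
fill_rate = strict_successes / len(transactions)
print(f"stringent fill rate = {fill_rate:.1%}")
# Applied to the 257 matched questionnaires, this criterion yields the 57
# percent figure reported in the text, versus 66 percent on location alone.
```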
With the evidence of poor performance on certain types of factual and bibliographic questions, academic librarians might be well advised to be more critical in interpreting the text to the user, particularly when they cannot claim expertise in the subject area. Library schools might consider providing more extensive education for librarians in the interpretation and verification of answers in standard factual sources. Academic librarians responding to factual queries in areas for which they lack the expertise to make an independent evaluation would then say to the user, "Here is a source that might help you find an answer to your question" rather than, "Here is the answer to your question." When problems of authority arise and neither the librarian nor the user has the subject expertise to make a judgment, academic librarians would do well to refer users to teachers with expert knowledge.

The results of the unobtrusive studies also provide convincing evidence that many librarians need to be more critical in assessing their expertise when responding to factual queries. Harold Wilensky notes that one standard highly adhered to and accepted in established professions is awareness of the limited competence of one's own specialty within the profession and readiness to refer clients to a more competent colleague.38 By closely adhering to this professional standard, reference librarians could improve the quality of their factual answering services by referring users to expert sources rather than providing an answer of uncertain quality. For academic libraries with a government publications department, specialized factual answering services might be combined with the duties of documents librarians. Librarians in smaller libraries who have difficulty developing such expertise in the staff might try to maintain sources for expert referral and carefully identify those sources for which they are prepared to provide in-depth factual question-answering services.

Because of the weaknesses of both unobtrusive and obtrusive studies, more sophisticated methods must be developed for evaluating reference service performance. When reference evaluation methods are able to provide a more comprehensive picture of the quality of reference service, reference librarians and managers will be more interested in using the results to modify reference service practice. A major advance in improving reference evaluation will be expanding the scope of the predetermined test questions now used in unobtrusive studies and developing additional measures of effectiveness to supplement the correct answer fill rate. While we are waiting for more sophisticated studies that use a greater variety of evaluation methods, we can modify our professional service practices by more critically examining our competence to answer certain types of factual queries. For most queries, academic librarians might do best to focus on evaluating their competency in source referral, both internal and external. Finally, we should consider how often we ask users whether their questions were answered fully and whether they have found what they wanted.

REFERENCES

1. Peter Hernon and Charles R. McClure, Unobtrusive Testing and Library Reference Services (Norwood, N.J.: Ablex, 1987).
2. Jo Bell Whitlatch, "Client/Service Provider Perceptions of Reference Service Outcomes in Academic Libraries: Effects of Feedback and Uncertainty" (Ph.D. diss., University of California, Berkeley, 1987).
3. Peter Hernon and Charles R. McClure, "Unobtrusive Testing: The 55 Percent Rule," Library Journal 111:37-41 (1986).
4. Thomas Childers, "The Quality of Reference: Still Moot After 20 Years," Journal of Academic Librarianship 13:73-74 (1987).
5. Hernon and McClure, "Unobtrusive Testing," p.39.
6. Ibid., p.48.
7. Hernon and McClure, "Unobtrusive Testing," p.24.
8. Ibid., p.41.
9. Jum C. Nunnally, Psychometric Theory (New York: McGraw Hill, 1978), p.93.
10. Childers, "The Quality of Reference."
11. Ibid.
12. Evelyn H. Daniel, "The Effects of Identity, Attitude and Priority," Journal of Academic Librarianship 13:76-78 (1987).
13. Duane E. Webster, "Examining the Broader Domain," Journal of Academic Librarianship 13:79-80 (1987).
14. Patrick Wilson, Second-Hand Knowledge: An Inquiry into Cognitive Authority (Westport, Conn.: Greenwood, 1983), p.173-92.
15. Ibid., p.192.
16. Hernon and McClure, "Unobtrusive Testing," p.128.
17. Ibid., p.144.
18. Ibid., p.167.
19. Richard J. Butler, "User Satisfaction with a Service: An Approach from Power and Task Characteristics," Journal of Management Studies 17:1-18 (1980).
20. James D. Thompson, Organizations in Action (New York: McGraw Hill, 1967).
21. Peter K. Mills and Dennis J. Moberg, "Perspectives on the Technology of Service Operations," Academy of Management Review 7:467-78 (1982).
22. Hernon and McClure, "Unobtrusive Testing," p.104-105.
23. Catherine Sheldrick Ross, "How to Find Out What People Really Want to Know," The Reference Librarian 16:19-30 (1986).
24. Ibid.
25. Fred Oser, "Referens Simplex or the Mysteries of Reference Interviewing Revealed," The Reference Librarian 16:53-78 (1986).
26. Helen M. Gothberg, "The Beginnings," The Reference Librarian 16:7-18 (1986).
27. Hernon and McClure, "Unobtrusive Testing," p.167.
28. Ibid., p.105.
29. Ibid., p.39-40.
30. Patrick Wilson, Public Knowledge, Private Ignorance: Toward a Library and Information Policy (Westport, Conn.: Greenwood, 1977), p.122.
31. Gordon P. Whitaker, "Coproduction: Citizen Participation in Service Delivery," Public Administration Review 40:240-46 (1980).
32. John N. Campbell, "Labs, Fields, and Straw Issues," in Generalizing from Laboratory to Field Settings, ed. Edwin A. Locke (Lexington, Mass.: Lexington Books, 1987), p.269-79.
33. Hernon and McClure, "Unobtrusive Testing," p.164.
34. Wilson, Second-Hand Knowledge, p.192.
35. Sandra M. Naiman, "The Unexamined Interview Is Not Worth Having," The Reference Librarian 16:31-46 (1986).
36. Charles A. Bunge and Marjorie E. Murfin, "Reference Questions-Data From the Field," RQ 27:15-18 (1987).
37. Marjorie E. Murfin and Gary M. Gugelchuk, "Development and Testing of a Reference Transaction Assessment Instrument," College and Research Libraries 48:314-38 (1987).
38. Harold Wilensky, "The Professionalization of Everyone?" American Journal of Sociology 70:137-58 (1964).

APPENDIX A: SCALES FOR VARIABLES INCLUDED IN THE AUTHOR'S OBTRUSIVE STUDY

Reference Service Performance Outcomes

1. Librarian Judgments of the Value of Reference Service (Five Questions, loading on one factor, a = .89)
Please check the space ___ that best describes how you think the user viewed the quality of service received. (seven-point scale: outstanding-terrible)
Relevance of Information Provided. (two seven-point scales: useful-useless; relevant-irrelevant)
Amount of Information Provided. (two seven-point scales: sufficient-insufficient; reasonable-unreasonable)
2. User Judgments of the Value of Reference Service (Six Questions, loading on one factor, a = .84)
Check the space ___ that best describes the general quality of service you received. (seven-point scale: outstanding-terrible)
Indicate how satisfied you are on the following scale. (seven-point scale: satisfactory-unsatisfactory)
Relevance of Information Provided. (two seven-point scales: useful-useless; relevant-irrelevant)
Amount of Information Provided. (two seven-point scales: sufficient-insufficient; reasonable-unreasonable)

3. User Success (One Question)
Were you able to locate the materials you needed? (choices were: yes; no; some but not all; and other (please explain))

Task Uncertainty (Five Questions, loading on one factor, a = .80)
To what extent were the sources you suggested to this user materials you frequently consult in providing reference service? (seven-point scale: very great extent-very little extent)
To what extent did you see answering this reference question as a new type of problem? (seven-point scale: very great extent-very little extent)
How often do you answer this question or questions that are very similar? (seven-point scale: very often-very seldom)
How familiar were you with the subject(s) involved in the reference question? (seven-point scale: completely-not at all)
Were you already familiar with the information resources most likely to contain the answer to this reference question from previous knowledge or experience? (seven-point scale: completely-not at all)

Time User Spent with Librarian (One Question)
How long did you spend with the reference librarian? (choices were: 0-2 minutes; 3-5 minutes; 6-15 minutes; over 15 minutes)

User Feedback (One Question)
To what degree did you inform the library staff member whether or not your question was answered? (seven-point scale: completely-not at all)

Librarian Perception of Quality of Communication (Four Questions, loading on one factor, a = .72)
Communication with the user was: (two seven-point scales: very easy-very difficult; pleasant-unpleasant)
Did the user give you sufficient information to answer his/her question? (seven-point scale: sufficient-insufficient)
How explicit was the user's question? (seven-point scale: very explicit-not at all explicit)

Librarian Judgments of User Participation (Two Questions, loading on one factor, a = .72)
To what extent did the user provide you with feedback? (seven-point scale: very great extent-very little extent)
How active a role did the user play in resolving his/her information need? (seven-point scale: very active-not very active)

Type of Reference Assistance (One Question)
How did the reference librarian assist you? (the user selected one of five choices: (1) by accompanying you to sources to help find the answers; (2) by referring you to sources to find the answer on your own; (3) by accompanying you to some sources and referring you to other sources; (4) by directly giving you the answer to your question; or (5) other (please explain))

NOTE: The abbreviation "a" refers to Cronbach's alpha, a measure of internal reliability for the variable.
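The reliability coefficient reported for each multi-item scale above can be computed directly from the item responses. A minimal sketch of Cronbach's alpha follows, assuming Python with numpy; the response matrix is a hypothetical placeholder, not the study's questionnaire data.

```python
# Minimal sketch of the internal-reliability statistic ("a" in Appendix A):
# Cronbach's alpha for a multi-item scale of seven-point ratings. The response
# matrix below is a hypothetical placeholder, not the study's data.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of ratings."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Five hypothetical respondents answering the five task-uncertainty items.
responses = np.array([
    [2, 3, 2, 3, 2],
    [5, 6, 5, 5, 6],
    [1, 2, 1, 2, 1],
    [4, 4, 5, 4, 4],
    [3, 3, 2, 3, 3],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```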