Information Retrieval in Domain-specific Databases: An Analysis to Improve the User Interface of the Alcohol Studies Database

Ronald Jantz

Ronald Jantz is the Government & Social Sciences Data Librarian in the Alexander Library at Rutgers University; e-mail: rjantz@rci.rutgers.edu. The task of providing Web access to the Alcohol Studies Database has been a collaborative effort. Penny Page and Valerie Mead, librarians at the Center of Alcohol Studies, have developed the content and indexing for the database. The author designed the original architecture and Web-based user interface. Mike LeBlanc, computer science student at Rutgers University, developed the improved user interface under the author’s direction. This team, as a whole, has participated in numerous discussions on how to improve the ASDB user interface and in the testing of the resulting improvements.

Academic libraries are becoming more directly involved in the design and publishing of electronic information resources, including bibliographic databases, electronic journals, and digital archives. As a result, librarians are dealing with many user interface design issues that computer scientists and information specialists in other fields have encountered. Transaction log analysis can provide a rich source of information on user behavior and insights as to how user interfaces can be improved. This article describes the methodology and results of the log analysis for the Alcohol Studies Database (ASDB), a domain-specific database supported by the Center of Alcohol Studies and Rutgers University Libraries (RUL). The goals of this study were to better understand user search behavior, to analyze failure rates, and to develop approaches for improving the user interface.
College & Research Libraries, May 2003

The ASDB content was developed by the Center of Alcohol Studies at Rutgers University, and the Web site and user interface were designed by the Scholarly Communication Center of RUL. An overview of the ASDB is provided as a backdrop for the in-depth analysis of the user interface and transaction logs. As part of the design to provide Web access to the ASDB, the author developed a statistical gathering and reporting subsystem. Activated in October 2000, this subsystem has since provided more than two full years of statistics on search behavior. As a result of the log analysis, specific improvements have been made to the ASDB user interface. A unique aspect of this article is the summary of the transaction logs before and after improvements to the user interface, which illustrates how the changes have affected search behavior and search results.

Introduction

Academic libraries are becoming more directly involved in the design and publishing of electronic information resources, including bibliographic databases, electronic journals, and digital archives. These new roles represent a challenging future for librarians who want to utilize their technology and design skills.¹ The Scholarly Communication Center of Rutgers University Libraries (RUL) and the Center of Alcohol Studies (CAS) at Rutgers have collaborated to provide Web access to the Alcohol Studies Database (ASDB). The ASDB contains more than 60,000 citations of documents indexed by the CAS since 1987.² The primary focus of the database is on research and professional materials dealing with beverage alcohol and its use and related consequences. Although a growing amount of literature on other drug use/abuse has been added in recent years, this material represents only a small percentage of the database and is not indexed to the same depth as the alcohol literature.
In addition to the research and professional material, the database includes a small collection of educational and prevention materials, including audiovisuals suitable for students and educators K–12, parents, community workers, and the general public.

At the outset, the author, working with librarians at the CAS, wanted to quickly develop a Web-based user interface for the ASDB. Originally, when the ASDB was a nonnetworked database, a controlled vocabulary of indexing terms had been developed and each article was extensively indexed with these terms. This vocabulary became the key component of the online search interface. The initial Web-accessible database and user interface were completed late in 1999 using the approach and technology described in an article by the author in 2001.³ As a result, users at Rutgers University and throughout the world gained access to this important and freely available collection of medical and scientific research dealing with the use of alcohol and the related consequences. Subsequent to this introduction on the Web, the author designed a statistical gathering and reporting subsystem that was implemented in October 2000. The transaction logs now contain more than two full years of search statistics that have assisted researchers in making decisions about how to improve the user interface. Based on the transaction log analysis and extensive ASDB team discussions, an improved user interface was launched in February 2002. This article summarizes the data from the transaction logs and compares search results from the initial user interface and the improved user interface.

User Interface Overview

A partial image of the initial user interface showing the controlled vocabulary pick-lists is shown in figure 1. This partial image shows three primary subject-related pick-lists labeled as follows: physiological aspects, social aspects, and drug terms.
Each pick-list has some thirty or more controlled vocabulary terms that can be selected by the user to form a query. In addition to these lists, a user can select items from two additional pick-lists, format and special populations (not shown), that will further constrain the query. Finally, author and title word or phrase searching also is available. Online help instructions are available on the top navigation bar, and “example” links to screen images are provided for each type of search box to demonstrate clearly how one would specify a query.

Complex Queries

The user can form simple or quite complex Boolean queries with the ASDB interface. For example, one could simply do a search for a specific author or a search on a word or phrase that might be found in the title of an article. However, the user also has the capability to form complex Boolean operations by selecting multiple items from any one of the three primary pick-lists. Multiple items selected within a pick-list default to a Boolean “or,” and the user also can use the toggle switch between the major pick-lists to select either an “and” or an “or” between these major categories. The default Boolean operation between search boxes is an “and.” The example in figure 1 illustrates a more complex Boolean search with the properly parenthesized result shown at the top of the figure. After forming a query, the user can then select “search” at the bottom of the screen, which will yield a set of summary results, each of which can then be selected to view the full bibliographic citation.

FIGURE 1
Controlled Vocabulary in the Initial User Interface
Query shown at the top of the figure: ([AIDS: HIV and Alcohol] AND [Aggression and Alcohol]) OR (AIDS: HIV and Drugs)

Results Display

After a user selects the “search” button, each resulting bibliographic record is displayed in summary form, ordered by publication date with the most recent first.
Within publication year, there is a secondary ordering by author. Note that relevance orderings are not appropriate because there are no abstracts or full-text content that can be used to make relevance decisions.

Approach and Methodology

The ASDB is a domain-specific database that contains bibliographic records of more than 60,000 citations, primarily to journal articles and books relating to beverage alcohol and its use and related consequences. The use of transaction logs is one primary method of improving user interfaces and, thereby, also improving the information retrieval performance for users. Transaction logs have been used successfully to improve user interfaces of traditional OPACs in libraries.⁴ This article discusses the use of transaction logs to improve the user interface for the ASDB.

The logs analyzed herein represent usage from October 2000 through September 2001 for the initial user interface and usage from February 2002 through April 2002 for the improved user interface. The ASDB is a research-oriented database and, by Web standards, it is not heavily used; however, the transaction log contains a substantial sample of searches, and usage continues to grow as more people discover the availability of the ASDB. At the writing of this article, the author and colleagues were seeing between 1,300 and 1,500 searches a month during the standard academic fall and spring semesters.

The objectives of this analysis were to understand user behavior, analyze failure rates, and identify improvement areas for the user interface. The analysis methodology used in this article is similar to that described by Jansen and Spink.⁵ Although many improvement areas were discovered through the analysis, a specific objective was to reduce the number of searches that resulted in either zero hits or greater than 100 hits. These types of outcomes were considered potential failures of the user interface.
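The failure heuristic just described can be sketched in a few lines. The following is a minimal illustration, not the ASDB reporting code: the function names and band labels are assumptions, and the four bands simply mirror the hit ranges the article reports for search outcomes.

```python
from collections import Counter

# Band labels are illustrative names (an assumption) for the four hit
# ranges the article reports: 0, 1-99, 100-999, and 1,000 or more.
BANDS = ("zero", "1-99", "100-999", "1000+")


def outcome_band(hits):
    """Map a result count to one of the four hit ranges."""
    if hits < 0:
        raise ValueError("hit count cannot be negative")
    if hits == 0:
        return "zero"
    if hits < 100:
        return "1-99"
    if hits < 1000:
        return "100-999"
    return "1000+"


def is_potential_failure(hits):
    """Heuristic from the text: zero hits or greater than 100 hits."""
    return hits == 0 or hits > 100


def band_percentages(hit_counts):
    """Percentage of searches in each band, rounded to one decimal."""
    total = len(hit_counts)
    if total == 0:
        return {band: 0.0 for band in BANDS}
    counts = Counter(outcome_band(h) for h in hit_counts)
    return {band: round(100 * counts[band] / total, 1) for band in BANDS}


# Hypothetical hit counts for five logged searches.
print(band_percentages([0, 0, 5, 150, 2000]))
```

Note that the text speaks of "greater than 100 hits" while the reported ranges start at 100; the sketch keeps both readings separate rather than choosing between them.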
Based on identified improvement areas, an improved user interface was launched in February 2002. Data from the initial and the improved user interfaces are compared to determine how the changes have improved the ability of searchers to use the ASDB. The following levels of analysis as reported by Jansen and Spink will be used.

Session

The session is the entire sequence of queries entered by the user. Heuristics will be used to define a session because the ASDB does not employ any type of “log-in” scenario that would accurately register each user. For the purposes of this article, the session will be defined as those queries submitted consecutively by a single IP address and not separated by more than twenty minutes. The twenty-minute interval was arrived at by visual inspection of the intervals that occur in the transaction log. Although it is conceivable that another user may have started another session with the same IP address and within a twenty-minute time frame, this condition is highly unlikely. It should be noted that a session can have a single query.

Query

Sessions are composed of queries. A query within the context of the ASDB is defined when a user selects the “search” button and an entry is written into the transaction log. For the purposes of this article, the concepts of initial query and modified query will be used. The initial query is the first query in a session, and the modified query is a subsequent query in a session that is different from the initial query. Query length is measured by the number of terms used, and query complexity is determined by the use or absence of Boolean expressions.

Term

Within the ASDB, a term is defined as any controlled vocabulary term that is selected from one of the five pick-lists in the user interface (i.e., physiological aspects, social aspects, drug aspects, special populations, and special format).
A term also can be an author’s name or words/phrases entered into the “title phrase” search box, which might be separated by the Boolean operators AND/OR.

The Statistical Gathering and Reporting Subsystem

The author designed the statistical subsystem to capture as much data as possible about user search behavior. Every aspect of the user query is captured, including search terms and how the user has toggled the AND/OR selection between major subject areas in order to create a Boolean expression. Each search is associated with a unique user identification, although users always remain anonymous. In addition, the results of each search are recorded, including the number of results generated and a time-date stamp. Because users do not register to search the ASDB, some mechanism was needed to identify a user session. The time-date stamp in conjunction with the IP address is used to track the concept of a “session” as discussed above. It should be noted that the statistical system only records data from users who conduct a search of the ASDB. Data regarding users who are just visiting and who do not conduct a search, sometimes referred to as “tourists,” are not recorded.⁶

The reporting subsystem provides several types of summary reports, including total number of searches, searches with zero results, searches with more than 100 results, searches by month, and searches by major domain.
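The session heuristic described above (consecutive queries from a single IP address, not separated by more than twenty minutes) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the ASDB subsystem itself: the (ip, timestamp) layout of a log entry, the sample addresses, and the function names are all hypothetical.

```python
from datetime import datetime, timedelta

# Twenty-minute threshold taken from the article's session definition.
SESSION_GAP = timedelta(minutes=20)


def sessionize(entries, gap=SESSION_GAP):
    """Group (ip, timestamp) log entries into sessions per IP address.

    A new session starts when the time since the previous query from the
    same IP exceeds `gap`. The entry layout is an assumption.
    """
    sessions_by_ip = {}
    for ip, stamp in sorted(entries):
        sessions = sessions_by_ip.setdefault(ip, [])
        if sessions and stamp - sessions[-1][-1] <= gap:
            sessions[-1].append(stamp)  # continue the current session
        else:
            sessions.append([stamp])    # start a new session
    return sessions_by_ip


def mean_queries_per_session(sessions_by_ip):
    """Average number of queries per session across all IP addresses."""
    lengths = [len(s) for groups in sessions_by_ip.values() for s in groups]
    return sum(lengths) / len(lengths) if lengths else 0.0


# Hypothetical log entries for two anonymous IP addresses.
log = [
    ("10.0.0.1", datetime(2001, 3, 1, 10, 0)),
    ("10.0.0.1", datetime(2001, 3, 1, 10, 5)),
    ("10.0.0.1", datetime(2001, 3, 1, 10, 40)),  # 35-minute gap: new session
    ("10.0.0.2", datetime(2001, 3, 1, 10, 2)),
]
sessions = sessionize(log)
print({ip: [len(s) for s in groups] for ip, groups in sessions.items()})
print(round(mean_queries_per_session(sessions), 2))
```

Sorting by IP address and then by timestamp keeps the grouping logic to a single pass; a single-query session falls out naturally as a list of length one.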
In addition, the database administrator can select a more detailed report to see all the fields and options for a particular search.

Transaction Log Summary Statistics

User Demographics

Table 1 shows the demographics by domain for the users of the ASDB. Although usage is predominantly from the United States, users are coming to the ASDB from all over the world.

TABLE 1
User Demographics for the ASDB

Domain      Country          Percent
edu         USA              26.9
com         USA              18.2
net         USA              16.0
us          USA               3.1
ca          Canada            2.6
au          Australia         2.3
org         USA               1.6
uk          United Kingdom    1.5
nz          New Zealand       0.4
jp          Japan             0.3
se          Sweden            0.3
nl          Netherlands       0.3
mil         USA               0.3
at          Austria           0.3
ie          Ireland           0.2
il          Israel            0.2
gov         USA               0.2
it          Italy             0.2
no          Norway            0.2
gr          Greece            0.1
be          Belgium           0.1
es          Spain             0.1
dk          Denmark           0.1
br          Brazil            0.1
fr          France            0.1
mx          Mexico            0.1
my          Malaysia          0.1
za          South Africa      0.1
All others                   24.0

Zero-hit Outcomes

Many studies have analyzed user difficulties with the syntax and semantics of Web searching. One paper has reported that more than 30 percent of searches of a university Web site resulted in zero-hit outcomes.⁷ In an earlier paper, T. Peters reported that 40 percent zero-hit outcomes are common in his specific academic library OPAC.⁸ Table 2 shows the distribution of hits in four different ranges, including zero-hit outcomes. The table indicates that the initial user interface of the ASDB is incurring 33.6 percent zero-hit outcomes, where N = 10,267 is the total number of searches. The improved user interface has a marked reduction in zero-hit outcomes at 27.8 percent.

Sessions and Queries

Table 3 provides summary-level statistics for sessions and queries for both the initial (N = 10,267) and improved (N = 3,375) user interfaces. From these summary statistics, it is obvious that the sessions are relatively short (e.g., 2.45 queries in the initial UI).
From an examination of session length, it is apparent that 71.1 percent of all sessions in the initial UI have either one or two queries and 80.6 percent have one or two queries in the improved user interface. In other analyses, researchers have found similar results, speculating that users are either unwilling or unable to expend the effort to develop effective search strategies.⁹

TABLE 2
Overall Search Outcomes

Measure                                 Initial UI      Improved UI
                                        (N = 10,267)    (N = 3,375)
Zero-Hit Outcomes (%)                   33.6            27.8
Outcomes - GE 1, LT 100 Hits (%)        33.9            34.2
Outcomes - GE 100, LT 1,000 Hits (%)    32.5            20.7
Outcomes - GE 1,000 Hits (%)             9.4            17.3

TABLE 3
Session and Query Summary Statistics

Measure                       Initial UI      Improved UI
                              (N = 10,267)    (N = 3,375)
Mean Queries per Session      2.45            1.93
Session Length
  Minimum                     1               1
  Maximum                     44              54
  % 1 Query                   48.3            60.7
  % 2 Queries                 22.8            19.9
  % 3+ Queries                28.9            19.4

Analysis: Zero-hit Outcomes

The zero-hit outcomes are a fruitful area for examination and will generally reveal a wealth of information regarding the effectiveness of a user interface. This analysis will proceed by examining the zero-hit outcomes of the initial user interface in more detail. Table 2 shows that 33.6 percent of the searches (3,454 out of 10,267) using the initial user interface resulted in zero hits. In the improved user interface, zero-hit outcomes have been significantly reduced to 27.8 percent. Of the zero-hit outcomes in the initial user interface (N = 3,454), 595 searches attempted some type of author search.

Author Searching

There were obvious syntactical and semantic errors with author searching. Generally, the semantic errors will be more difficult to detect and correct. For example, there were a few users who confused the author search field with a keyword search field and searched on phrases such as “Advocacy 1992” or used a term that was obviously subject related rather than an author.
These errors were uncovered by visual inspection of the logs and are reported in table 4 as “incorrect AU semantics.” In addition, a number of author search syntax errors were evident in the transaction log that could clearly be eliminated or minimized by improving the user interface. The following illustrate some specific examples that do not follow the conventions that are documented as part of the ASDB user interface:

1. typing in the first name “first” (e.g., “Bill Wilson”);
2. typing initials with no blanks (e.g., “Epstein, J. A.”);
3. omitting comma delimiters that separate the last name from the first initial (e.g., “borg s”).

Errors of this type account for a significant percentage (32.6%) of the author search zero-hit outcomes, shown in table 4 as “incorrect AU Syntax.” In the improved user interface, more restrictive syntax checking was implemented with a request to the user to reformulate the author search according to the required syntax conventions for author searching. Although some types of incorrect syntax still have not been detected, the improved user interface has significantly reduced the zero-hits due to incorrect author syntax to only 12.8 percent (table 4).

Title Searching

In the initial ASDB user interface, phrase searching was implemented; however, quoted phrases and the use of an asterisk as a truncation symbol were not permitted. Given the prevalence of these conventions in existing Web search engines, many ASDB users tried using these symbols. Of the zero-hit outcomes (N = 3,454), 2,343 searches attempted some type of title search and 3.5 percent used special conventions that were not supported by the ASDB. (See “incorrect Title Syntax” in table 4.) In the improved user interface, both quoted phrases and use of the asterisk are flagged with a message to the user to reformulate the query using approved syntax conventions. This checking has virtually eliminated zero-hits due to the use of these conventions.
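The kind of syntax checking described in the two sections above can be illustrated with a small sketch. It is not the ASDB implementation: the article does not give the exact rules, so the checks below (a required comma in author searches, a digit test as a rough hint of keyword misuse, and flagging quotation marks and asterisks in title searches) are assumptions, as are the function names and messages.

```python
# Characters the text says were unsupported in title searches.
# The mapping and its labels are illustrative assumptions.
UNSUPPORTED_TITLE_MARKS = {
    '"': "quoted phrase",
    "\u201c": "quoted phrase",   # left curly quote
    "\u201d": "quoted phrase",   # right curly quote
    "*": "asterisk truncation",
}


def check_author(text):
    """Return a list of suspected problems in an author search string."""
    value = text.strip()
    problems = []
    if not value:
        return problems
    if "," not in value:
        # Covers both first-name-first entries and omitted commas.
        problems.append("missing comma between last name and initials")
    if any(ch.isdigit() for ch in value):
        # Rough hint only; semantic errors are hard to detect reliably.
        problems.append("contains digits; may be a keyword, not an author")
    return problems


def check_title(text):
    """Return the unsupported conventions found in a title search string."""
    found = []
    for mark, label in UNSUPPORTED_TITLE_MARKS.items():
        if mark in text and label not in found:
            found.append(label)
    return found


# Hypothetical inputs, echoing the examples quoted in the article.
for author in ("Bill Wilson", "borg s", "Borg, S", "Advocacy 1992"):
    print(repr(author), check_author(author))
for title in ('"military alcohol"', "alcohol*", "military and alcohol"):
    print(repr(title), check_title(title))
```

Returning a list of problems rather than a yes/no answer makes it straightforward to show the user a specific request to reformulate the query.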
TABLE 4
Incorrect Author and Title Searches

Measure                        Initial UI          Improved UI
Incorrect AU Syntax (%)        32.6 (N = 595)      12.8 (N = 180)
Incorrect AU Semantics (%)      9.4 (N = 595)      12.2 (N = 180)
Incorrect Title Syntax (%)      3.5 (N = 2,343)     0

Keyword Searching

Perhaps one of the most confusing parts of the user interface was allowing keyword and phrase searching only in the title field. The rationale behind this decision was the assumption that users would use the controlled vocabulary in lieu of general keyword/phrase searching and some users would want to search the title field only. However, it is clear from a visual inspection of the title phrase searching that users continue to use the title phrase search as a general keyword search box. One simple example illustrating that users are not obtaining the proper results is evident in the search using the phrase “military and alcohol,” which returned eighteen results in the initial UI. For the improved user interface, keyword searching has been offered across all fields in the database and the title-specific search option eliminated. If one reruns the search “military and alcohol” in the improved UI, 130 results are returned. It is probably a reasonable assumption that the user in this case did not want only those citations in which the terms “military” and “alcohol” both appeared in the title. Hence, the change in keyword searching for ASDB has likely improved recall for the great majority of users.

Analysis: Controlled Vocabulary

Frequency of Use

In the ASDB, there are more than 150 controlled vocabulary terms across the three primary areas of physiological aspects, social aspects, and drug terms. Table 5 shows the fifty most frequently selected terms from the 10,267 searches using the initial user interface.
Although the data shown in table 5 are not used quantitatively in this study, the qualitative assessment was that this highly technical vocabulary could not be abandoned, because doing so would leave users with only a free-text searching capability. Many users of the initial and improved user interfaces took advantage of the controlled vocabulary; however, some unexpected results were encountered, as discussed in the following sections.

Use of Subject Terms

As shown in table 6, data have been collected and summarized for the initial and the improved user interfaces that illustrate the percentage of users who did not use the controlled vocabulary. The “No Subjects” row includes the percentage of users who did not use any of the controlled vocabulary from the three main areas of physiological aspects, social aspects, and drug aspects; however, they may have used some of the other special pick-lists or the free-text search. The “Free Text Only” row shows the percentage of users who did not use the subject vocabulary and other special pick-lists such as those for population and format. These queries (23.6% in the initial UI and 39.3% in the improved UI) used only the free-text searching fields. It should be noted that the two rows in table 6 are not mutually exclusive. For example, a free-text-only search would be counted in both rows; hence, summing to greater than 100 percent in a column is possible.

It is obvious from table 6 that a significant number of users do not use the controlled vocabulary in either the initial or improved user interfaces. However, the dramatic result is the increased number of users who did not use the controlled vocabulary in the improved user interface. The author suggests that this result stems from two major changes in the user interface as we moved from the format of the initial UI to that of the improved UI.
First, general keyword searching was introduced in contrast to only allowing keyword searching in the title field. In all likelihood, users were more inclined to use the more familiar keyword searching rather than the less familiar method of selecting items from the controlled vocabulary list. Second, it was known that there would be trade-offs in the presentation styles of the two user interfaces. In the initial UI (figure 1), pick-lists were chosen in which the user could only see a small subset of the terms without scrolling. In the improved UI, the user is first presented with major subject areas (figure 2) and then must do a mouse click to see the terms of the controlled vocabulary in a checkbox format (figure 3). The advantage of this approach is that the user can see all the subject terms whereas she or he could only see a limited subject list (without scrolling) in the initial UI. Thus, the improved UI approach requires more mouse clicks for the user to select the subject terms. Perhaps more important, in the improved UI, users do not see any terms on the first search page and it is suspected that they more naturally gravitated to the use of the obvious keyword searching capability rather than take the time to explore the subject terms available.

TABLE 5
Controlled Vocabulary: Frequency of Term Selection

Term                                                       Freq
Alcohol and Drug Interactions                              408
Alcoholism: Diagnosis                                      347
Alcoholism: Etiology, Definitions, and Theoretical         280
Alcohol-Related Mortality                                  237
Aggression and Alcohol                                     203
Alcohol Determination Methodologies                        165
Attitudes toward Drinking and Alcoholism                   163
Alcoholics Anonymous                                       161
Family Aspects and Alcohol                                 155
Advertising and the Media                                  154
AIDS, HIV, and Alcohol                                     153
Drinking Experiments                                       149
Stress and Alcohol: Physiological Aspects                  149
Brain Pathology and Alcohol                                136
Alcohol Education in the Schools                           136
Alcoholism: Miscellaneous                                  136
Drug Abuse Treatment: Miscellaneous                        130
Alcoholic Beverage Control Laws: U.S.                      119
Alcoholic Beverages: Properties, Manufacturing Aspects     118
Attitudes toward Drug Use and Abuse                        118
Abstinence                                                 117
Fetus and Alcohol: Human Studies                           116
Sexual Behavior, Sex Roles, and Alcohol                    115
Detoxication and Treatment of Withdrawal                   113
Alcohol Beverage Industry                                  107
Crime and Alcohol                                          106
Social and Cultural Aspects of Alcohol: Miscellaneous      104
Crime and Drug Use                                         102
Counseling Drug Abusers                                    101
Blood Pressure and Alcohol                                  98
Alcoholism Treatment Programs and Facilities                96
Alcoholism Treatment: Miscellaneous                         92
Drug Abuse Treatment Programs and Facilities                90
Driving and Drinking: Management of Offenders               88
Intoxication and Alcohol Poisoning                          81
Driving Skill and Alcohol                                   80
Memory and Alcohol                                          75
Withdrawal and Post-alcohol Phenomena                       74
Cocaine                                                     72
Heredity: Human Studies                                     69
Historical Aspects                                          65
Recidivism and Relapse in Alcoholism Treatment              63
Statistics: Alcoholic Beverages                             61
Treatment Outcome Studies                                   61
Stress and Alcohol: Psycho-social Aspects                   60
AIDS, HIV, and Drugs                                        60
Diagnosis, Drug Abuse                                       60
Individual Therapies                                        59
Alcohol Education: Professional Personnel                   57
Cognitive and Perceptual Functions                          57

TABLE 6
Percentage of Users Who Did Not Use the Controlled Vocabulary

Measure                           Initial UI      Improved UI
                                  (N = 10,267)    (N = 3,375)
No Subjects (% of queries)        33.7            62.6
Free Text Only (% of queries)     23.6            39.3

Summary

The improved UI has significantly reduced the number of zero-hits that users incur from 33.6 percent to 27.8 percent. This result is due primarily to the improved error checking for author searching and the checking for special syntax conventions that users might have seen on the Web, but which are not available in the ASDB. However, when one examines the distribution of outcomes with non-zero hits, there are 41.9 percent with greater than 100 hits in the initial UI and 48.0 percent with greater than 100 hits in the improved UI.
One phenomenon that is occurring in the improved UI is that users are selecting many more controlled vocabulary terms to OR together, which is resulting in searches with more hits. In all probability, this search behavior stems from users being able to see the complete selection of controlled vocabulary terms in the checkbox format. It is difficult to put a value judgment on these outcomes, although it is unlikely that users are examining results beyond their first 100 hits.

With the change in subject term display format from a pick-list to checkbox style, users have dramatically reduced the use of the ASDB controlled vocabulary from 33.7 percent not using any of the controlled vocabulary terms to 62.6 percent. This result was clearly unexpected by the author and the CAS librarians and not an altogether desired result. Although there is less usage of the controlled vocabulary, users who use the keyword searching also are searching all controlled vocabulary terms. Although this approach is yielding many relevant results, we are still struggling with the classic information retrieval problem of the difference between the user vocabulary and the indexer’s vocabulary.¹⁰ The other observation here is that user overhead in terms of mouse clicks appears to be more of an issue than originally expected. It appears that initial impressions from the first search page had a very strong impact on user behavior, leading users to keywords when they did not see any controlled vocabulary on the first page. Although only one mouse click away, the controlled vocabulary has become less accessible for a great many users.

FIGURE 2
Improved User Interface, Initial Display

FIGURE 3
Checkbox for Physiological Aspects

A related phenomenon is that of session length between the two user interfaces.
In the initial UI, the percentage of single-query sessions was 48.3 percent whereas this statistic jumped to 60.7 percent in the improved UI. There are possibly two factors that could account for this behavior. First, the reduced number of zero-hit outcomes in the improved UI suggests that more users have received appropriate results in a single-query session. The other factor is that many users may have considered the improved UI more complex than the initial UI and thus did not continue to explore how to use the ASDB effectively.

In addition to the major user interface changes made above, it should be noted that the improved UI of the ASDB also includes the ability to e-mail citations, to select a “print-friendly” interface, and to page results in order to limit the size of the HTML page returned to the user’s browser. Steps also are under way to implement the linking to the full text of journals that RUL has licensed.

Conclusions

For most Web database projects, it is unclear who the user community will be and frequently all the designer can assume is that users are all those “out there” on the Web. With the ASDB, the domain demographics suggest a very diverse user community. It is unlikely that one will have the luxury of understanding the user’s search behavior prior to developing the user interface. Therefore, it is very important to capture usage statistics via transaction logs. These logs enable the designer to learn more about the user and to make incremental changes to improve the user interface.

Users will frequently make assumptions about the user interface syntax given their experience with Web search engines or other database products. They frequently assume these conventions are standard and universal.
In designing user interfaces, the librarian must either support a variety of conventions or provide error checking and feedback to assist the user in learning the syntax of the specific search engine. Rarely will one get the user interface “right” on the first iteration. The designer should plan on making improvements after the transaction logs have been reviewed and there is more information on the types of users and their search behavior.

With respect to the specific ASDB analysis, a heuristic of zero-hits and greater than 100 hits has been used as an indicator that the user interface can be improved. Although certain of the searches that fall under this general classification are legitimate, this indicator can serve as a useful, low-cost tool for identifying and making user interface improvements. Certainly the reduction of zero-hit outcomes in the ASDB is an improvement. However, it appears that the improved user interface might be more complex for our user community given the decrease in usage of the controlled vocabulary, the increase in single-query sessions, and significantly more outcomes with greater than 1,000 hits. This conclusion suggests that there are some possible future improvements that would help users. The researchers suspect that many of the users are casual information searchers who are accustomed to a basic keyword search interface and that the professional information specialists would find the controlled vocabulary most useful. Thus, offering a “basic” and “advanced” user interface is likely to help considerably in meeting the needs of two quite different user populations. However, with the basic keyword search, the researchers are still left with the problem of bridging the user vocabulary and the controlled vocabulary.
In these small specialized databases, linking the keyword search terms with the controlled vocabulary is a further improvement that is likely to help considerably and is one area of continuing investigation.

Many librarians are entering the information profession with technology skills or are acquiring and using technology skills while on the job. As a result, these librarians will likely be confronted with user interface design issues and the resulting questions of effective information retrieval. Analysis of transaction logs is an excellent method for better understanding user search behavior and also an effective tool for identifying improvement areas in the user interface.

Notes

1. J. Prinsen, “A Challenging Future Awaits Libraries Able to Change,” D-Lib Magazine 7, no. 11 (2001). Available online from http://www.dlib.org/dlib/november01/prinsen/11prinsen.html.
2. P. Page, R. Jantz, and V. Mead, The Alcohol Studies Database (2000). Available online from http://www.scc.rutgers.edu/alcohol_studies.
3. R. Jantz, “Publishing Databases on the Web: A Major New Role for Librarians and Research Libraries,” in Creating Web-accessible Databases: Case Studies for Libraries, Museums, and Other Nonprofits, ed. J. Still (Medford, N.J.: Information Today, Inc., 2001), 7–26.
4. D. Blecic, N. Bangalore, J. Dorsch, et al., “Using Transaction Log Analysis to Improve OPAC Retrieval Results,” College and Research Libraries 59, no. 1 (1998): 39–50.
5. B. Jansen and A. Spink, “Methodological Approach in Discovering User Search Patterns through Web Log Analysis,” Bulletin of the American Society for Information Science 27, no. 1 (2000): 15–17.
6. M. Cooper, “Usage Patterns of a Web-based Library Catalog,” Journal of the American Society for Information Science and Technology 52, no. 2 (2001): 137–48.
7. P. Wang, W. Hawk, and C. Tenopir, “Users’ Interaction with World Wide Web Resources: An Exploratory Study Using a Holistic Approach,” Information Processing and Management 36, no. 2 (2000): 229–51.
8. T. Peters, “When Smart People Fail: An Analysis of the Transaction Log of an Online Public Access Catalog,” Journal of Academic Librarianship 15, no. 5 (1989): 267–73.
9. S. Jones, S. Cunningham, R. McNab, and S. Boddie, “A Transaction Log Analysis of a Digital Library,” International Journal of Digital Libraries 3 (2000): 152–69.
10. R. Jantz, “An Approach to Managing Vocabulary for Databases on the Web,” Cataloging & Classification Quarterly 28, no. 3 (1999): 55–66.