Evaluative Usage-based Metrics for the Selection of E-journals

Karla L. Hahn and Lila A. Faulkner

Karla L. Hahn is the Collection Management Team Leader in the University of Maryland Libraries; e-mail: kh86@umail.umd.edu. Lila A. Faulkner is the Electronic Publications Graduate Assistant in the University of Maryland Libraries; e-mail: lf71@umail.umd.edu.

To measure the value of print journals, librarians have gathered a range of statistics and developed a variety of metrics. Similar work to assess the value of e-journals has just begun. This article explores the usefulness of available e-journal usage statistics and develops three metrics and three benchmarks based on those metrics. The proposed metrics build on earlier work that assesses the value of print journals, although the earlier work is modified extensively to fit the e-journal arena. The e-journal statistics and metrics are further transformed to address a completely new area of application: the evaluation of potential purchases. Statistics and metrics are used to build three benchmark measures for assessing e-journal candidates for purchase. A comparison of Science and Nature site licenses illustrates the utility of the assessment benchmarks. The benchmarks, metrics, and statistics developed here provide a reliable framework for assessing both current collections and candidate collections of e-journals. Implications for standards development are clear; content measures are desperately needed for the development of an effective suite of e-journal statistics.

Even in the information age, the more things change, the more they remain the same. Librarians did not leave behind dramatically high serial prices with the conversion to electronic journals. If anything, the problem has worsened because many publishers seem to have used new electronic formats to justify even higher prices.
With their budgets continually stretched tighter, librarians need to constantly evaluate their current collections and potential purchases to determine their value to the librarians’ missions. E-journals, however, present the same problems of valuation posed by print journals. To assess the increasing prices of e-journals, librarians must find a way to compare journals with different amounts and quality of content, publishers, and subject matter. It is important to consider users and their demand for a particular journal. Who will use the journal? How often will they use it? Under what circumstances will they use it? E-journals further complicate the picture with complex pricing structures, online searching, hyperlinks, and server reliability. Regardless, the same problem of comparison remains: What a publisher charges for a particular journal does not necessarily reveal anything about its relative value.

College & Research Libraries May 2002

At first glance, this article’s title presents a contradiction: How can usage-based indicators of value help librarians select resources not yet in the collection? The puzzle’s answer lies in first developing metrics that indicate the value of current electronic resources based on the vendor’s use and content data and then using those metrics to develop a set of benchmarks for the new resources. A series of calculations is needed to present a full picture of value in light of the price of the resource, the amount of content available, and the quantity of usage it receives. Although the metrics and benchmarks cannot provide an objective measurement of value, they do offer objective descriptive information useful for evaluating comparable resources.

In this article, the authors describe metrics developed to evaluate e-journals and to help librarians select additions to that collection. The first section reviews earlier efforts to assess the value of journals, both print and electronic.
The next section explains the metrics created at the University of Maryland to assess value in the current collection there and the benchmarks designed to assist in selection. The metrics put the data provided by vendors on use into a context that allows librarians to compare the subscription prices and to assess the value of different journals. Finally, the article addresses the future of e-journal usage statistics.

Foundations of Value Assessment

This work continues a long struggle by librarians to measure the value of a journal. Unlike toasters or hammers, journals are not interchangeable, which makes simple cost comparisons difficult. World Politics and the American Journal of Political Science address the same audience, but no university library would attempt to persuade its political science faculty that only one title is needed. Even beyond uniqueness of content, other challenges make comparison difficult. One journal might contain one hundred large pages; another might contain two hundred smaller pages. Some journals offer news and job advertisements; others offer bibliographies or opinion pieces. One subscription offers twelve issues; another offers four.

Researchers have developed a number of analyses designed to address the task of determining the value of a print journal, to create what Barbara Meyers and Janice L. Fleming have referred to as a “reasonably equitable quantitative evaluation tool” to account for variations from title to title.1 Two primary approaches for assessing value have emerged. One examines price in the context of a journal’s content. The best known of such studies is Henry H.
Barschall’s evaluation of the cost-effectiveness of physics titles, which analyzed cost per 1,000 characters and cost per impact factor to account for variations in the amount and quality of content.2 Following Barschall’s example, other researchers have compared journals on the basis of price per 1,000 words, price per page, and cost per character.3 Most recently, the University of Wisconsin-Madison Libraries have updated Barschall’s studies, using the same methodology.4

The second major approach examines value in the context of use of the journal. Studies of the use of print journals have a long history. Although such studies are time-consuming, many libraries have found them invaluable because the data can help identify potential cancellations, which faculty are more likely to support because the libraries are using the best data available. The University of Wisconsin-Madison Libraries have led the way in cost-per-use studies of print journals and have used their analyses to make journal cancellation decisions since 1995.5

A third approach provides a rare synthesis of the first two approaches. Carol Tenopir and Donald W. King discussed two metrics that reflect the value of the information in a journal: purchase value of a journal, based on the amount that researchers are willing to pay to use a journal, versus use value, based on the benefits that come from use. When the purchase value exceeds the use value, researchers turn away from subscriptions to alternative sources of journal articles. Tenopir and King have suggested that these metrics remain appropriate in the world of e-journals.6

Regardless of the method used, all researchers have agreed that a print journal’s value cannot be assessed with content evaluation alone.
Rather, the print journal’s value must be put in the context of the amount of content it offers and the users it potentially serves.7 For example, if no one on a particular campus reads the articles in a particular journal, that journal has little value to the campus, even if the quality of its content is high. At the same time, use is not completely unrelated to either quality or quantity. For example, if a journal offers less content for the same price as its peers, that also must be taken into consideration.

Although the need for context now seems like common sense in the print arena, surprisingly few researchers have applied lessons from the assessment of print journals to the e-journal or collections of e-journals.8 This shortcoming in the literature seems to stem, in part, from the focus on evolving standards for e-journal statistics rather than on their application.9 Instead of responding to the need for data to create effective metrics for assessing titles or collections, existing standards for e-journal statistics appear strongly derivative of database statistics. The standards demand counts of particular kinds of uses, but little information on the title or collection itself. For example, one does not see demands for data on the amount of content online for particular journals or collections that Barschall considered essential for determining the value of a print journal.

Without this information on electronic content, measures of the value of e-journals become difficult to create because the context needed to understand the available data is missing.
The Council on Library and Information Resources’s White Paper on E-journal Usage Statistics emphasized this need for context to evaluate usage statistics, but the need has yet to be translated into a demand for information on content to be included in online statistics.10 For example, the International Coalition of Library Consortia (ICOLC) guidelines recommend the provision of information on the number of queries, turn-aways, and items examined, but not on the type or amount of content currently offered online.11 Content measures are essential to both librarians and publishers seeking to interpret and apply usage data. Part of the problem with developing effective metrics lies in the lack of information on electronic content for particular journals or collections. Librarians cannot necessarily compile information on content themselves. The task of counting the number of articles, pages, or words for a particular e-journal or collection of e-journals can be overwhelming, and the amount of content in many collections changes constantly.

Despite these difficulties, some currently available e-journal usage statistics illustrate new possibilities for assessing relative value and suggest how a broad set of usage statistics could be useful for collection management. Perhaps because usage statistics are hard to get, they have been factored into valuations of e-journals more rarely than into those of print journals. However, usage becomes even more important to assessing value in the electronic arena because libraries often pay for access and not ownership. In the ownership context, libraries can purchase materials just in case they prove useful in the future; it makes little sense to spend funds on access that is not used.

A small body of work exploring usage statistics has only recently developed for e-journals, although the development of metrics to put that usage into the context of content remains unexplored.
In a pioneering analysis of HighWire Press statistics, Linda Mercer suggested applying electronic usage information to purchase and cancellation decisions, staff training decisions, and user studies.12 Deborah D. Blecic, Joan B. Fiscella, and Stephen Wiberley also explored the collection management implications of usage statistics, focusing on the potential for cancellation assessments. They noted that “If a library cannot afford to keep all titles, the question becomes, What percentage of use does the library want to meet? … It can then ascertain the least expensive mix of titles that meets its goal and cancel the others.”13 Although these discussions provide a starting point, a great deal of ground remains to be explored if e-journal assessment is to approach even the level of existing print journal assessments.

Development of E-journal Metrics

To create measures of value for e-journals, the authors of this article have carried the agreed-upon standard in the print world, Meyers and Fleming’s “reasonably equitable quantitative evaluation tool,” into the electronic context.14 Building on earlier analyses examining cost per article, the authors have integrated the information provided by publishers on the usage of e-journals. In this way, the authors not only can assess the cost per unit of content, but also can examine what Mercer has called “performance measures” to determine the value that users derive from particular publications.15 The result builds on studies of print journals and takes advantage of the ready availability of usage statistics for some e-journals. The analysis offers a way to measure the value of e-journals that incorporates both their relative content and the users’ relative demand for them.
Evaluation of Current Collections

In the authors’ analyses, the statistics provided by HighWire Press have proved the most valuable.16 HighWire Press provides a variety of statistics on use and the amount of content online for each title. A subset was used to develop the authors’ metrics: HighWire Press’s “number of full-text articles in HTML format viewed,” its “number of PDF files downloaded,” and the total articles online, in addition to the University of Maryland’s subscription price.17 Articles are the primary unit of content measured for a title. The subset consists of the statistics that focus on the full text of articles because, for users, access to articles remains the primary benefit of online access. Users value the intellectual units found in journals: the articles, not particular words or pages.18

From the HighWire Press statistics, the authors derived three metrics to evaluate the value and the performance of licensed e-journals and collections of e-journals. Building on valuations commonly performed with print journals, the authors derived an average cost per access and an average cost per article. The average cost per access represents the average cost of each access event to a full-text article and is calculated by dividing the e-journal’s subscription price by the number of articles accessed.19, 20

$$\text{Average cost per access} = \frac{\text{Subscription price}}{\text{Number of articles accessed}}$$

The average cost per article is computed by dividing the e-journal’s subscription price by the total number of articles offered online by the e-journal.

$$\text{Average cost per article} = \frac{\text{Subscription price}}{\text{Number of articles online}}$$

These metrics allow comparison of the value of e-journals with different online content and can indicate whether a site license is more cost-effective than the purchase of individual articles.
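The two cost metrics above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures (price, HTML views, PDF downloads, articles online) are hypothetical and do not come from the article, and summing HTML views and PDF downloads to obtain the number of articles accessed is an assumption about how the vendor counts are combined.

```python
def average_cost_per_access(subscription_price: float, articles_accessed: int) -> float:
    """Subscription price divided by the number of full-text accesses."""
    if articles_accessed <= 0:
        raise ValueError("articles_accessed must be positive")
    return subscription_price / articles_accessed


def average_cost_per_article(subscription_price: float, articles_online: int) -> float:
    """Subscription price divided by the number of articles offered online."""
    if articles_online <= 0:
        raise ValueError("articles_online must be positive")
    return subscription_price / articles_online


# Hypothetical title (illustrative numbers only, not data from the article).
price = 1200.0
html_views = 1500
pdf_downloads = 900
articles_online = 8000

# Assumption: a full-text access is either an HTML view or a PDF download.
accesses = html_views + pdf_downloads

print(f"cost per access:  ${average_cost_per_access(price, accesses):.2f}")
print(f"cost per article: ${average_cost_per_article(price, articles_online):.2f}")
```

Keeping the two denominators separate makes it easy to see when a title is cheap per article offered but expensive per article actually used.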
In addition to metrics adapted from existing print metrics, the authors developed a novel metric, content-adjusted usage, which allows the total number of articles offered online to be compared to the total number of HTML articles viewed and the total number of PDF files downloaded. This metric addresses the question, Out of the total articles offered by an e-journal, what proportion did our users access? Content-adjusted usage is calculated by dividing the number of full-text accesses of an article (the number of times an HTML file is viewed or a PDF file is downloaded) by the number of available articles.

$$\text{Content-adjusted usage} = \frac{\text{Number of full-text accesses}}{\text{Number of articles online}}$$

This metric provides a way to compare the usage of journals that offer widely differing numbers of articles online. For example, one learns little from discovering that, in 1999, the Journal of General Physiology (JGP) had thirty-eight full-text accesses and Pharmacological Reviews (PR) had twenty-eight full-text accesses unless one also knows that JGP offers 3,104 articles online and PR offers only 543 articles. The metric indicates that JGP had a content-adjusted usage of 1.22 percent compared to PR’s value of 5.16 percent, which puts JGP’s seemingly higher usage into perspective.

The metrics allow equitable assessment of the e-journals’ value in terms of both content offered and usage. Usage adds a valuable dimension to the examination of relative value. For example, in 2000, these variables might have been used to examine the relative prices and performance of Journal of Cell Biology (JCB) and Proceedings of the National Academy of Sciences (PNAS), both with annual subscription prices of $880 for both print and online access.
At first glance, it might seem that Journal of Cell Biology and PNAS offered similar value because their subscription prices were the same, but the numbers present a slightly more complicated picture. An examination of the titles’ content-adjusted usage as of the end of 2000 revealed that only 4.9 percent of JCB’s more than 13,500 full-text articles were displayed or downloaded compared to more than 19 percent of PNAS’s more than 26,000 articles. Second, although PNAS’s subscription price averaged out to $.03 per article online and JCB provided articles online at a mere $.07 per article, a look at the average cost per access showed the numbers in a different light. PNAS’s average cost per access at $.17 was only a little higher than its average cost per article whereas Journal of Cell Biology’s average cost per access was $1.32. Selectors could have used the results as a framework to begin examination of the worth of these e-journals in the context of their particular disciplines and user populations.

As librarians receive appropriate usage statistics from more of their vendors, they can continue to create a landscape of acceptable prices and costs. The various statistics and metrics can be seen as mapping a multivariate space of products described by unique combinations of values such as price, article content, and usage. If librarians view their collections as mapping landscapes in the available space, they can more clearly assess where the boundaries lie for the values they define as reasonable. Items within the collection can be compared and items being considered for purchase can be evaluated based on whether they fit into the librarians’ landscape of reasonable values or fall outside it. This concept of a multivariate landscape for value assessment offers a more sophisticated evaluative environment than the isolated application of single measures.
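The content-adjusted usage comparison described in this section can be checked with a short sketch. The function name and the dictionary layout are illustrative choices; the access and article counts are the 1999 figures quoted in the text for the Journal of General Physiology and Pharmacological Reviews.

```python
def content_adjusted_usage(full_text_accesses: int, articles_online: int) -> float:
    """Full-text accesses divided by the number of articles offered online."""
    if articles_online <= 0:
        raise ValueError("articles_online must be positive")
    return full_text_accesses / articles_online


# (full-text accesses, articles online) as quoted in the text for 1999.
titles = {
    "Journal of General Physiology": (38, 3104),
    "Pharmacological Reviews": (28, 543),
}

usage = {
    name: content_adjusted_usage(accesses, online)
    for name, (accesses, online) in titles.items()
}

for name, value in usage.items():
    # Format the ratio as a percentage with two decimal places.
    print(f"{name}: {value:.2%}")
```

Although the first title has more raw accesses, dividing by the amount of content online reverses the ranking, which is the point the metric is meant to surface.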
Selection and Evaluation of Potential Purchases

Traditionally, usage statistics have supported evaluations of past purchases, but descriptive statistics also can be transformed to produce three benchmarks for the evaluation of potential purchases. Two of the metrics already described (average cost per article and content-adjusted usage) can be used to create three new benchmark metrics for analyzing a potential purchase. The key to developing the evaluative benchmarks lies in the identification of comparable peer resources with known usage. Each benchmark uses data already known for both the product currently in the collection and the candidate purchase. This technique can be applied to the evaluation of a single e-journal title or an e-journal collection.

Analysis of a potential purchase begins with a peer product already in the collection. The peer product should cover a similar subject area, address a like audience, and possibly share a history of comparable use as a print publication. Consider a hypothetical case where an e-journal collection is proposed for purchase. The collection under consideration, the Candidate Collection, is priced at $25,000 per year and offers 45,000 articles online. The library already subscribes to a peer e-journal collection, the Licensed Collection. Last year, the Licensed Collection was priced at $10,000 and experienced 25,000 full-text accesses. It contains 50,000 online articles in subject areas similar to the Candidate Collection. The titles in both collections are thought to have received similar usage in the past in print form. Table 1 illustrates the data known about the two collections.

To evaluate the Candidate Collection, the first step is to compare the two collections applying a metric already used to evaluate the past performance of licensed e-journals and collections: the average-cost-per-article metric.
With a subscription price of $25,000 and 45,000 articles, the Candidate Collection has an average cost per article of $.55, substantially higher than the Licensed Collection’s average cost per article of $.20 for 50,000 articles at $10,000. However, this alone may not be a fair basis of comparison between the collections. An article that receives twice as much use as another might be worth twice the price. The more expensive articles also may provide enough value to justify the price. The benefit of the metric is that it quantifies the magnitude of the price differential for content alone. The selector still determines whether the differential is substantial enough to reject purchase or whether another resource might offer a better return on investment.

Regardless, the single metric probably does not offer enough information to allow a fully informed decision. To aid selectors, the authors have developed three additional analyses in the form of benchmark metrics that incorporate assessments of likely usage levels: the cost-based usage benchmark, the content-based usage benchmark, and the cost per access at the content-based usage benchmark.

The cost-based usage benchmark determines how many full-text accesses a potential e-journal or collection purchase must receive in a year for it to achieve the same average cost per access as a peer e-journal or collection already licensed by the library. Using the previous examples of peer resources, the Candidate Collection and Licensed Collection, it is possible to determine how many full-text accesses the Candidate Collection would have to receive in order to attain the same value of cost per access as the Licensed Collection.
In the past year, the Licensed Collection had 25,000 full-text accesses of its articles, giving it an average cost per access of $.40. The ultimate usage of the Candidate Collection is unknown. To estimate the level it must achieve to meet the value of the Licensed Collection, the average-cost-per-access metric is transformed into a benchmark and calculated in the following manner:

$$\text{Cost-based usage benchmark} = \frac{\text{Price of desired resource}}{\text{Cost per access of peer product in collection}}$$

To determine the cost-based usage benchmark for the Candidate Collection, the collection’s price of $25,000 is divided by the Licensed Collection’s average cost per access of $.40, giving the Candidate Collection a cost-based usage benchmark of 62,500 uses. Therefore, for the Candidate Collection to achieve a cost per access level comparable to that of the Licensed Collection would require 62,500 full-text accesses in a subscription year, 37,500 more than the Licensed Collection’s 25,000 full-text accesses. It is up to the selector to determine whether it is reasonable to expect use of the Candidate Collection to reach that level.

TABLE 1
Comparison of Candidate Collection to Licensed Collection

| | Licensed Collection | Candidate Collection |
|---|---|---|
| Price | $10,000 | $25,000 |
| Total number of online articles as of the end of the year | 50,000 | 45,000 |
| Total annual number of full-text accesses to the articles in the collection | 25,000 | Unknown |
| Average cost per article | $.20 | $.55 |
| Content-adjusted use | 0.50 | Unknown |
| Average cost per access | $.40 | Unknown |

The second benchmark metric is the content-based usage benchmark. This metric determines how many full-text accesses a proposed purchase must receive in a year in order to provide the same value in terms of content-adjusted usage as a peer product currently in the collection.
The metric allows the selector to assess the value of a proposed purchase from the further angle of the number of full-text accesses adjusted for collection size. Transforming the content-adjusted usage metric in the following manner creates the benchmark:

$$\text{Content-based usage benchmark} = \text{Collection size of desired resource} \times \text{Content-adjusted usage of peer product in collection}$$

Using the previous example, for the Candidate Collection to have a content-adjusted usage equivalent to the peer Licensed Collection, its 45,000 articles must reach the Licensed Collection’s usage of 0.50, or 50 percent. Therefore, the Candidate Collection would have a content-based usage benchmark of 22,500 full-text accesses (equal to 45,000 articles multiplied by 0.50 uses per article). The selector then can consider whether the library can expect the Candidate Collection to provide a content-based usage value similar to the Licensed Collection.

The third benchmark metric is the cost per access at the content-based usage benchmark. This metric takes the content-based usage benchmark a step further by calculating the cost per access at that level of usage. For example, if the Candidate Collection were to achieve 22,500 accesses per year (equivalent to the Licensed Collection’s content-based usage value of 0.50), its cost per access at the content-based usage benchmark would be $1.11, a rate that exceeds the Licensed Collection’s cost per access by $.71.

The usage metrics (summarized in table 2) do not predict the level of usage but, rather, give selectors points of comparison for assessing the likelihood of the Candidate Collection providing value equivalent to or greater than the Licensed Collection.
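The three benchmarks can be reproduced for the hypothetical Candidate and Licensed Collections with a minimal Python sketch. The function and variable names are illustrative and not taken from the article; the inputs are the prices, article counts, and access count given in the worked example.

```python
def cost_based_usage_benchmark(candidate_price: float, peer_cost_per_access: float) -> float:
    """Accesses the candidate needs to match the peer's average cost per access."""
    return candidate_price / peer_cost_per_access


def content_based_usage_benchmark(candidate_articles: int, peer_content_adjusted_usage: float) -> float:
    """Accesses the candidate needs to match the peer's content-adjusted usage."""
    return candidate_articles * peer_content_adjusted_usage


def cost_per_access_at_content_benchmark(candidate_price: float, content_benchmark: float) -> float:
    """Candidate's cost per access if it reaches the content-based usage benchmark."""
    return candidate_price / content_benchmark


# Licensed Collection (peer already held), from the worked example.
licensed_price, licensed_articles, licensed_accesses = 10_000, 50_000, 25_000
# Candidate Collection (proposed purchase); its usage is unknown.
candidate_price, candidate_articles = 25_000, 45_000

peer_cost_per_access = licensed_price / licensed_accesses          # $.40
peer_content_adjusted_usage = licensed_accesses / licensed_articles  # 0.50

cost_benchmark = cost_based_usage_benchmark(candidate_price, peer_cost_per_access)
content_benchmark = content_based_usage_benchmark(candidate_articles, peer_content_adjusted_usage)
cost_at_content = cost_per_access_at_content_benchmark(candidate_price, content_benchmark)

print(f"cost-based usage benchmark:    {cost_benchmark:,.0f} accesses")
print(f"content-based usage benchmark: {content_benchmark:,.0f} accesses")
print(f"cost per access at content-based benchmark: ${cost_at_content:.2f}")
print(f"difference from peer cost per access:       ${cost_at_content - peer_cost_per_access:.2f}")
```

None of these figures predicts usage; each one states what usage or cost would follow if the candidate performed like its peer on one dimension.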
Together, the metrics provide a clearer picture of the levels of usage required for the Candidate Collection to provide usage value comparable to the existing investment made by the library in the Licensed Collection. The cost-based usage benchmark determines the number of full-text accesses the untried product would need to receive to achieve a comparable cost per access. The content-based usage benchmark examines how many full-text accesses the untried product must receive to experience comparable levels of usage per article. Assessing cost per access at the content-based usage benchmark helps put the content-based usage measure of value into perspective by calculating the average cost per access when a potential purchase achieves its content-based usage benchmark.

TABLE 2
Evaluative Metrics for Candidate Collection (based on Licensed Collection)

| | Candidate Collection |
|---|---|
| Cost-based usage benchmark | 62,500 |
| Content-based usage benchmark | 22,500 |
| Cost per access at the content-based usage benchmark | $1.11 |

The metrics above provide objective points of comparison between e-journal products. They adjust for variations in pricing, collection size, and usage rates. The need to examine usage becomes particularly urgent when libraries purchase temporary access to, rather than ownership of, a resource. In contrast to print materials that can be purchased and stored until needed at a distant point in the future, the limited funds of most libraries make it unfeasible to pay for access year after year to materials that are not used. The metrics provide information to the collection manager while allowing him or her to determine acceptable levels of investment and anticipated usage. A selector could conclude that based on the comparability of the content in the Candidate Collection and the Licensed Collection, it is reasonable to expect comparable content-based usage.
Further, the selector might be comfortable with the anticipated price differential of $.71 per access in the two collections.

A real-world analysis of information products illustrates the utility of this approach. The potential purchases are two Nature site licenses, offered at different times and varied in price and amount of online content. The licensed product is the online version of Science. All site licenses are based on the size of the user community at the University of Maryland. The prices used are rounded approximations of quotes offered to the University of Maryland, not the exact prices quoted. Because the two offers from Nature generate different evaluative metrics, this scenario maps an arresting assessment landscape. Tables 3 and 4 provide the statistics, metrics, and benchmarks for the three licenses.

TABLE 3
Comparison of Science and Nature Site License Offers

| | Science (2000) | Nature (Fall 2000) | Nature (Spring 2001) |
|---|---|---|---|
| Price | $5,500 | $22,000 | $6,500 |
| Total number of online articles as of end of year | 16,347 | 2,711* | 30,000** |
| Total annual number of full-text accesses to articles | 12,703 | Unknown | Unknown |
| Average cost per article | $.34 | $8.12 | $.22 |
| Content-adjusted use | 0.771 | Unknown | Unknown |
| Average cost per access | $.43 | Unknown | Unknown |

*Estimate based on the average number of articles published per month as reported for 1997–1999 in Journal Citation Reports and extrapolated for the period of coverage of July 1997 through December 2001 offered by Nature in its initial license. Journal Citation Reports counts only research articles, not news reports, which corresponds to the content originally offered in the 2000 Nature license pricing.

**Based on estimates provided by staff at Nature. The 2001 license offers full access to all content published in Nature, hence the substantial difference in number of articles.
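The benchmark figures for the two Nature offers can be derived from the Table 3 inputs with a short sketch. It takes the reported Science values (average cost per access of $.43 and content-adjusted use of 0.771) as given rather than recomputing them, uses `decimal.Decimal` to avoid floating-point surprises, and drops fractional accesses, which is an assumption about how the published benchmark counts were rounded. All names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Peer values for Science (2000), taken as reported in Table 3.
PEER_COST_PER_ACCESS = Decimal("0.43")
PEER_CONTENT_ADJUSTED_USE = Decimal("0.771")

# (price, articles online) for each Nature offer, from Table 3.
offers = {
    "Nature (Fall 2000)": (Decimal("22000"), Decimal("2711")),
    "Nature (Spring 2001)": (Decimal("6500"), Decimal("30000")),
}


def benchmarks(price: Decimal, articles_online: Decimal):
    """Return (cost-based, content-based, cost per access at content-based)."""
    cost_based = price / PEER_COST_PER_ACCESS
    content_based = articles_online * PEER_CONTENT_ADJUSTED_USE
    cost_at_content = price / content_based
    return (
        int(cost_based),     # assumption: fractional accesses are dropped
        int(content_based),  # assumption: fractional accesses are dropped
        cost_at_content.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP),
    )


results = {name: benchmarks(price, articles) for name, (price, articles) in offers.items()}

for name, (cost_based, content_based, cost_at_content) in results.items():
    print(f"{name}: cost-based {cost_based:,}; "
          f"content-based {content_based:,}; "
          f"cost per access at content-based ${cost_at_content}")
```

Running the same three calculations against both offers makes the contrast between them easy to read off: the first offer needs far more use than Science received to reach cost parity, while the second needs only somewhat more.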
TABLE 4
Evaluative Metrics for Nature

| | Nature (Fall 2000) | Nature (Spring 2001) |
|---|---|---|
| Cost-based usage benchmark | 51,162 | 15,116 |
| Content-based usage benchmark | 2,090 | 23,130 |
| Cost per access at the content-based usage benchmark | $10.53 | $.28 |

At first glance, the initial Nature offer, the 2000 Nature License, would seem to provide less value than Science for Nature’s subscription price of $22,000. The 2000 Nature License’s approximately 2,711 articles had an average cost of $8.12 per article, almost twenty-four times the average cost of an article for Science at $.34. To temper this comparison, however, Nature offered only those articles containing research content (albeit embargoed for three months) whereas Science’s article count included news articles, book reviews, and editorials. Moving beyond the cost per article to content-based usage, the numbers appear less extreme. Science achieved a 0.77, or a 77 percent, usage of its 16,347 articles in 2000. Because the Nature 2000 License would give access to only 2,711 articles, Nature would require 2,090 full-text accesses to achieve similar content-based usage value to Science, but the cost per use at this level would be a whopping $10.53 per access. To achieve the same cost per access as Science at $.43, the 2000 Nature License would need 51,162 full-text accesses in a year, four times as much usage as Science received on campus in 2000.

In its second offer, the 2001 Nature License, Nature closed this disparity. The 2001 Nature License, with more than ten times as much content at a third of the price, had a price per article of $.22, $.12 lower than the cost of Science at $.34 per article. In addition, the type of content offered is more similar because the article count for both e-journals would contain news and opinion pieces as well as research articles.
To achieve a similar content-based usage, because of the larger number of articles, Nature would require more full-text accesses for its 30,000 articles. It would have to have at least 23,130 full-text accesses to meet the usage level of Science of 0.77, or 77 percent; however, the cost per access at that level would be only $.28, $.15 less than the current Science cost per access. To achieve a cost per access equal to that of Science, Nature need only receive 15,116 uses (1.19 times the usage of the online version of Science in 2000).

If one assumes that the content-based usage benchmark sets a reasonable expectation of full-text access for Nature, the differences between the two offers become clearer. The anticipated cost per use under the first Nature License would be more than twenty-four times the cost per access of Science articles in 2000 at the content-based usage benchmark. Under the 2001 license terms, Nature articles would have a lower cost per access than Science articles at the content-based usage benchmark.

In the fall of 2000, a selector could evaluate the likelihood that Nature would receive four times the usage levels of Science (assuming he or she wanted to achieve cost per use parity). Similarly, the selector could consider whether it is reasonable to pay an estimated $10.53 per access if Nature sees usage at the per article level comparable to what has been observed in the recent past with the online version of Science. The benchmarks do not answer these questions, but they do provide a powerful framing system for evaluating selection decisions. The metrics can create a landscape of acceptable costs or, alternatively, a landscape of reasonable usage. The landscape, although useful, does not provide a directive for purchase; it simply provides a context for decision-making.

Some obvious questions remain.
A major concern is the determination of an acceptable average cost per full-text access for any collection of articles. A number of factors could influence acceptable levels of cost in the electronic resource and print landscapes, such as budget levels or quality of content. What constitutes an acceptable level of usage, and thus acceptable benchmark usage levels, depends on the clientele of a library, the disciplinary focus of the content, the currency of the content, the perceived quality of the content, and a variety of other factors. Likewise, the setting of reasonable benchmarks of cost adjusted for content or use is context dependent. Available funding, availability of alternate information sources, level of need, or peculiarities of the local environment always contextualize such decisions.

Another key issue is the comparability of the peer resource in the collection to the purchase candidate. A high level of comparability maximizes the utility of the benchmark metrics. Peer resources should address similar subject areas. An e-journal in art could not develop useful benchmark metrics for a physics e-journal because it would not offer a realistic comparison. Peers also should have similar usage rates in print. Two physics e-journals with comparable usage rates in print could be expected to generate similar usage rates in electronic form. However, a variety of factors could alter the picture. It is not uncommon for electronic versions of print journals to offer a different amount of backfile or differ in the speed with which they load current issues. If users value access to current content highly, it might be worth paying a higher per article rate for more current content.

A third issue centers on the number of factors that can contribute to the levels of full-text access.
Although the focus here is on a year's worth of data, a selector would ideally need to review a number of years of data to get a clear understanding of the usage levels for particular products. A new product typically will not reach ongoing usage for some period of time. In addition, a number of other factors can help increase or decrease usage levels. Marketing of electronic resources can affect their rate of adoption and ultimate use. As linkage between electronic resources becomes more frequent, it can enhance the rate of use of electronic resources.

A different concern affecting both the original usage metrics and the benchmarks is the challenge of quantifying either use or content. The proposed metrics use article access as a proxy for use and the article as the basic unit to measure content, but these units are somewhat arbitrary. Usage may reflect different activities with different resources. Some content providers offer different versions of their articles for printing and on-screen browsing. A user may access an article two or more times to use it once. Different content providers are likely to categorize different types of content as articles; for instance, letters, editorials, and news pieces might be counted in some resources and not in others. Articles can vary considerably in length.

Despite these questions, the metrics provide a powerful system for the evaluation of selection decisions. As information providers struggle to stretch their limited budgets, they must constantly reevaluate the value of their current collections and potential purchases. The metrics allow the selector to use known data to examine the full-text access and subscription price of both current items and potential purchases from a variety of angles. These data can reinforce the selector's subjective judgments on cancellation and purchase decisions.
The usefulness of the data, however, depends on their availability in a consistent form from a variety of publishers so that selectors can create an assessment landscape that serves as a framework for their decisions. Demands for standards for publishers' usage statistics must begin to move beyond mere quests for data to the creation of standards that will advance the application of the statistics in collection management and other aspects of library administration.

Conclusion

This exploration of usage metrics and evaluative benchmarks suggests a world of possibilities and continuing conundrums in the search for measures of value to assist with collection development and management. Building on valuations of print journals, the authors have developed a variety of metrics to assess the value of e-journals and e-journal collections. The average cost per access, average cost per article, and content-adjusted usage allow the comparison of e-journals currently in the collection that have different amounts of online content. The cost-based usage benchmark, the content-based usage benchmark, and the cost per access at the content-based usage benchmark allow the comparison of a currently held resource with a desired peer product, even if the products have different amounts of content and value. The metrics and benchmarks enhance the value and utility of the basic statistics provided by a publisher. Both provide decision support. The benchmarks take the metrics a step further and provide support for purchase decisions.

The creation of these metrics and benchmarks highlights a number of shortcomings in the usage statistics that publishers currently provide. First, because the metrics and benchmarks are built on the publisher's statistics, ambiguity and inconsistencies in the units counted pose real problems.
It is not always clear what publishers count as an “article” or an “access.” Fortunately, metrics do not have to be perfect to be helpful. Selectors constantly make qualitative and quantitative assessments based on imperfect information. Metrics need not provide absolute answers to collection development or collection management questions to help selectors make better decisions.

A larger problem is the paucity of needed statistics. For evaluative landscapes to develop, librarians must insist on statistics relevant to value assessment. The librarians’ analysis demonstrates the critical need for content measures. Constantly changing amounts of content and collections holding thousands of articles prevent the authors from cost-effectively and reliably gathering their own data on content. Standards remain the best hope for information professionals to effectively communicate the statistics they need to publishers. It is therefore disappointing that e-journal statistics appear to still be mired in the measures of databases, largely conceptualized as electronic indexing and abstracting resources. Judy Luther is right to insist that context is essential to the interpretation of usage.21 The information community needs to determine what additional information can provide the needed context and insist that it be made explicit as part of the contract for access.

The metrics and benchmarks show that article counts can be used to enhance collection-building and management. That at least one publisher is already providing them suggests that counts could be easily made available. Indeed, article counts would only be a beginning; the literature on content measures for print journals suggests alternatives such as counts of words or even characters. With developing e-publication standards such as XML, such counts should become feasible if they are not already.
Further breakdowns of both content and usage measures are conceivable, such as usage of the current year's content or a comparison of the usage of research articles and editorials. The analyses presented here also demonstrate some approaches for more rational price development by publishers. As we begin to develop our first crude tools for evaluating electronic journals, it is satisfying to see their power, but daunting to recognize their crudeness. The metrics and benchmarks the authors have explored suggest the possibilities of what might be accomplished and suggest how much might be lost if librarians are unable to obtain the data they need to function as stewards for their collections and acquisitions budgets.

Notes

1. Barbara Meyers and Janice L. Fleming, “Price Analysis and the Serials Situation: Trying to Solve an Age-old Problem,” Journal of Academic Librarianship 17, no. 2 (May 1991): 87, 86–92.
2. Henry H. Barschall, “The Cost of Physics Journals,” Physics Today 39, no. 12 (Dec. 1986): 34–36; ———, “The Cost-Effectiveness of Physics Journals,” Physics Today 41, no. 7 (July 1988): 56–59.
3. For example, see American Mathematical Society, Journal Price Survey (1994–2001) (Providence, R.I.: American Mathematical Society, 2001), available online from ; Journal Price Study: Core Agricultural and Biological Journals (Ithaca, N.Y.: Faculty Taskforce, College of Agriculture and Life Sciences, Albert R. Mann Library, Cornell University, 1998), available online from ; Meyers and Fleming, “Price Analysis and the Serials Situation.”
4. George Soete, Measuring the Cost-Effectiveness of Journals: Ten Years after Barschall (Madison, Wis.: UW-Madison Libraries, 1999), available online from .
5.
University of Wisconsin-Madison Libraries, Journals & Journal Articles—Introduction (Madison, Wis.: UW-Madison Libraries, 2001), available online from ; University of Wisconsin-Madison Libraries, More Information about Journal Cost per Use Statistics (Madison, Wis.: UW-Madison Libraries, 1999), available online from .
6. Donald W. King and Carol Tenopir, “Designing Electronic Journals with 30 Years of Lessons from Print,” JEP the Journal of Electronic Publishing 4, no. 2 (Dec. 1998), available online from .
7. In If You Want to Evaluate Your Library (Champaign, Ill.: Univ. of Illinois, Graduate School of Library and Information Science, 1988): 8–9, F. W. Lancaster reflected this thinking in his discussion of one of Ranganathan’s Five Laws of Library Science: “Books Are for Use.” Under this principle, libraries must consider the cost-effectiveness of their resources “because of limited resources, $30 spent on a book that is little if ever used is $30 less available for an item (possibly a duplicate copy) of something that might be in great demand.”
8. An even larger question, which this analysis does not attempt to answer, is how to compare the value of print and electronic journals. As we develop means to more effectively assess and estimate the value of e-journals, it is also time to reassess the value of print journals with increasing subscription costs. As Suzanne D. Gyeszly has pointed out in “Electronic or Paper Journals? Budgetary, Collection Development, and User Satisfaction Questions,” Collection Building 20, no. 1 (2001): 10, 5–10: “Until sufficient and standard use data are unavailable [sic], the library must pay for dual subscriptions for paper and online versions, as contracts with major e-journal providers such as Elsevier do not presently allow cancellation of paper subscription, despite heavy use of electronic subscriptions of the same title.”
9.
Pioneering work in the development of standard e-journal statistics, based on early work with the JSTOR project, has occurred under the auspices of the International Coalition of Library Consortia (ICOLC). See International Coalition of Library Consortia, Guidelines for Statistical Measures of Usage of Web-based Information Resources (December 2001 Revision of Original November 1998 Guidelines), available online from . A more recent critique of e-journal statistics has been offered by Judy Luther in White Paper on Electronic Journal Usage Statistics (Washington, D.C.: Council on Library and Information Resources, 2000), available online from . The Association of Research Libraries funded the development of a set of metrics for electronic resources including e-journals. See Wonsick Shim, et al., Measures and Statistics for Research Library Networked Services: Procedures and Issues, ARL E-metrics Phase II Report (Washington, D.C.: Association of Research Libraries, 2001), also available online from .
10. Luther, White Paper on Electronic Journal Usage Statistics.
11. International Coalition of Library Consortia (ICOLC), Guidelines for Statistical Measures of Usage of Web-based, Indexed, Abstracted and Full-text Resources, available online from .
12. Linda Mercer, “Measuring the Use and Value of Electronic Journals and Books,” Issues in Science and Technology Librarianship, no. 25 (winter 2000), available online from .
13. Deborah D. Blecic, Joan B. Fiscella, and Stephen Wiberley, “The Measurement of Use of Web-based Information Resources: An Early Look at Vendor-supplied Data Using ICOLC Guidelines,” College & Research Libraries 62, no. 5 (Sept. 2001): 450, 434–53.
14. Meyers and Fleming, “Price Analysis and the Serials Situation,” 87.
15. Mercer, “Measuring the Use and Value of Electronic Journals and Books.”
16. HighWire Press, the Internet imprint of Stanford University Libraries, hosts more than 250 science, technology, and medicine journals online.
The home page for HighWire Press is located on the Web at .
17. HighWire Press, Detailed Online Article Counts, available online from , and Usage Statistics, available online from .
18. Issue level data also might reflect a relevant unit of content for users, but these data were unavailable for analysis.
19. With the exception of Science, all the HighWire Press subscriptions include both print and online access for one price. This raises the dilemma of how to calculate an e-journal’s subscription price. Several methods could be used, but what is important is to use a consistent approach. One sensible method is to use the price of the print/online bundle. Alternatively, when comparing online resources where access is priced independently of print subscription, as is the case with Science and Nature, the price of online access is the reasonable datum to use.
20. Note that the full-text accesses summarized in the attached statistics are not unique. An access event is counted regardless of whether it is to an article previously accessed or to a new article. For example, a single article in Science may be viewed in HTML format two times and downloaded as a PDF file once. Each of these accesses would be counted, giving a total of three “articles accessed.”
21. Luther, White Paper on Electronic Journal Usage Statistics.