Library Use of Web-based Research Guides

Jimmy Ghaphery and Erin White

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012

Jimmy Ghaphery (jghapher@vcu.edu) is Head, Library Information Systems, and Erin White (erwhite@vcu.edu) is Web Systems Librarian, Virginia Commonwealth University Libraries, Richmond, VA.

ABSTRACT

This paper describes the ways in which libraries are currently implementing and managing web-based research guides (a.k.a. Pathfinders, LibGuides, Subject Guides, etc.) by examining two sets of data from the spring of 2011. One set of data was compiled by visiting the websites of ninety-nine American university ARL libraries and recording the characteristics of each site's research guides. The other set of data is based on an online survey of librarians about the ways in which their libraries implement and maintain research guides. The paper concludes with a discussion of implications for the library technology community.

SELECTED LITERATURE REVIEW

While there has been significant research on library research guides, there has not been a recent survey either of the overall landscape or of librarian attitudes and practices. There has been recent work on the efficacy of research guides as well as strategies for their promotion. There is still work to be done on developing a strong return-on-investment metric for research guides, although the same could probably be said for other library technologies, including websites, digital collections, and institutional repositories.

Subject-based research guides have a long history in libraries that predates the web as a service-delivery mechanism. A literature-review article from 2007 found that research on the subject gained momentum around 1996 with the advent of electronic research guides, and that there was a need for more user-centric testing.1 By the mid-2000s, it was rare to find a library that did not offer research guides through its website.2 The format of guides has certainly shifted over time to database-driven efforts through local library programming and commercial offerings.

A number of other articles start to answer some of the questions about usability posed in the 2007 literature review by Vileno. In 2008, Grays, Del Bosque, and Costello used virtual focus groups as a test bed for guide evaluation.3 Two articles from the August 2010 issue of the Journal of Library Administration contain excellent literature reviews and look toward marketing, assessment, and best practices.4 Also in 2010, Vileno followed up on the 2007 literature review with usability testing that pointed toward a number of areas in which users experienced difficulties with research guides.5

In terms of cross-library studies, an interesting collaboration in 2008 between Cornell and Princeton Universities found that students, faculty, and librarians perceived value in research guides, but that their qualitative comments and content analysis of the guides themselves indicated a need for more compelling and effective features.6 The work of Morris and Grimes from 1999 should also be mentioned; the authors surveyed 53 university libraries and found that few had formal management policies for their research guides.7

Most recently, LibGuides has emerged as a leader in this arena with a popular software-as-a-service (SaaS) model; as such, it is not yet heavily represented in the literature.
A multichapter LibGuides LITA guide is pending publication and will cover such topics as implementing and managing LibGuides, setting standards for training and design, and creating and managing guides.

ARL GUIDES LANDSCAPE

During the week of March 3, 2011, the authors visited the websites of 99 American university ARL libraries to determine the prevalence and general characteristics of their subject-based research guides. In general, the visits reinforced the overarching theme within the literature that subject-based research guides are a core component of academic library web services. All 99 libraries offered research guides that were easy to find from the library home page. LibGuides was very prominent as a platform, in production at 67 of the 99 libraries. Among these, it appeared that at least 5 libraries were in the process of migrating from a previous system (either a homegrown, database-driven site or static HTML pages) to LibGuides.

In addition to the presence and platform, the authors recorded additional information about the scope and breadth of each site's research guides. For each site, the presence of course-based research guides was recorded. In some cases the course guides had a separate listing, whereas in others they were intermingled with the subject-based research guides. Course guides were found at 75 of the 99 libraries visited. Of these, 63 were also LibGuides sites. It is certainly possible that course guides are being deployed at some of the other libraries but were not immediately visible in visiting the websites, or that course guides may be deployed through a course management system. Nonetheless, it appears that the use of LibGuides encourages the presence of public-facing course guides. Qualitatively, there was wide diversity in how course guides were organized and presented, varying from a simple A-to-Z listing of all guides to separately curated landing pages specifically organized by discipline.

The number of guides was recorded for each LibGuides site. It was possible to append "/browse.php?o=a" to the base URL to determine how many guides and authors were published at each site; this PHP extension was the publicly available listing of all guides on each LibGuides platform. The "/browse.php?o=a" extension no longer publicly reports these statistics; however, findings could be reproduced by manually counting the number of guides and authors on each site. The authors confirmed the validity of this method in the fall of 2011 by revisiting four sites and finding that the numbers derived from manual counting were in line with the previous findings. Of the 63 LibGuides sites we observed, a total of 14,522 guides were counted from 2,101 authors, for an average of 7 guides per author. On average, each site had 220 guides from 32 authors (median of 179 guides; 29 authors). At the high end of the scale, one site had 713 guides from 46 authors. Based on the volume observed, libraries appear to be investing significant time toward the creation, and presumably the maintenance, of this content. In addition to creation and ongoing maintenance, such long lists of topics raise a number of usability issues that libraries will also be wise to keep in mind.8
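The counting step described above lends itself to light scripting. The Python sketch below shows one way guide and author tallies could have been gathered from a site's public browse listing at the time; it is an illustration only. The CSS selectors and the example site URL are assumptions, since the actual markup of the LibGuides browse page varied by site and the page no longer exposes these statistics.

```python
# Hypothetical sketch: tally guides and authors from a LibGuides public browse page.
# The "/browse.php?o=a" path comes from the study; the selectors below are assumed.
import requests
from bs4 import BeautifulSoup

def count_guides_and_authors(base_url):
    """Return (guide_count, author_count) for one LibGuides site."""
    page = requests.get(base_url.rstrip("/") + "/browse.php?o=a", timeout=30)
    soup = BeautifulSoup(page.text, "html.parser")
    guide_links = soup.select("a.guide-link")             # assumed selector
    authors = {cell.get_text(strip=True)
               for cell in soup.select("td.author")}      # assumed selector
    return len(guide_links), len(authors)

if __name__ == "__main__":
    sites = ["http://guides.example.edu"]                 # placeholder site list
    counts = [count_guides_and_authors(site) for site in sites]
    total_guides = sum(g for g, _ in counts)
    total_authors = sum(a for _, a in counts)
    print(f"{total_guides} guides from {total_authors} authors")
```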
SURVEY

The literature review and website visits call out two strong trends:

1. Research guides are as commonplace as books in libraries.
2. LibGuides is the elephant in the room, so much so that it is hard to discuss research guides without discussing LibGuides.

Based on preliminary findings from the literature review and survey, we looked to further describe how libraries are supporting, innovating, implementing, and evaluating their research guides. A ten-question survey was designed to better understand how research guides sit within the cultural environment of libraries. It was distributed to a number of professional discussion lists the week of April 19, 2011 (see appendix). The following lists were used in an attempt to get a balance of opinion from populations of both technical and public services librarians: code4lib, web4lib, lita-l, lib-ref-l, and ili-l. The survey was made available for two weeks following the list announcements.

Survey response was very strong, with 198 responses (188 libraries) received without the benefit of any follow-up recruitment. Ten institutions submitted more than one response; in these cases only the first response was included for analysis. We did not complete a response for our own institution. The vast majority (155, 82%) of respondents were from college or university libraries. Of the remaining 33, 24 (13%) were from community college libraries, with only 9 (5%) identifying themselves as public, school, private, or governmental. Among the college and university libraries, 17 (9%) identified themselves as members of the ARL, which comprises 126 members.9

In terms of "what system best describes your research guides by subject?" the results were similar to the survey of ARL websites. Most libraries (129, 69%) reported LibGuides as their system, followed by "customized open source system" and "static HTML pages," both at 20 responses (11% each). Sixteen libraries (9%) reported using a homegrown system, with three libraries (2%) reporting "other commercial system."

In terms of initiating and maintaining a guides system, much of the work within libraries seems to be happening outside of library systems departments. When asked which statement best described who selected the guides system, 67 respondents (36%) indicated their library research guides were "initiated by Public Services," followed closely by "more of a library-wide initiative" at 63 responses (34%). In the middle at 34 responses (18%) was "initiated by an informal cross-departmental group." Only 10 respondents (5%) selected "initiated by Systems," with the top-down approach of "initiated by Administration" gathering 14 responses (7%). When narrowing the responses to those sites that are using LibGuides or CampusGuides, the portrait is not terribly different, with 36% library-wide, 35% Public Services, 18% informal cross-departmental, 7% Administration, and Systems trailing at 4%.

Likewise there was not a strong indication of library systems involvement in maintaining or supporting research guides. Sixty-nine responses (37%) indicated "no ongoing involvement" and an additional 35 (19%) indicated "N/A we do not have a Systems Department." There were only 21 responses (11%) stating "considerable ongoing involvement," with the balance of 63 responses (34%) for "some ongoing involvement." Not surprisingly, there was a correlation between the type of research guide system and the amount of systems involvement. For sites running a "customized open source system," "other commercial system," or "homegrown system," at least 80% of responses indicated either "considerable" or "some" ongoing systems involvement.
In contrast, 37% of sites running LibGuides or CampusGuides indicated "considerable" or "some" technical involvement. Further, the LibGuides and CampusGuides users recorded the highest percentage (43%) of "no ongoing involvement," compared to 37% of all respondents. Interestingly, 20% of LibGuides and CampusGuides users answered "N/A we do not have a Systems Department," which is not significantly higher than all respondents for this question at 19%.

The level of interaction between research guides and enterprise library systems was not reported as strong. When asked "which statement best describes the relationship between your web content management system and your research guides?" 112 responses (60%) indicated that "our content management system is independent of our research guides," with an additional 51 responses (27%) indicating that they did not have a content management system (CMS). Only 12 respondents (6%) said that their CMS was integrated with their research guides, with the remaining 13 (7%) saying that their CMS was used for "both our website and our research guides."

A similar portrait was found in seeking out the relationship between research guides and discovery/federated search tools. When asked "which statement best describes the relationship between your discovery/federated search tool and your research guides?" roughly half of the respondents (96, 51%) did not have a discovery system ("N/A we do not have a discovery tool"). Only 12 respondents (6%) selected "we prominently feature our discovery tool on our guides," whereas more than double that number, 26 (14%), said "we typically do not include our discovery tool on our guides." Fifty-four respondents (29%) took the middle path of "our discovery tool is one of many search options we feature on our guides." In the case of both discovery systems and content management systems, it seems that research guides are typically not deeply integrated.

When asked "what other type of content do you host on your research guides system?" respondents selected from a list of choices as reflected in table 1.

Answer | Total | Percent | Percent (LibGuides/CampusGuides sites)
Course pages | 127 | 68% | 74%
"How to" instruction | 123 | 65% | 77%
Alphabetical list of all databases | 76 | 40% | 42%
"About the library" information (for example, hours, directions, staff directory, events) | 59 | 31% | 35%
Digital collections | 34 | 18% | 19%
Everything—we use the research guide platform as our website | 16 | 9% | 9%
None of the above | 17 | 9% | 2%

Table 1. Other Types of Content Hosted on Research Guides System

These answers reinforce the portrait of integration within the larger library web presence. While the research guides platform is an important part of that presence, significant content is also being managed by libraries through other systems. It is also consistent with the findings from the ARL website visits, where course pages were consistently found within the research guides platform. For sites reporting LibGuides or CampusGuides as their platform, inclusion of course pages and how-to instruction was even higher, at 74% and 77%, respectively.

Another multi-answer question sought to determine what types of policies are being used by libraries for the management of research guides: "which of the following procedures or policies do you have in place for your research guides?" Responses are summarized in table 2.
Answer | Total | Percent | Percent using LibGuides/CampusGuides
Style guides for consistent presentation | 105 | 56 | 58
Maintenance and upkeep of guides | 94 | 50 | 53
Link checking | 87 | 46 | 50
Required elements such as contact information, chat, pictures, etc. | 78 | 41 | 56
Training for guide creators | 73 | 39 | 43
Transfer of guides to another author due to separation or change in duties | 72 | 38 | 41
Defined scope of appropriate content | 43 | 23 | 22
Allowing and/or moderating user tags, comments, ratings | 36 | 19 | 25
None of the above | 36 | 19 | 19
Controlled vocabulary/tagging system for managing guides | 23 | 12 | 25

Table 2. Management Policies/Procedures for Research Guides

While nearly one in five libraries reported having none of the policies in place at all, the responses indicate that there is effort being applied toward the management of these systems. The highest percentage for any given policy was 56% for "style guides for consistent presentation." Best practices in these areas could be emerging, or many of these policies could be specific to individual library needs. As with the survey question on content, the research-guides platform also has a role, with the LibGuides and CampusGuides users reporting much higher rates of policies for "controlled vocabulary/tagging" (25% vs. 12%) and "required elements" (56% vs. 41%). In both of these cases, it is likely that the need for policies arises from the availability of these features and options that may not be present in other systems. Based on this supposition, it is somewhat surprising that the LibGuides and CampusGuides sites reported the same lack of policy adoption (none of the above; 19%).

The final question in the survey further explored the management posture for research guides by asking a free-text question: "how do you evaluate the success or failure of your research guides?" Results were compiled into a spreadsheet. The authors used inductive coding to find themes and perform a basic data analysis on the responses, including a tally of which evaluation methods were used and how often. One in five institutions (37 respondents, 19.6%) looked only to usage stats, while seven respondents (4%) indicated that their library had performed usability testing as part of the evaluation. Forty-four respondents (23.4%) said they had no evaluation method in place ("Ouch! It hurts to write that."), though many expressed an interest in or plans to begin evaluation. Another emerging theme included ten respondents who quantified success in terms of library adoption and ease of use. This included one respondent who had adopted LibGuides in light of prohibitive IT regulations ("We choose LibGuides because IT would not allow us to create class specific research webpages"). Several institutions also expressed frustration with the survey instrument because they were in the process of moving from one guides system to another and were not sure how to address many questions. Most responses indicated that there are more questions than answers regarding the efficacy of their research guides, though the general sentiment toward the idea of guides was positive, with words such as "positive," "easy," "like," and "love" appearing in 16 responses. Countering that, 5 respondents indicated that their libraries' research-guides projects had fallen through.
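The tallying step described above is simple enough to script. The following Python sketch illustrates how coded free-text responses might be counted; the codes and responses shown are invented for illustration and are not the authors' actual coding scheme or data.

```python
# Hypothetical sketch of the tallying step: count how often each coded
# evaluation method appears across free-text survey responses.
# The codes and responses below are invented for illustration only.
from collections import Counter

coded_responses = [
    ["usage statistics"],
    ["usage statistics", "usability testing"],
    ["no evaluation method"],
    ["library adoption / ease of use"],
]

tally = Counter(code for codes in coded_responses for code in codes)
for method, count in tally.most_common():
    print(f"{method}: {count} ({count / len(coded_responses):.1%} of responses)")
```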
CONCLUSION

This study confirms previous research that web-based research guides are a common offering, especially in academic libraries. Adding to this, we have quantified the adoption of LibGuides both through visiting ARL websites and through a survey distributed to library listservs. Further, this study did not find a consistent management or assessment practice for library research guides. Perhaps the most interesting finding from this study is the role of library systems departments with regard to research guides. It appears that many library systems departments are not actively involved in either the initiation or ongoing support of web-based research guides.

What are the implications for the library technology community, and what questions arise for future research? The apparent ascendancy of LibGuides over local solutions is certainly worth considering and in part demonstrates some comfort within libraries with cloud computing and SaaS. Time will tell how this might spread to other library systems. The popularity of LibGuides, at its heart a specialized content management system, also calls into question the vitality and adaptability of local content management system implementations in libraries. More generally, does the desire to professionally select and steward information for users on research guides indicate librarian misgivings about the usability of enterprise library systems? How do attitudes toward research guides differ between public services and technical services? Hopefully these questions serve as a call for continued technical engagement with library research guides. What shape that engagement may take in the future is an open question, but based on the prevalence and descriptions of current implementations, such consideration by the library technology community is worthwhile.

REFERENCES

1. Luigina Vileno, "From Paper to Electronic, the Evolution of Pathfinders: A Review of the Literature," Reference Services Review 35, no. 3 (2007): 434–51.

2. Martin Courtois, Martha Higgins, and Aditya Kapur, "Was this Guide Helpful? Users' Perceptions of Subject Guides," Reference Services Review 33, no. 2 (2005): 188–96.

3. Lateka J. Grays, Darcy Del Bosque, and Kristen Costello, "Building a Better M.I.C.E. Trap: Using Virtual Focus Groups to Assess Subject Guides for Distance Education Students," Journal of Library Administration 48, no. 3/4 (2008): 431–53.

4. Mira Foster et al., "Marketing Research Guides: An Online Experiment with LibGuides," Journal of Library Administration 50, no. 5/6 (July/September 2010): 602–16; Alisa C. Gonzalez and Theresa Westbrock, "Reaching Out with LibGuides: Establishing a Working Set of Best Practices," Journal of Library Administration 50, no. 5/6 (July/September 2010): 638–56.

5. Luigina Vileno, "Testing the Usability of Two Online Research Guides," Partnership: The Canadian Journal of Library and Information Practice and Research 5, no. 2 (2010), http://journal.lib.uoguelph.ca/index.php/perj/article/view/1235 (accessed August 8, 2011).

6. Angela Horne and Steve Adams, "Do the Outcomes Justify the Buzz? An Assessment of LibGuides at Cornell University and Princeton University—Presentation Transcript," presented at the Association of Academic and Research Libraries, Seattle, WA, 2009, http://www.slideshare.net/smadams/do-the-outcomes-justify-the-buzz-an-assessment-of-LibGuides-at-cornell-university-and-princeton-university (accessed August 8, 2011).
7. Sarah Morris and Marybeth Grimes, "A Great Deal of Time and Effort: An Overview of Creating and Maintaining Internet-based Subject Guides," Library Computing 18, no. 3 (1999): 213–16.

8. Mathew Miles and Scott Bergstrom, "Classification of Library Resources by Subject on the Library Website: Is There an Optimal Number of Subject Labels?" Information Technology & Libraries 28, no. 1 (March 2009): 16–20, http://www.ala.org/lita/ital/files/28/1/miles.pdf (accessed August 8, 2011).

9. Association of Research Libraries, "Association of Research Libraries: Member Libraries," http://www.arl.org/arl/membership/members.shtml (accessed October 24, 2011).

Appendix. Survey

Library Use of Web-based Research Guides

Please complete the survey below. We are researching libraries' use of web-based research guides. Please consider filling out the following survey, or forwarding this survey to the person in your library who would be in the best position to describe your library's research guides. Responses are anonymous. Thank you for your help!

Jimmy Ghaphery, VCU Libraries
Erin White, VCU Libraries

1) What is the name of your organization? __________________________________
Note that the name of your organization will only be used to make sure multiple responses from the same organization are not received. Any publication of results will not include specific names of organizations.

2) Which choice best describes your library?
o ARL
o University library
o College library
o Community college library
o Public library
o School library
o Private library
o Governmental library
o Nonprofit library

3) What type of system best describes your research guides by subject?
o LibGuides or CampusGuides
o Customized open source system
o Other commercial system
o Homegrown system
o Static HTML pages

4) Which statement best describes the selection of your current research guides system?
o Initiated by Administration
o Initiated by Systems
o Initiated by Public Services
o Initiated by an informal cross-departmental group
o More of a library-wide initiative

5) How much ongoing involvement does your Systems Department have with the management of your research guides?
o No ongoing involvement
o Some ongoing involvement
o Considerable ongoing involvement
o N/A we do not have a Systems Department

6) What other type of content do you host on your research guides system?
o Course pages
o "How to" instruction
o Alphabetical list of all databases
o "About the library" information (for example: hours, directions, staff directory, events)
o Digital collections
o Everything—we use the research guide platform as our website
o None of the above

7) Which statement best describes the relationship between your discovery/federated search tool and your research guides?
o We typically do not include our discovery tool on our guides
o Our discovery tool is one of many search options we promote on our guides
o We prominently feature our discovery tool on our guides
o N/A We do not have a discovery tool

8) Which statement best describes the relationship between your Web Content Management System and your research guides?
o Our content management system is independent of our research guides
o Our content management system is integrated with our research guides
o Our content management system is used for both our website and our research guides
o N/A we do not have a content management system

9) Which of the following procedures or policies do you have in place for your research guides?
o Defined scope of appropriate content
o Required elements such as contact information, chat, pictures, etc.
o Style guides for consistent presentation
o Allowing and/or moderating user tags, comments, ratings
o Training for guide creators
o Controlled vocabulary/tagging system for managing guides
o Maintenance and upkeep of guides
o Link checking
o Transfer of guides to another author due to separation or change in duties
o None of the above

10) How do you evaluate the success or failure of your research guides? [Free Text]


Editorial Board Thoughts: Tools of the Trade

Sharon Farnel

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012

As I was trying to settle on a possible topic for this, my second "Editorial Board Thoughts" piece, I was struggling to find something that I'd like to talk about and that ITAL readers would (I hope) find interesting. I had my "Eureka!" moment one day as I was coming out of a meeting, thinking about a conversation that had taken place around tools. Now, by tools, I'm referring not to hardware, but to those programs and applications that we can and do use to make our work easier.

The meeting was of our institutional repository team, and the tools discussion specifically focused on data cleanup and normalization, citation integration, and the like. I had just recently returned from a short conference where I had heard mentioned or seen demonstrated a few neat applications that I thought had potential. A colleague also had just returned from a different conference, excited by some of the things that he'd learned about. And all of the team members had, in recent days, seen various e-mail messages about new tools and applications that might be useful in our environment.

We mentioned and discussed briefly some of the tools that we planned to test. One of the tools had already been test driven by a couple of us, and looked promising; another seemed like it might solve several problems, and so was bumped up the testing priority list. During the course of the conversation, it became clear that each of us had a laundry list of tools that we wanted to explore at greater depth. And it also became clear that, as is so often the case, the challenge was finding the time to do so.

As we were talking, my head was full of images of an assembly line, widgets sliding by so quickly that you could hardly keep up. I started thinking how you could stand there forever, overwhelmed by the variety and number of things flying by at what seemed like warp speed. Alternatively, if you ever wanted to get anywhere, do anything, or be a part of it all, you just had to roll up your sleeves and grab something.
The meeting drew to a close, and we all left with a sense that we needed to find a way of tackling the tools-testing process, of sharing what we learn and what we know, all in the hope of finding a set of tools that we, as a team, could become skilled with. I personally felt a little disappointed at not having managed to get around to all of the tools I'd earmarked for further investigation. But I also felt invigorated at the thought of being able to share the load of testing and researching. If we could coordinate ourselves, we might be able to test drive even more tools, increasing the likelihood we'd stumble on the few that would be just right! We'd taken furtive steps towards this in the past, but nothing coordinated enough to make it really stick and be effective.

I started wondering how other individuals and institutions manage not only to keep up with all of the new and potentially relevant tools that appear at an ever-increasing pace, but more so how they manage to determine which they will become expert at and use going forward. (Although I was excited at what we were thinking of doing, I was quite sure that others were likely far ahead of us in this regard!) It made me realize that at some point I—and we—need to stop being bystanders to the assembly line, watching the endless parade of tools pass us by. We need to simply grab on to a tool and take it for a spin. If it works for what we need, we stick with it. If it doesn't, we put it back on the line, and grab a different one. But at some point we have to take a chance and give something a shot.

We've decided on a few methods we'll try for taking full advantage of the tool-rich environment in which libraries exist today. Our metadata team has set up a "test bench," a workstation that we can all use and share for trying new tools. A colleague is going to organize monthly brown-bag talks at which team members can demonstrate tools that they've been working with and that they think have potential uses in our work. And we're also thinking of starting an informal, and public, blog, where we can post, among other things, about new tools we've tried or are trying, what we're finding works and how, and what doesn't and why. We hope these and other initiatives will help us all stay abreast or even slightly ahead of new developments, be flexible in incorporating new tools into our workflows when it makes the most sense, and build skills and expertise that benefit us and that can be shared with others.

So, I ask you, our ITAL readers, how do you manage the assembly line of tools? How do you gather information on them, and when do you decide to take one off and give it a whirl? How do you decide when something is worth keeping, or when something isn't quite the right fit and gets placed back on the line? Why not let us know by posting on the ITALica blog (http://ital-ica.blogspot.com/)? Or, even better, why not write about your experience and submit it to ITAL? We're always on the lookout for interesting and instructional stories on the tools of our trade!

Sharon Farnel (sharon.farnel@ualberta.ca) is Metadata and Cataloguing Librarian, University of Alberta, Edmonton, Alberta, Canada.


Usability Test Results for a Discovery Tool in an Academic Library

Jody Condit Fagan, Meris Mandernach, Carl S. Nelson, Jonathan R. Paulo, and Grover Saunders

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012

ABSTRACT

Discovery tools are emerging in libraries.
These tools offer library patrons the ability to concurrently search the library catalog and journal articles. While vendors rush to provide feature-rich interfaces and access to as much content as possible, librarians wonder about the usefulness of these tools to library patrons. To learn about both the utility and usability of EBSCO Discovery Service, James Madison University (JMU) conducted a usability test with eight students and two faculty members. The test consisted of nine tasks focused on common patron requests or related to the utility of specific discovery tool features. Software recorded participants' actions and time on task, human observers judged the success of each task, and a post-survey questionnaire gathered qualitative feedback and comments from the participants. Participants were successful at most tasks, but specific usability problems suggested some interface changes for both EBSCO Discovery Service and JMU's customizations of the tool. The study also raised several questions for libraries above and beyond any specific discovery-tool interface, including the scope and purpose of a discovery tool versus other library systems, working with the large result sets made possible by discovery tools, and navigation between the tool and other library services and resources. This article will be of interest to those who are investigating discovery tools, selecting products, integrating discovery tools into a library web presence, or performing evaluations of similar systems.

Jody Condit Fagan (faganjc@jmu.edu) is Director, Scholarly Content Systems; Meris Mandernach (manderma@jmu.edu) is Collection Management Librarian; Carl S. Nelson (nelsoncs@jmu.edu) is Digital User Experience Specialist; Jonathan R. Paulo (paulojr@jmu.edu) is Education Librarian; and Grover Saunders (saundebn@jmu.edu) is Web Media Developer, Carrier Library, James Madison University, Harrisonburg, VA.

INTRODUCTION

Discovery tools appeared on the library scene shortly after the arrival of next-generation catalogs. The authors of this paper define discovery tools as web software that searches journal-article and library-catalog metadata in a unified index and presents search results in a single interface. This differs from federated search software, which searches multiple databases and aggregates the results. Examples of discovery tools include Serials Solutions Summon, EBSCO Discovery Service, Ex Libris Primo, and OCLC WorldCat Local; examples of federated search software include Serials Solutions WebFeat and EBSCO Integrated Search. With federated search software, results rely on the federated search algorithm and relevance ranking as well as each underlying tool's algorithms and relevance rankings. Discovery tools, which import metadata into one index, apply one set of search algorithms to retrieve and rank results. This difference is important because it contributes to a fundamentally different user experience in terms of speed, relevance, and ability to interact consistently with results.

Combining the library catalog, article indexes, and other source types in a unified interface is a big change for users because they no longer need to choose a specific search tool to begin their search.
Research has shown that such a choice has long been in conflict with users' expectations.1 Federated search software was unable to completely fulfill users' expectations because of its limited technology.2 Now that discovery tools provide a truly integrated search experience, with greatly improved relevance rankings, response times, and increased consistency, libraries can finally begin to meet this area of user expectation. However, discovery tools present new challenges for users: will they be able to differentiate between source types in the integrated results sets? Will they be able to limit large results sets effectively? Do they understand the scope of the tool and that other online resources exist outside the tool's boundaries?

The sea change brought by discovery tools also raises challenges for librarians, who have grown comfortable with the separation between the library catalog and other online databases. Discovery tools may mask important differences in disciplinary searching, and they do not currently offer discipline-specific strategies or limits. They also lack authority control, which makes topical precision a challenge. Their usual prominence on library websites may direct traffic away from carefully cultivated and organized collections of online resources. Discovery tools offer both opportunities and challenges for library instruction, depending on the academic discipline, users' knowledge, and information-seeking need.

James Madison University (JMU) is a predominantly undergraduate institution of approximately 18,000 students in Virginia. JMU has a strong information literacy program integrated into the curriculum through the university's Information Seeking Skills Test (ISST). The ISST is completed before students are able to register for third-semester courses. Additionally, the library provides an information literacy tutorial, "Go for the Gold," that supports the skills needed for the ISST.

JMU launched EBSCO Discovery Service (EDS) in August 2010 after participating as a beta development partner in spring and summer 2010. As with other discovery tools, the predominant feature of EDS is integration of the library catalog with article databases and other types of sources. At the time of this study, EDS had a few differentiating features. First, because of EBSCO's business as a database and journal provider, article metadata was drawn from a combination of journal-publisher information and abstracts and index records. The latter included robust subject indexing (e.g., the medical subject headings in CINAHL). The content searched by EDS varies by institution according to the institution's subscription. JMU had a large number of EBSCO databases and third-party database subscriptions through EBSCO, so the quantity of information searched by EDS at JMU is quite large. EDS also allowed for extensive customization of the tool, including header navigation links, results-screen layout, and the inclusion of widgets in the right-hand column of the results screen. JMU Libraries developed a custom "Quick Search" widget based on EDS for the library home page (see figure 1), which allows users to add limits to the discovery-tool search and assists with local authentication requirements.
Based on experience with a pilot test of the open-source VuFind next-generation catalog, JMU Libraries believed users would find the ability to limit up-front useful, so Quick Search's first drop-down menu contained keyword, title, and author field limits; the second drop-down contained limits for books, articles, scholarly articles, "Just LEO Library Catalog," and the library website (which did not use EDS). The "Just LEO Library Catalog" option limited the user's search to the library catalog database records but used the EDS interface to perform the search. To access the native catalog interface, a link to LEO Library Catalog was included immediately above the search box as well as in the library website header.

Figure 1. Quick Search Widget on JMU Library Homepage

Evaluation was included as part of the implementation process for the discovery tool, and therefore a usability test was conducted in October 2010. The purpose of the study was to explore how patrons used the discovery tool, to uncover any usability issues with the chosen system, and to investigate user satisfaction. Specific tasks addressed the use of facets within the discovery tool, patrons' use of date limiters, and the usability of the Quick Search widget. The usability test also had tasks in which users were asked to locate books and articles using only the discovery tool, then repeat the task using anything but the discovery tool. This article interprets the usability study's results in the context of other local usability tests and web-usage data from the first semester of use. Some findings were used to implement changes to Quick Search and the library website, and to recommend changes to EBSCO; however, other findings suggested general questions related to discovery tool software that libraries will need to investigate further.

LITERATURE REVIEW

Literature reviewed for this article included some background reading on users and library catalogs, library responses to users' expectations, usability studies in libraries, and usability studies of discovery tools specifically. The first group of articles comprised a discussion about the limitations of traditional library catalogs. The strengths and weaknesses of library catalogs were reported in several academic libraries' usability studies.3 Calhoun recognized that library users' preference for Google caused a decline in the use and value of library catalogs, and encouraged library leaders to "establish the catalog within the framework of online information discovery systems."4 This awareness of changes in user expectations during a time when Google set the benchmark for search simplicity was echoed by numerous authors who recognized the limits of library catalogs and expressed a need for the catalog to be greatly modernized to keep pace with the evolution of the web.5

Libraries have responded in several ways to the call for modernization, most notably through investigations related to federated searching and next-generation catalogs.
Several articles have presented usability study results for various federated searching products.6 Fagan provided a thorough literature review of faceted browsing and next-generation catalogs.7 Western Michigan University presented usability study results for the next-generation catalog VuFind, revealing that participants took advantage of the simple search box but did not use the next-generation catalog features of tagging, comments, favorites, and SMS texting.8 The University of Minnesota conducted two usability studies of Primo and reported that participants were satisfied with using Primo to find known print items, limit by author and date, and find a journal title.9 Tod Olson conducted a study with graduate students and faculty using the AquaBrowser interface, and his participants located sources for their research they had not previously been able to find.10

The literature also revealed both opportunities and limitations of federated searching and next-generation catalogs. Allison presented statistics from Google Analytics for an implementation of Encore at the University of Nebraska-Lincoln.11 The usage statistics revealed an increased use of article databases as well as an increased use of narrowing facets such as format and media type, and library location. Allison concluded that Encore increased users' exposure to the entire collection. Breeding concluded that federated searching had various limitations, especially search speed and interface design, and was thus unable to compete with Google Scholar.12 Usability studies of next-generation catalogs revealed a lack of features necessary to fully incorporate an entire library's collection. Breeding also recognized the limitations of next-generation library catalogs and saw discovery tools as their next step in evolution: "It's all about helping users discover library content in all formats, regardless of whether it resides within the physical library or among its collections of electronic content, spanning both locally owned materials and those accessed remotely through subscriptions."13

The dominant literature related to discovery tools discussed features,14 reviewed them from a library selector perspective,15 summarized academic libraries' decisions following selection,16 presented questions related to evaluation after selection,17 and offered a thorough evaluation of common features.18 Allison concluded that "usability testing will help clarify what aspects need improvement, what additions will make [the interface] more useful, and how the interface can be made so intuitive that user training is not needed."19 Breeding noted "it will only be through the experience of library users that these products will either prove themselves or not."20

Libraries have been adapting techniques from the field of usability testing for over a decade to learn more about user behavior, usability, and user satisfaction with library websites and systems.21 Rubin and Chisnell and Dumas and Redish provided authoritative overviews of the benefits and best practices of usability testing.22 In addition, Campbell and Norlin and Winters offered specific usability methodologies for libraries.23

WorldCat Local has dominated usability studies of discovery tools published to date. Ward, Shadle, and Mofield conducted a usability study at the University of Washington.24
Although the second round of testing was not published, the first round involved seven undergraduate and three graduate students; its purpose "was to determine how successful UW students would be in using WorldCat Local to discover and obtain books and journal articles (in both print and electronic form) from the UW collection, from the Summit consortium, and from other WorldCat libraries."25 Although participants were successful at completing these tasks, a few issues arose out of the usability study. Users had difficulty with the brief item display because reviews were listed higher than the actual items. The detailed item display also hindered users' ability to distinguish between various editions and formats. The second round of usability testing, not yet published, included tasks related to finding materials on specific subject areas.

Boock, Chadwell, and Reese conducted a usability study of WorldCat Local at Oregon State University.26 The study included four tasks and five evaluative questions. Forty undergraduate students, sixteen graduate students, twenty-four library employees, four instructors, and eighteen faculty members took part in the study. They summarized that users found known-title searching to be easier in the library catalog but found topical searches to be more effective in WorldCat Local. The participants preferred WorldCat Local for the ability to find articles and search for materials in other institutions.

Western Washington University also conducted a usability study of WorldCat Local. They selected twenty-four participants with a wide range of academic experience to conduct twenty tasks in both WorldCat Local and the traditional library catalog.27 The comparison revealed several problems in using WorldCat Local, including users' inability to determine the scope of the content, confusion over the intermixing of formats, problems with the display of facet options, and difficulty with known-item searches. Western Washington University decided not to implement WorldCat Local.

OCLC published a thorough summary of several usability studies conducted mostly with academic libraries piloting the tool, including the University of Washington; the University of California (Berkeley, Davis, and Irvine campuses); Ohio State University; the Peninsula Library System in San Mateo, California; and the Free Library of Urbana and the Des Plaines Public Library, both in Illinois.28 The report conveyed favorable user interest in searching local, group, and global collections together. Users also appreciated the ability to search articles and books together. The authors commented, "however, most academic participants in one test (nine of fourteen) wrongly assumed that journal article coverage includes all the licensed content available at their campuses."29 OCLC used the testing results to improve the order of search results, provide clarity about various editions, improve facets for narrowing a search, provide links to electronic resources, and increase visibility of search terms.
At Grand Valley State University, Doug Way conducted an analysis of usage statistics after implementing the discovery tool Summon in 2009; the usage statistics revealed an increased use of full-text downloads and link resolver software but a decrease in the use of core subject databases.30 The usage statistics showed promising results, but Way recommended further studies of usage statistics over a longer period of time to better understand how discovery tools affect entire library collections. North Carolina State University Libraries released a final report about their usability study of Summon.31 The results of these usability studies were similar to other studies of discovery tools: users were satisfied with the ability to search the library catalog and article databases with a single search, but users had mixed results with known-item searching and confusion about narrowing facets and results ranking. Although several additional academic libraries have conducted usability studies of Encore, Summon, and EBSCO Discovery Service, the results have not yet been published.32

Only one usability study of EBSCO Discovery Service was found. In a study with six participants, Williams and Foster found users were satisfied and able to adapt to the new system quickly but did not take full advantage of the rich feature set.33

Combined with the rapid changes in these tools, the literature illustrates a current need for more usability studies related to discovery tools. The necessary focus on specific software implementations and different study designs makes it difficult to identify common themes. Additional usability studies will offer greater breadth and depth to the current dialogue about discovery tools. This article will help fill the gap by presenting results from a usability study of EBSCO Discovery Service. Publishing such usability results of discovery tools will inform institutional decisions, improve user experiences, and advance the tools' content, features, and interface design. In addition, libraries will be able to more thoroughly modernize library catalogs to meet users' changing needs and expectations as well as keep pace with the evolution of the web.

METHOD

James Madison University Libraries' usability lab features one workstation with two pieces of usability software: Techsmith's Morae (version 3) (http://www.techsmith.com/morae.asp), which records screen captures of participant actions during the usability studies, and the Usability Testing Environment (UTE) (version 3), which presents participants with tasks in a web-browser environment. The UTE also presents end-of-task questions to measure time on task and task success.

The study of EDS, conducted in October 2010, was covered by an institutional review board–approved protocol. Participants were recruited for the study through a bulk email sent to all students and faculty. Interested respondents were randomly selected to include a variety of grade levels and majors for students and years of service and disciplines taught for faculty members. The study included ten participants with a range of experience levels: two freshmen, two sophomores, two juniors, one senior, one graduate student, and two faculty members. Three of the participants were from the school of business, one from education, two from the arts and humanities, and two from the sciences. The remaining two participants had dual majors in the humanities and the sciences.
A usability rule of thumb is that at least five users will reveal more than 75 percent of usability issues.34 Because the goal was to observe a wide range of user behaviors and usability issues, and to gather data about satisfaction from a variety of perspectives, this study used two users of each grade level plus two faculty participants (for a total of ten) to provide as much heterogeneity as possible.

Student participants were presented with ten pre-study questions, and faculty participants were asked nine pre-study questions (see appendix A). The pre-study questions were intended to gather information about participants' background, including their time at JMU, their academic discipline, and their experience with the library website, the EBSCOhost interface, the library catalog, and library instruction. Since participants were anonymous, we hoped their answers would help us interpret unusual comments or findings. Pre-test results were not used to form comparison groups (e.g., freshmen versus seniors) because these groups would not be representative of their larger populations. These questions were followed by a practice task to help familiarize participants with the testing software.

The study consisted of nine tasks designed to showcase usability issues, show the researchers how users behaved in the system, and measure user satisfaction. Appendix B lists the tasks and what they were intended to measure. In designing the test, determining success on some tasks seemed very objective (find a video about a given topic) while others appeared to be more subjective (those involving relevance judgments). For this reason, we asked participants to provide satisfaction information on some tasks and not others. In retrospect, for consistency of interpretation, we probably should have asked participants to rate or comment on every task. All of the tasks were presented in the same order. Tasks were completed either by clicking "Answer" and answering a question (multiple choice or typed response), or by clicking "Finished" after navigating to a particular webpage. Participants also had the option to skip the task they were working on and move to the next task. Allowing participants to skip a task helps differentiate between genuinely incorrect answers and incorrect answers due to participant frustration or guessing. A time limit of 5 minutes was set for tasks 1–7, while tasks 8 and 9 were given time limits of 8 minutes, after which the participant was timed out. Time limits were used to ensure participants were able to complete all tasks within the agreed-upon session. Average time on task across all tasks was 1 minute, 35 seconds.

After the study was completed, participants were presented with the System Usability Scale (SUS), a ten-item scale using statements of subjective assessment and covering a variety of aspects of system usability.35 SUS scores, which provide a numerical score out of 100, are affected by the complexity of both the system and the tasks users may have performed before taking the SUS. The SUS was followed by a post-test consisting of six open-ended questions, plus one additional question for faculty participants, intended to gather more qualitative feedback about user satisfaction with the system (see appendix A).
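The 0–100 SUS score comes from a standard conversion that the article does not spell out: odd-numbered items contribute the response minus one, even-numbered items contribute five minus the response, and the sum is multiplied by 2.5. A minimal Python sketch of that conversion follows; the responses in the example are invented, not data from this study.

```python
# Standard SUS scoring; the responses below are invented for illustration,
# not data from this study.
def sus_score(responses):
    """responses: ten Likert ratings (1-5), for SUS items 1 through 10."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5  # scale the 0-40 raw total to 0-100

print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # -> 85.0
```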
First, on seven of the ninety tasks, the UTE failed to enforce the five-minute maximum time limit, and participants exceeding a task’s time limit were allowed to continue the task until they completed or skipped the task. One participant exceeded the time limit on task 1 while three of these errors occurred during both tasks 8 and 9. This problem potentially limits the ability to compare the average time on task across tasks; however, since this study used time on task in a descriptive rather than comparative way, the impact on interpreting results is minimal. The seven instances in which the glitch occurred were included in the average time on task data found in figure 3 because the times INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 91 were not extreme and the time limit had been imposed mostly to be sure participants had time to complete all the tasks. A second problem with the UTE was that it randomly and prematurely aborted some users’ tasks; when this happened, participants were informed that their time had run out and were then moved on to the next task. This problem is more serious because it is unknown how much more time or effort the participant would have spent on the task or whether they would have been more successful. Because of this, the results below specify how many participants were affected for each task. Although this was unfortunate, the results of the participants who did not experience this problem still provide useful cases of user behavior, especially because this study does not attempt to generalize observed behavior or usability issues to the larger population. Although a participant mentioned a few technical glitches during testing to the facilitator, the extent of software errors was not discovered until after the tests were complete (and the semester was over) because the facilitator did not directly observe participants during sessions. RESULTS The participants were asked several pre–test questions to learn about their research habits. All but one participant indicated they used the library website no more than six times per month (see figure 2). Common tasks this study’s student participants said they performed on the website were searching for books and articles, searching for music scores, “research using databases,” and checking library hours. The two faculty participants mentioned book and database searches, electronic journal access, and interlibrary loan. Participants were shown the Quick Search widget and were asked “how much of the library’s resources do you think the Quick Search will search?” Seven participants said “most”; only one person, a faculty member, said it would search “all” the library’s resources. Figure 2. Monthly Visits to Library Website < 1 visit (2) 1 - 3 visits (4) 4 - 6 visits (3) > 7 visits (1) USABILITY TEST RESULTS FOR A DISCOVERY TOOL IN AN ACADEMIC LIBRARY | FAGAN ET AL 92 When shown screenshots of the library catalog and an EBSCOhost database, seven participants were sure they had used LEO Library Catalog, and three were not sure. Three indicated that they had used an EBSCO database before, five had not, and two were not su re. Participants were also asked how often they had used library resources for assignments in their major field of study; four said “often,” two said “sometimes,” one “rarely/never,” and one “very often.” Students were also asked “has a librarian spoken to a class you’ve attended about library research?” and two said yes, five said no, and one was not sure. 
A “practice task” was administered to ensure participants were comfortable with the workstation and software: “Use Quick Search to search a topic relating to your major/discipline or another topic of interest to you. If you were writing a paper on this topic how satisfied would you be with these results?” No one selected “no opinion” or “very unsatisfied”; sixty percent were “very satisfied” or “satisfied” with their results, and forty percent were “somewhat unsatisfied.” Figure 3 shows the time spent on each task, while figure 4 describes participants’ success on the tasks.

                                           Task 1  Task 2  Task 3  Task 4  Task 5  Task 6  Task 7  Task 8  Task 9
No. of responses (not including timeouts)    10      9       5       7       9      10      10       8      10
Avg. time on task (in seconds)              175*    123     116      97      34     120      92     252*    255*
Standard deviation                           212     43      50      49      26      36      51     177     174
*Includes time(s) in excess of the set time limit; excess time allowed by software error.
Figure 3. Average Time Spent on Tasks (an accompanying bar chart, “Average Time for All Tasks (not including timeouts),” plotted the same averages in seconds)

The first task (“What was the last thing you searched for when doing a research assignment for class? Use Quick Search to re-search for this.”) started participants on the library homepage. Participants were then asked to “Tell us how this compared to your previous experience” using a text box. The average time on task was almost three minutes; however, one faculty participant took more than 12 minutes on this task; if his or her time was removed, the time on task average was 1 minute, 23 seconds. Figure 5 shows the participants’ search terms and their comments.

How success was determined: task 1, users were only asked to provide feedback; task 2, valid typed-in response provided; task 3, number of subtasks completed (out of 3); task 4, number of subtasks completed (out of 2); task 5, correct multiple-choice answer; task 6, number of subtasks completed (out of 2); task 7, task ended at correct web location; tasks 8 and 9, number of subtasks completed (out of 4).
      Task 1  Task 2    Task 3  Task 4  Task 5     Task 6  Task 7   Task 8    Task 9
P01   N/A     Correct   3       2       TIMEOUT    2       Correct  0*        0**
P02   N/A     Correct   3*      1       Correct    2       Correct  0**       3
P03   N/A     Correct   0*      1       Incorrect  2       Correct  4         3
P04   N/A     Correct   2       0*      Correct    2       SKIP     3         2
P05   N/A     Correct*  2       2       Correct    1       Correct  4         2
P06   N/A     Correct   3*      1       Correct    1       Correct  3         0**
P07   N/A     Correct   2       1*      Correct    1       Correct  0         2
P08   N/A     Correct   2       0*      Correct    0       SKIP     TIMEOUT   0**
P09   N/A     Correct   2*      SKIP    Correct    2       Correct  4         2
P10   N/A     Correct   1*      1       Correct    2       SKIP     4         2
Note: “TIMEOUT” indicates an immediate timeout error; users were unable to take any action on the task. *User experienced a timeout error while working on the task, which may have affected their ability to complete the task. **User did not follow directions.
Figure 4. Participants’ Success on Tasks

P01, Faculty, Geology. Search terms: large low shear wave velocity province. Comments: “Ebsco did a fairly complete job. There were some irrelevant results that I don’t remember seeing when I used GeoRef.”
P02, Faculty, Computer Information Systems & Management Science (statistics). Search terms: student cheating. Comments: “This is a topic that I am somewhat familiar with the related literature. I was pleased with the diversity of journals that were found in the search. The topics of the articles was right on target. The recency of the articles was great. This is a topic for which I am somewhat familiar with the related literature. I was impressed with the search results regarding: diversity of journals; recency of articles; just the topic in articles I was looking for.”
P03, Graduate Student, Education. Search terms: Death of a Salesman. Comments: “There is a lot of variety in the types of sources that Quick Search is pulling up now. I would still have liked to see more critical sources on the play but I could probably have found more results of that nature with a better search term, such as ‘death of a salesman criticism.’”
P04, 1st year, Voice Performance. Search terms: current issues in Russia. Comments: “It was somewhat helpful in the way that it gave me information about what had happened in the past couple months, but not what was happening now in russia.”
P05, 3rd year, Nursing. Search terms: uninsured and health care reform. Comments: “The quick search gave very detailed articles I thought, which could be good, but were not exactly what I was looking for. Then again, I didn’t read all these articles either.”
P06, 1st year, History. Search terms: headscarf law. Comments: “This search yielded more results related to my topic. I needed other sources for an argument on the French creating law banning religious dress and symbols in school. Using other methods with the same keyword, I had an enormous amount of trouble finding articles that pertained to my essay.”
P07, 3rd year, English. Search terms: Jung. Comments: “I like the fact that it can be so defined to help me get exactly what I need.”
P08, 4th year, Spanish. Search terms: restaurant industry. Comments: “This is about the same as the last time that I researched this topic.”
P09, 2nd year, Hospitality. Search terms: aphasia. Comments: “There are many good sources, however there are also completely irrelevant sources.”
P10, 2nd year, Management. Search terms: Rogers five types of feedback. Comments: “There is not many documents on the topic I searched for. This may be because the topic is not popular or my search is not specific/too specific.”
Figure 5. Participants’ Search Terms and Comments

The second task started on the library homepage and asked participants to find a video related to early childhood cognitive development. This task was chosen because JMU Libraries have significant video collections and because the research team hypothesized users might have trouble, as there was no explicit way to limit to videos at the time. The average time on this task was two minutes, with one person experiencing an arbitrary timeout by the software. Participants were judged to be successful on this task by the researchers if they found any video related to the topic. All participants were successful on this task, but four entered and then left the discovery tool interface to complete the task. Five participants looked for a video search option in the drop-down menu, and of these, three immediately used something other than Quick Search when they saw that there was no video search option. Of those who tried Quick Search, six opened the source type facet in EDS search results and four selected a source type limit, but only two selected a source type that led directly to success (“non-print resources”). Task 3 started participants in EDS (see figure 6) and asked them to search on speech pathology, find a way to limit search results to audiology, and limit their search results to peer-reviewed sources.
Participants spent an average of 1 minute, 40 seconds on this task, with five participants being artificially timed out by the software. Participants’ success on this task was determined by the researchers’ examination of the number of subtasks they completed. The three subtasks consisted of successfully searching for the given topic (speech language pathology), limiting the search results to audiology, and further limiting the results to peer-reviewed sources. Four participants were able to complete all three subtasks, including two who were timed out. (The times for those who were timed out were not included in time on task averages, but they were given credit for success.) Five completed just two of the subtasks, failing to limit to peer-reviewed sources; one of these failed because of a timeout. It was unclear why the remaining participants did not attempt to alter the search results to “peer reviewed.” Looking at the performed actions, six of the ten typed “AND audiology” into search keywords to narrow the search results, while one found and used “audiology” in the Subject facet on the search results page. Six participants found and used the “Scholarly (Peer Reviewed) Journals” checkbox limiter.
Figure 6. EBSCO Discovery Service Interface
Beginning with the results they had from task 3, task 4 asked participants to find more recent sources and to select the most recent source available. Task success was measured by correct completion of two subtasks: limiting the search results to the last five years and finding the most recent source. The average time on task was 1 minute, 14 seconds, with three artificial timeouts. Of those who did not time out, all seven were able to limit their sources to be more recent in some way, but only three were able to select the most recent source. In addition to this being a common research task, the team was interested to see how users accomplished it. Three typed in the limiter in the left-hand column, two typed in the limiter on the advanced search screen, and two used the date slider. Two participants used the “sort” drop-down menu to change the sort order to “Date Descending,” which helped them complete this task. Other participants changed the dates and then selected the first result, which was not the most recent. Task 5, which started within EDS, asked participants to find a way to ask a JMU librarian for help. The success of this task was measured by whether they reached the correct URL for the Ask-a-Librarian page; eight of the ten participants were successful. This task took an average of only 31 seconds to complete, and eight of the ten used the Ask-a-Librarian link at the top of the page. Of the two unsuccessful participants, one was timed out, while another clicked “search modes” for no apparent reason, then clicked back and decided to finish the task. Task 6 started in the EDS interface and asked participants to locate the journal Yachting and Boating World and select the correct coverage dates and online status from a list of four options; participants were deemed successful at two subtasks if they selected the correct option and successful at one subtask if they chose an option that was partially correct. Participants took an average of two minutes on this task; only five answered correctly.
During this task, three participants used the EBSCO search option “SO Journal Title/Source,” four used quotation marks, and four searched or re-searched with the “Title” drop-down menu option. Three chose the correct dates of coverage but were unable to correctly identify the online availability. It is important to note that only searching and locating the journal title were accomplished with the discovery tool; to see dates of coverage and online availability, users clicked JMU’s link resolver button, and the resulting screen was served from Serials Solutions’ Article Linker product. Although some users spent more time than perhaps was necessary using the EDS search options to locate the journal, the real barriers to this task were encountered when trying to interpret the Serials Solutions screen. Task 7, where participants started in EDS, was designed to determine whether users could navigate to a research database outside of EDS. Users were asked to look up the sculpture Genius of Mirth and were told the library database Camio would be the best place to search. They were instructed to “locate this database and find the sculpture.” The researcher observed the recordings to determine success on this task, which was defined as using Camio to find the sculpture. Participants took an average of 1 minute, 32 seconds on this task; seven were observed to complete the task successfully, while three chose to skip the task. To accomplish this task, seven participants used the JMU Research Databases link in the header navigation at some point, but only four began the task by doing this. Six participants began by searching within EDS. The final two tasks started on the library homepage and were a pair: participants were asked to find two books and two recent, peer-reviewed articles (from the last five years) on rheumatoid arthritis. Task 8 asked them to use the library’s EDS widget, Quick Search, to accomplish this, and task 9 asked them to accomplish the same task without using Quick Search. When they found sources, they were asked to enter the four relevant titles in a text-entry box. The average time spent on these tasks was similar: about four minutes per task. Comparing these tasks was somewhat confusing because some participants did not follow instructions. User success was determined by the researchers’ observation of how many of the four subtasks the user was able to complete successfully: find two books, find two articles, limit to peer reviewed, and select articles from the last five years (with or without using a limiter); figure 4 shows their success. Looking at the seven users who used Quick Search on the Quick Search task, six limited to “Scholarly (Peer Reviewed) Journals”; six limited to the last five years; and seven narrowed results using the source type facet. The average number of subtasks completed on task 8 was 3.14 out of 4. Looking at the seven users who followed instructions and did not use Quick Search on task 9, all began with the library catalog and tried to locate articles within the library catalog. The average number of subtasks completed on task 9 was 2.29 out of 4 (both averages can be recomputed from the figure 4 data, as sketched below). Some users tried to locate articles by setting the catalog’s material type drop-down menu to “Periodicals” and others used the catalog’s “Periodical” tab, which performed a title keyword search of the e-journal portal. For task 9, only two users eventually chose a research database to find articles.
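The two subtask averages just reported can be recomputed from the figure 4 data. The short Python sketch below does so; which participants to exclude (immediate timeouts and users marked as not following directions) is an inference from the table notes rather than a method the authors state explicitly.

# Hedged sketch: recompute the task 8 and task 9 subtask averages from figure 4.
# None marks participants excluded from the averages (immediate timeouts and users
# who did not follow directions); this exclusion rule is an assumption inferred
# from the table notes, not something stated by the authors.

task8_subtasks = {"P01": None, "P02": None, "P03": 4, "P04": 3, "P05": 4,
                  "P06": 3, "P07": 0, "P08": None, "P09": 4, "P10": 4}
task9_subtasks = {"P01": None, "P02": 3, "P03": 3, "P04": 2, "P05": 2,
                  "P06": None, "P07": 2, "P08": None, "P09": 2, "P10": 2}

def average_subtasks(scores):
    # Average only the participants who have a numeric subtask count.
    counted = [s for s in scores.values() if s is not None]
    return sum(counted) / len(counted)

print(f"Task 8: {average_subtasks(task8_subtasks):.2f} of 4 subtasks")  # 3.14
print(f"Task 9: {average_subtasks(task9_subtasks):.2f} of 4 subtasks")  # 2.29

Run as written, the sketch reproduces the 3.14 and 2.29 figures reported in the text, which suggests this reading of the table is at least consistent with the authors' calculation.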
User behavior can only be compared for the six users (all students) who followed instructions on both tasks; a summary is provided in figure 4. After completing all nine tasks, participants were presented with the System Usability Scale. EDS scored 56 out of 100. Following the SUS, participants were asked a series of post–test questions. Only one of the faculty members chose to answer the post–test questions. When asked how they would use Quick Search, all eight students explicitly mentioned class assignments, and the participating faculty member replied “to search for books.” Two students mentioned books specifically, while the rest used the more generic term “sources” to describe items for which they would search. When asked “when would you not use this search tool?” the faculty member said “I would just have to get used to using it. I mainly go to [the library catalog] and then research databases.” Responses from the six students who answered this question were vague and hard to categorize: • “Not really sure for more general question/learning” • “When just browsing” • “For quick answers” • “If I could look up the information on the internet” • “When the material I need is broad” • “Basic searching when you do not need to say where you got the info from” When asked for the advantages of Quick Search, four specifically mentioned the ability to narrow results, three respondents mentioned “speed,” three mentioned ease of use, and three mentioned relevance in some way (e.g., “it does a pretty good job associating keywords with sources”). Two mentioned the broad coverage and one compared it to Google, “which is what students are looking for.” When asked to list disadvantages, the faculty member mentioned he/she was not sure what part of the library home page was actually “Quick Search,” and was not sure how to get to his/her library account. Three students talked about Quick Search being “overwhelming” or “confusing” because of the many features, although one of these also stated, “like anything you need to learn in order to use it efficiently.” One student mentioned the lack of an audio recording limit and another said “when the search results come up it is hard to tell if they are usable results.” INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 99 Knowing that Quick Search may not always provide the best results, the research team also asked users what they would do if they were unable to find an item using Quick Search. A faculty participant said he or she would log into the library catalog and start from there. Five students mentioned consulting a library staff member in some fashion. Three mentioned moving on from library resources, although not necessarily as their first step. One said “find out more information on it to help narrow down my search.” Only one student mentioned the library catalog or any other specific library resource. When participants were asked if “Quick Search” was an appropriate name, seven agreed that it was. Of those who did not agree, one participant’s comment was “not really, though I don’t think it matters.” And another’s was “I think it represents the idea of the search, but not the action. It could be quicker.” The only alternative name suggestion was “Search Tool.” Web Traffic Analysis Web traffic through Quick Search and in EDS provides additional context for this study’s results. During August–December 2010, Quick Search was searched 81,841 times from the library homepage. 
This is an increase from traffic into the previous widget in this location, which searched the catalog and received 41,740 searches during the same period in 2009. Even adjusting for an approximately 22 percent increase in website traffic from 2009 to 2010, this is an increase of 75 percent. Interestingly, the traffic to the most popular link on the library homepage, Research Databases, went from 55,891 in 2009 to 30,616 in 2010, a decrease of 55 percent when adjusting for the change in website traffic. During fall 2010, 28 percent of Quick Search searches from the homepage were executed using at least one drop-down menu. Twelve percent changed Quick Search’s first drop-down menu to something other than the keyword default, with “title” being the most popular option (7 percent of searches), followed by author (4 percent of searches). Twenty percent of users changed the second drop-down option; “Just Articles” and “Just Books” were the most popular options, garnering 7 percent and 6 percent of searches, respectively, followed by “Just Scholarly Articles,” which accounted for 4 percent of searches. Looking at EBSCO’s statistical reports for JMU’s implementation of EDS, there were 85,835 sessions and approximately 195,400 searches during August–December 2010. This means about 95 percent of EDS sessions were launched using Quick Search from the homepage. There was an average of 2.3 searches per session, which is comparable to past behavior in JMU’s other EBSCOhost databases. DISCUSSION The goal of this study was to gather initial data about user behavior, usability issues, and user satisfaction with discovery tools. The task design and technical limitations of the study mean that comparing time on task between participants or tasks would not be particularly illuminating; and, while the success rates on tasks are interesting, they are not generalizable to the larger JMU population. Instead, this study provided observations of user behavior that librarians can use to improve services, it suggested some “quick fixes” to usability issues, and it pointed to several research questions. When possible, these observations are supplemented by comparisons between this study and the only other published usability study of EDS.36 This study confirmed a previous finding of user studies of federated search software and discovery tools: students have trouble determining what is searched by various systems.37 On the tasks in which they were asked to not use Quick Search to find articles, participants tried to search for articles in the library catalog. Although all but one of this study’s participants correctly answered that Quick Search did not search “all” library resources, seven thought it searched “most.” Either “most” or “some” would be considered correct; however, it is interesting that answering this question more specifically is challenging even for librarians. Many journals in subject article indexes and abstracts are included in the EDS Foundation Index; furthermore, JMU’s implementation of EDS includes all of JMU’s EBSCO subscription resources as well, making it impractical to assemble a master list of indexed titles. Of course, there are numerous online resources with contents that may never be included in a discovery tool, such as political voting records, ethnographic files, and financial data. Users often have access to these resources through their library.
However, if they do not know the library has a database of financial data, they will certainly not consider this content in their response to a question about how many of the library’s resources are included in the discovery tool. As discovery tools begin to fulfill users’ expectations for a “single search,” libraries will need to share best practices for showcasing valuable, useful collections that fall outside the discovery tool’s scope or abilities. This is especially critical when reviewing the 72 percent increase in homepage traffic to the homepage search widget compared with the 55 percent decrease in homepage traffic to the research databases page (the adjustment arithmetic behind these figures is sketched in the example below). It is important to note these trends do not mean the library’s other research databases have fallen in usage by 55 percent. Though there was not a comprehensive examination of usage statistics, spot-checking suggested EBSCO and non-EBSCO subject databases had both increases and decreases in usage from previous years. Another issue libraries should consider, especially when preparing for instruction classes, is that users do not seem to understand which information needs are suited to a discovery tool versus the catalog or subject-specific databases. Several tasks provided additional information about users’ mental models of the tool, which may help libraries make better decisions about navigation customizations in discovery tool interfaces and on library websites. Task 7 was designed to discover whether users could find their way to a database outside of EDS if they knew they needed to use a specific database. Six participants, including one of the faculty members, began by searching EDS for the name of the sculpture and/or the database name. On task 1, a graduate student who searched on “Death of a Salesman” and was asked to comment on how Quick Search results compared to his or her previous experience said, “I would still have liked to see more critical sources on the play but I could probably have found more results of that nature with a better search term, such as ‘death of a salesman criticism.’” While true, most librarians would suggest using a literary criticism database, which would target this information need. Librarians may have differing opinions regarding the best research starting point, but their rationale would be much different from that of the students in this study. This study’s participants said they would use Quick Search/EDS when they were doing class work or research, but would not use it for general inquiries. If librarians were to list which user information needs are best met by a discovery tool versus a subject-specific database, the types of information needs listed would be much more numerous and diverse, regardless of differences over how to classify them. In addition to helping users choose between a discovery tool or a subject-specific database, libraries will need to conceptualize how users will move in and out of the discovery tool to other library resources, services, and user accounts. While users had no trouble finding the Ask-a-Librarian link in the header, it might have been more informative if users started from a search-results page to see whether they would find the right-hand column’s Ask-a-Librarian link or links to library subject guides and database lists. Discovery tools vary in their abilities to connect users with their online library accounts and are changing quickly in this area.
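For readers who want to see the traffic-adjustment arithmetic referenced above spelled out, the following Python sketch reproduces two of the reported figures from the raw counts given in the web traffic analysis (the adjusted decline for the Research Databases link and the searches-per-session average). Treating the approximately 22 percent site-wide growth as a simple scaling of the 2009 baseline is an assumption about the method, not something the article states explicitly.

# Hedged sketch: one plausible reading of the traffic adjustment described above.
# Raw counts are taken from the article; the adjustment method is an assumption.

SITE_TRAFFIC_GROWTH = 0.22  # approximate overall website traffic increase, 2009 to 2010

def adjusted_change(count_2009, count_2010, growth=SITE_TRAFFIC_GROWTH):
    """Percent change in 2010 relative to a 2009 baseline scaled by overall site growth."""
    expected_2010 = count_2009 * (1 + growth)
    return (count_2010 - expected_2010) / expected_2010 * 100

# Research Databases link: 55,891 clicks (2009) versus 30,616 (2010)
print(f"Research Databases, adjusted change: {adjusted_change(55891, 30616):.0f}%")  # about -55%

# EDS usage, August-December 2010
searches, sessions = 195400, 85835
print(f"Searches per EDS session: {searches / sessions:.1f}")  # about 2.3

# Share of EDS sessions attributable to the homepage widget,
# computed as the article does (homepage Quick Search searches / EDS sessions)
quick_search_searches = 81841
print(f"Sessions launched via Quick Search: {quick_search_searches / sessions:.0%}")  # about 95%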
This study also provided some interesting observations about discovery tool interfaces. The default setting for EBSCO Discovery Service is a single search box. However, this study suggests that while users desire a single search, they are willing to use multiple interface options. This was supported by log analysis of the library’s locally developed entry widget, Quick Search, in which 28 percent of searches included the use of a drop-down menu. On the first usability task, users left Quick Search’s options set to the default. On other tasks, participants frequently used the drop-down menus and limiters in both Quick Search and EDS. For example, on task 2, which asked them to look for videos, five users looked in the Quick Search format drop-down menu. On the same task within EDS, six users attempted to use the source type facet. Use of limiters was similarly observed by Williams and Foster in their EDS usability study.38 One EDS interface option that was not obvious to participants was the link to change the sort order. When asked to find the most recent article, only two participants changed the sort option. Most others used the date input boxes to limit their search, then selected the first result even though it was not the most recent one. It is unclear whether these participants assumed the first result was the most recent or whether they could not figure out how to display the most recent sources. Finding a journal title from library homepages has long been a difficult task,39 and this study provided no exception, even with the addition of a discovery tool. It is important to note that the standard EDS implementation would include a “Publications” or “Journals A–Z” link in the header; in EDS, libraries can customize the text of this link. JMU did not have this type of link enabled in our test, since the hope was that users could find journal titles within the EDS results. However, neither EDS nor the Quick Search widget’s search interfaces offered a way to limit the search to a journal title at the time of this study. During the usability test, four participants changed the field search drop-down menu to “Title” in EDS, and three participants changed the EDS field search drop-down menu to “SO Journal Title/Source,” which limits the search to articles within that journal title. While both of these ideas were good, neither one resulted in a precise results set in EDS for this task unless the user also limited to “JMU Catalog Only,” a nonintuitive option. Since the test, JMU has added a “Journal Titles” option to Quick Search that launches the user’s search into the journal A–Z list (provided by Serials Solutions). In the two months after the change (February and March 2011), only 391 searches were performed with this option. This was less than 1 percent of all searches, indicating that while it may be an important task, it is not a popular one. Like many libraries with discovery tools, JMU added federated search capabilities to EDS using EBSCOhost Integrated Search software in an attempt to draw some traffic to databases not included in EDS (or not subscribed to through EBSCO by JMU), such as MLA International Bibliography, Scopus, and Credo Reference. Links to these databases appeared in the upper-right-hand column of EDS during the usability study (see figure 6). Usage data from EBSCO showed that less than 1 percent of all JMU’s EDS sessions for fall 2010 included any interaction with this area.
Likewise, Williams and Foster observed their participants did not use their federated search until explicitly asked to do so.40 Perhaps users faced with discovery tool results simply have no motivation to click on additional database results. Since the usability test, JMU has replaced the right-hand column with static links to Ask-a-Librarian, subject guides, and research database lists. Readers may wonder why one of the most common tasks, finding a specific book title, was not included in this usability study; this was because JMU Libraries posed this task in a concurrent homepage usability study. In that study, twenty of the twenty-five participants used Quick Search to find the title “Pigs in Heaven” and chose the correct call number. Eleven of the twenty used the Quick Search drop-down menu to choose a title search option, further confirming users’ willingness to limit up-front. The average time on this task was just under a minute, and all participants completed this task successfully, so this task was not repeated in the EDS usability test. Other studies have reported trouble with this type of task;41 much could depend on the item chosen as well as the tool’s relevance ranking. User satisfaction with EDS can be summarized from the open-ended post–study questions, from the responses to task 1 (figure 5), and from the SUS scale. Answers to the post–study questions indicated participants liked the ability to narrow results, the speed and ease of use, and the relevance of the system. A few participants did describe the system as being “overwhelming” or “confusing” because of the many features, which was also supported by the SUS scores. JMU has been using the SUS to understand the relative usability of library systems. The SUS offers a benchmark for system improvement; for example, EBSCO Discovery Service received an SUS of only 37 in spring 2010 (N = 7) but a 56 in this study in fall 2010 (N = 10). This suggests the interface has become more usable. In 2009, JMU Libraries also used the SUS to test the library catalog’s classic interface as well as a VuFind interface to the library catalog, which received scores of 68 (N = 15) and 80 (N = 14), respectively. The differences between the catalog scores and EDS indicate an important distinction between usability and usefulness, with the latter concept encompassing a system’s content and capabilities. The library catalog is, perhaps, a more straightforward tool than a discovery tool and attempts to provide access to a smaller set of information. It has none of the complexity involved in finding article-level or book chapter information. All else being equal, simpler tools will be more usable. In an experimental study, Tsakonas and Papatheodorou found that while users did not distinguish between the concepts of usability and usefulness, they prefer attributes composing a useful system in contrast to those supporting usability.42 Discovery tools, which support more tasks, must make compromises in usability that simpler systems can avoid. In their study of EDS, Williams and Foster also found overall user satisfaction with the system. Their participants made positive comments about the interface as well as the usefulness and relevance of the results.43 JMU passed on several suggestions to EBSCO related to EDS based on the test results. EBSCO subsequently added “Audio” and “Video” to the source types, which enabled JMU to add a “Just Videos at JMU” option to Quick Search.
While it is confusing that “Audio” and “Video” source types currently behave differently than the others in EDS, in that they limit to JMU’s catalog as well as to the source type, this behavior produces what most local users expect. A previous usability study of WorldCat Local showed users have trouble discriminating between source types in results lists, so the source types facet is important.44 Another piece of feedback provided to EBSCO was that on the task where users needed to choose the most recent result, only two of our participants sorted by date descending. Perhaps the textual appearance of the sort option (instead of a drop-down menu) was not obvious to participants (see figure 6); however, Williams and Foster did not observe this to be an issue in their study.45 FUTURE RESEARCH The findings of this study suggest many avenues for future research. Libraries will need to revisit the scope of their catalogs and other systems to keep up with users’ mental models and information needs. Catalogs and subject-specific databases still perform some tasks much better than discovery tools, but libraries will need to investigate how to situate the discovery tool and specialized tools within their web presence in a way that will make sense to users. When should a user be directed to the catalog versus a discovery tool? What items should libraries continue to include in their catalogs? What role do institutional repositories play in the suite of library tools, and how does the discovery tool connect to them (or include them?) How do library websites begin to make sense of the current state of library search systems? Above all, are users able to find the best resources for their research needs? Although research on searchers’ mental models has been extensive,46 librarians’ mental models have not been studied as such. Yet placing the USABILITY TEST RESULTS FOR A DISCOVERY TOOL IN AN ACADEMIC LIBRARY | FAGAN ET AL 104 discovery tool among the library’s suite of services will involve compromises between these two models. Another area needing research is how to instruct users to work with the large numbers of results returned by discovery tools. In subject-specific databases, librarians often help users measure the success of their strategy—or even their topic—by the number of results returned: in Criminal Justice Abstracts, 5,000 results means a topic is too broad or the search strategy needs refinement. In a discovery tool, a result set this large will likely have some good results on the first couple of pages if sorted by relevance; however, users will still need to know how to grow or reduce their results sets. Participants in this study showed a willingness to use limiters and other interface features, but not always the most helpful ones. When asked to narrow a broad subject on task 3 of this study, only one participant chose to use the “Subject” facet even when the subtopic, audiology, was clearly available. Most added search terms. It will be important for future studies to investigate the best way for users to narrow large results set in a discovery tool. This study also suggested possible areas of investigation for future user studies. 
One interesting finding related to this study’s users’ information contexts was that when users were asked to search on their last research topic, it did not always match up with their major: a voice performance student searched on “current issues in Russia,” and the hospitality major searched on “aphasia.” To what extent does a discovery tool help or hinder students who are searching outside their major area of study? One of JMU’s reference librarians noted that while he would usually teach a student majoring in a subject how to use that subject’s specific indexes, as opposed to a discovery tool, a student outside the major might not need to learn the subject-specific indexes for that subject and could be well served by the discovery tool. Future studies could also investigate the usage and usability of discovery tool features in order to continue informing library customizations and advice to vendors. For example, this study did not have a task related to logging into a patron account or requesting items, but that would be good to investigate in a follow-up study. Another area ripe for further investigation is discovery tool limiters. This study’s participants frequently attempted to use limiters, but didn’t always choose the correct ones for the task. What are the ideal design choices for making limiters intuitive? This study found almost no use of the embedded federated search add-on: is this true at other institutions? Finally, this study and others reveal difficulty in distinguishing source types. Development and testing of interface enhancements to support this ability would be helpful to many libraries’ systems. CONCLUSION This usability test of a discovery tool at James Madison University did not reveal as many interface-specific findings as it did questions about the role of discovery tools in libraries. Users were generally able to navigate through the Quick Search and EDS interfaces and complete tasks successfully. Tasks that are challenging in other interfaces, such as locating journal articles and discriminating between source types, continued to be challenging in a discovery tool interface. INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 105 This usability test suggested that while some interface features were heavily used, such as drop - down limits and facets, other features were not used, such as federated search results. As discovery tools continue to grow and evolve, libraries should continue to conduct usability tests, both to find usability issues and to understand user behavior and satisfaction. Although discovery tools challenge libraries to think not only about access but also about the best research pathways for users, they provide users with a search that more closely matches their expectations. ACKNOWLEDGEMENT The authors would like to thank Patrick Ragland for his editorial assistance in preparing this manuscript. CORRECTION April 12, 2018: At the request of the author, this article was revised to remove a link to a website. REFERENCES 1. Emily Alling and Rachael Naismith, “Protocol Analysis of a Federated Search Tool: Designing for Users,” Internet Reference Services Quarterly 12, no. 1 (2007): 195, http://scholarworks.umass.edu/librarian_pubs/1/ (accessed Jan. 11, 2012); Frank Cervone, “What We've Learned From Doing Usability Testing on OpenURL Resolvers and Federated Search Engines,” Computers in Libraries 25, no. 9 (2005): 10 ; Sara Randall, “Federated Searching and Usability Testing: Building the Perfect Beast,” Serials Review 32, no. 
3 (2006): 181–82, doi:10.1016/j.serrev.2006.06.003; Ed Tallent, “Metasearching in Boston College Libraries —A Case Study of User Reactions,” New Library World 105, no. 1 (2004): 69-75, DOI: 10.1108/03074800410515282. 2. S. C. Williams and A. K. Foster, “Promise Fulfilled? An EBSCO Discovery Service Usability Study,” Journal of Web Librarianship 5, no. 3 (2011), http://www.tandfonline.com/doi/pdf/10.1080/19322909.2011.597590 (accessed Jan. 11, 2012). 3. Janet K. Chisman, Karen R. Diller, and Sharon L. Walbridge, “Usability Testing: A Case Study,” College & Research Libraries 60, no. 6 (November 1999): 552–69, http://crl.acrl.org/content/60/6/552.short (accessed Jan. 11, 2012); Frances C. Johnson and Jenny Craven, “Beyond Usability: The Study of Functionality of the 2.0 Online Catalogue,” New Review of Academic Librarianship 16, no. 2 (2010): 228–50, DOI: 10.1108/00012531011015217 (accessed Jan, 11, 2012); Jennifer E. Knievel, Jina Choi Wakimoto, and Sara Holladay, “Does Interface Design Influence Catalog Use? A Case Study,” College & Research Libraries 70, no. 5 (September 2009): 446–58, http://crl.acrl.org/content/70/5/446.short (accessed Jan. 11, 2012); Jia Mi and Cathy Weng, “Revitalizing the Library OPAC: Interface, Searching, and Display Challenges,” Information Technology & Libraries 27, no. 1 (March 2008): 5–22, http://0- http://scholarworks.umass.edu/librarian_pubs/1/ http://www.tandfonline.com/doi/pdf/10.1080/19322909.2011.597590 http://crl.acrl.org/content/60/6/552.short http://crl.acrl.org/content/70/5/446.short http://0-www.ala.org.sapl.sat.lib.tx.us/ala/mgrps/divs/lita/publications/ital/27/1/mi.pdf USABILITY TEST RESULTS FOR A DISCOVERY TOOL IN AN ACADEMIC LIBRARY | FAGAN ET AL 106 www.ala.org.sapl.sat.lib.tx.us/ala/mgrps/divs/lita/publications/ital/27/1/mi.pdf (accessed Jan. 11, 2012). 4. Karen Calhoun, “The Changing Nature of the Catalog and its Integration with Other Discovery Tools,” http://www.loc.gov/catdir/calhoun-report-final.pdf (accessed Mar. 11, 2011). 5. Dee Ann Allison, “Information Portals: The Next Generation Catalog,” Journal of Web Librarianship 4, no. 1 (2010): 375–89, http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1240&context=libraryscience (accessed January 11, 2012); Marshall Breeding, “The State of the Art in Library Discovery,” Computers in Libraries 30, no. 1 (2010): 31–34; C. P Diedrichs, “Discovery and Delivery: Making it Work for Users . . . Taking the Sting out of Serials!” (lecture, North American Serials Interest Group, Inc. 23rd Annual Conference, Phoenix, Arizona, June 5–8, 2008), DOI: 10.1080/03615260802679127; Ian Hargraves, “Controversies of Information Discovery,” Knowledge, Technology & Policy 20, no. 2 (Summer 2007): 83, http://www.springerlink.com/content/au20jr6226252272/fulltext.html (accessed Jan. 11, 2012); Jane Hutton, “Academic Libraries as Digital Gateways: Linking Students to the Burgeoning Wealth of Open Online Collections,” Journal of Library Administration 48, no. 3 (2008): 495–507, DOI: 10.1080/01930820802289615; OCLC, “Online Catalogs: What Users and Librarians Want: An OCLC Report,” http://www.oclc.org/reports/onlinecatalogs/default.htm (accessed Mar. 11 2011). 6. C. J. Belliston, Jared L. Howland, and Brian C. Roberts, “Undergraduate Use of Federated Searching: A Survey of Preferences and Perceptions of Value-Added Functionality,” College & Research Libraries 68, no. 6 (November 2007): 472–86, http://crl.acrl.org/content/68/6/472.full.pdf+html (accessed Jan. 11, 2012); Judith Z. Emde, Sara E. 
Morris, and Monica Claassen‐Wilson, “Testing an Academic Library Website for Usability with Faculty and Graduate students,” Evidence Based Library & Information Practice 4, no. 4 (2009): 24– 36, http://kuscholarworks.ku.edu/dspace/bitstream/1808/5887/1/emdee_morris_CW.pdf (accessed Jan. 11,2012); Karla Saari Kitalong, Athena Hoeppner, and Meg Scharf, “Making Sense of an Academic Library Web Site: Toward a More Usable Interface for University Researchers,” Journal of Web Librarianship 2, no. 2/3 (2008): 177–204, http://www.tandfonline.com/doi/abs/10.1080/19322900802205742 (accessed Jan. 11, 2012); Ed Tallent, “Metasearching in Boston College Libraries—A Case Study of User Reactions,” New Library World 105, no. 1 (2004): 69–75, DOI: 10.1108/03074800410515282; Rong Tang, Ingrid Hsieh-Yee, and Shanyun Zhang, “User Perceptions of MetaLib Combined Search: An Investigation of How Users Make Sense of Federated Searching,” Internet Reference Services Quarterly 12, no. 1 (2007): 211–36, http://www.tandfonline.com/doi/abs/10.1300/J136v12n01_11 (accessed Jan. 11, 2012). 7. Jody Condit Fagan, “Usability Studies of Faceted Browsing: A Literature Review,” Information Technology & Libraries 29, no. 2 (2010): 58–66, http://0-www.ala.org.sapl.sat.lib.tx.us/ala/mgrps/divs/lita/publications/ital/27/1/mi.pdf http://www.loc.gov/catdir/calhoun-report-final.pdf http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1240&context=libraryscience http://www.springerlink.com/content/au20jr6226252272/fulltext.html http://www.oclc.org/reports/onlinecatalogs/default.htm http://crl.acrl.org/content/68/6/472.full.pdf+html http://kuscholarworks.ku.edu/dspace/bitstream/1808/5887/1/emdee_morris_CW.pdf http://www.tandfonline.com/doi/abs/10.1080/19322900802205742 http://www.tandfonline.com/doi/abs/10.1300/J136v12n01_11 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 107 http://web2.ala.org/ala/mgrps/divs/lita/publications/ital/29/2/fagan.pdf (accessed Jan. 11, 2012). 8. Birong Ho, Keith Kelley, and Scott Garrison, “Implementing VuFind as an Alternative to Voyager’s Web Voyage Interface: One Library’s Experience,” Library Hi Tech 27, no. 1 (2009): 8292, DOI: 10.1108/07378830910942946 (accessed Jan. 11, 2012). 9. Tamar Sadeh, “User Experience in the Library: A Case Study,” New Library World 109, no. 1 (2008): 7–24, DOI: 10.1108/03074800810845976 (accessed Jan. 11, 2012). 10. Tod A. Olson, “Utility of a Faceted Catalog for Scholarly Research,” Library Hi Tech 25, no. 4 (2007): 550–61, DOI: 10.1108/07378830710840509 (accessed Jan. 11, 2012). 11. Allison, “Information Portals,” 375–89. 12. Marshall Breeding, “Plotting a New Course for Metasearch,” Computers in Libraries 25, no. 2 (2005): 27. 13. Ibid. 14. Dennis Brunning and George Machovec, “Interview About Summon with Jane Burke, Vice President of Serials Solutions,” Charleston Advisor 11, no. 4 (2010): 60–62; Dennis Brunning and George Machovec, “An Interview with Sam Brooks and Michael Gorrell on the EBSCOhost Integrated Search and EBSCO Discovery Service,” Charleston Advisor 11, no. 3 (2010): 62–65, http://www.ebscohost.com/uploads/discovery/pdfs/topicFile-121.pdf (accessed Jan. 11, 2012). 15. Ronda Rowe, “Web-Scale Discovery: A Review of Summon, EBSCO Discovery Service, and WorldCat Local,” Charleston Advisor 12, no. 1 (2010): 5–10; K. Stevenson et al., “Next-Generation Library Catalogues: Reviews of Encore, Primo, Summon and Summa,” SERIALS 22, no. 1 (2009): 68–78. 16. Jason Vaughan, “Chapter 7: Questions to Consider,” Library Technology Reports 47, no. 
1 (2011): 54; Paula L. Webb and Muriel D. Nero, “OPACs in the Clouds,” Computers in Libraries 29, no. 9 (2009): 18. 17. Jason Vaughan, “Investigations into Library Web Scale Discovery Services,” Articles (Libraries), paper 44 (2011), http://digitalcommons.library.unlv.edu/lib_articles/44. 18. Marshall Breeding, “The State of the Art in Library Discovery,” 31–34; Sharon Q. Yang and Kurt Wagner, “Evaluating and Comparing Discovery Tools: How Close are We Towards Next Generation Catalog?” Library Hi Tech 28, no. 4 (2010): 690–709. 19. Allison, “Information Portals,” 375–89. 20. Breeding, “The State of the Art in Library Discovery,” 31–34. 21. Galina Letnikova, “Usability Testing of Academic Library Websites: A Selective Bibliography,” Internet Reference Services Quarterly 8, no. 4 (2003): 53–68. http://web2.ala.org/ala/mgrps/divs/lita/publications/ital/29/2/fagan.pdf http://www.ebscohost.com/uploads/discovery/pdfs/topicFile-121.pdf http://digitalcommons.library.unlv.edu/lib_articles/44 USABILITY TEST RESULTS FOR A DISCOVERY TOOL IN AN ACADEMIC LIBRARY | FAGAN ET AL 108 22. Jeffrey Rubin and Dana Chisnell, Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests, 2nd ed. (Indianapolis, IN: Wiley, 2008); Joseph S. Dumas and Janice Redish, A Practical Guide to Usability Testing, rev. ed. (Portland, OR: Intellect, 1999). 23. Nicole Campbell, ed., Usability Assessment of Library-Related Web Sites: Methods and Case Studies (Chicago: Library & Information Technology Association, 2001); Elaina Norlin and C. M. Winters, Usability Testing for Library Web Sites: A Hands-On Guide (Chicago: American Library Association, 2002). 24. Jennifer L. Ward, Steve Shadle, and Pam Mofield, “User Experience, Feedback, and Testing,” Library Technology Reports 44, no. 6 (2008): 17. 25. Ibid. 26. Michael Boock, Faye Chadwell, and Terry Reese, “WorldCat Local Task Force Report to LAMP,” http://hdl.handle.net/1957/11167 (accessed Mar. 11 2011). 27. Bob Thomas and Stefanie Buck, “OCLC’s WorldCat Local Versus III’s WebPAC: Which Interface is Better at Supporting Common User Tasks?” Library Hi Tech 28, no. 4 (2010): 648–71. 28. OCLC, “Some Findings from WorldCat Local Usability Tests Prepared for ALA Annual,” http://www.oclc.org/worldcatlocal/about/213941usf_some_findings_about_worldcat_local.pdf (accessed Mar. 11, 2011). 29. Ibid., 2. 30. Doug Way, “The Impact of Web-Scale Discovery on the Use of a Library Collection,” Serials Review 36, no. 4 (2010): 21420. 31. North Carolina State University Libraries, “Final Summon User Research Report,” http://www.lib.ncsu.edu/userstudies/studies/2010_summon/ (accessed Mar. 28, 2011). 32. Alesia McManus, “The Discovery Sandbox: Aleph and Encore Playing Together,” http://www.nercomp.org/data/media/Discovery%20Sandbox%20McManus.pdf (accessed Mar. 28, 2011); PRWeb, “Deakin University in Australia Chooses EBSCO Discovery Service,” http://www.prweb.com/releases/Deakin/ChoosesEDS/prweb8059318.htm (accessed Mar. 28, 2011); University of Manitoba, “Summon Usability: Partnering with the Vendor,” http://prezi.com/icxawthckyhp/summon-usability-partnering-with-the-vendor (accessed Mar. 28, 2011). 33. Williams and Foster, “Promise Fulfilled?” 34. Jakob Nielsen, “Why You Only Need to Test with 5 Users,” http://www.useit.com/alertbox/20000319.html (accessed Aug. 20, 2011). 35. John Brooke, “SUS: A ‘Quick and Dirty’ Usability Scale,” in Usability Evaluation in Industry, ed. P. W. Jordanet al. 
(London: Taylor & Francis, 1996), http://www.usabilitynet.org/trump/documents/Suschapt.doc (accessed Apr. 6, 2011). 36. Williams and Foster, “Promise Fulfilled?” http://hdl.handle.net/1957/11167 http://www.oclc.org/worldcatlocal/about/213941usf_some_findings_about_worldcat_local.pdf http://www.lib.ncsu.edu/userstudies/studies/2010_summon/ http://www.nercomp.org/data/media/Discovery%20Sandbox%20McManus.pdf http://www.prweb.com/releases/Deakin/ChoosesEDS/prweb8059318.htm http://prezi.com/icxawthckyhp/summon-usability-partnering-with-the-vendor/ http://www.useit.com/alertbox/20000319.html http://www.usabilitynet.org/trump/documents/Suschapt.doc INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 109 37. Seikyung Jung et al., “LibraryFind: System Design and Usability Testing of Academic Metasearch System,” Journal of the American Society for Information Science & Technology 59, no. 3 (2008): 375–89; Williams and Foster, “Promise Fulfilled?”; Laura Wrubel and Kari Schmidt, “Usability Testing of a Metasearch Interface: A Case Study,” College & Research Libraries 68, no. 4 (2007): 292–311. 38. Williams and Foster, “Promise Fulfilled?” 39. Letnikova, “Usability Testing of Academic Library Websites,” 53–68; Tom Ipri, Michael Yunkin, and Jeanne M. Brown, “Usability as a Method for Assessing Discovery,” Information Technology & Libraries 28, no. 4 (2009): 181–86; Susan H. Mvungi, Karin de Jager, and Peter G. Underwood, “An Evaluation of the Information Architecture of the UCT Library Web Site,” South African Journal of Library & Information Science 74, no. 2 (2008): 171–82. 40. Williams and Foster, “Promise Fulfilled?” 41. Ward et al., “User Experience, Feedback, and Testing,” 17. 42. Giannis Tsakonas and Christos Papatheodorou, “Analysing and Evaluating Usefulness and Usability in Electronic Information Services,” Journal of Information Science 32, no. 5 (2006): 400– 419. 43. Williams and Foster, “Promise Fulfilled?” 44. Bob Thomas and Stefanie Buck, “OCLC’s WorldCat Local Versus III’s WebPAC: Which Interface is Better at Supporting Common User Tasks?” Library Hi Tech 28, no. 4 (2010): 648–71. 45. Williams and Foster, “Promise Fulfilled?” 46. Tracy Gabridge, Millicent Gaskell, and Amy Stout, “Information Seeking through Students’ Eyes: The MIT Photo Diary Study,” College & Research Libraries 69, no. 6 (2008): 510–22; Yan Zhang, “Undergraduate Students’ Mental Models of the Web as an Information Retrieval System,” Journal of the American Society for Information Science & Technology 59, no. 13 (2008): 2087–98; Brenda Reeb and Susan Gibbons, “Students, Librarians, and Subject Guides: Improving a Poor Rate of Return,” Portal: Libraries and the Academy 4, no. 1 (2004): 123–30; Alexandra Dimitroff, “Mental Models Theory and Search Outcome in a Bibliographic Retrieval System,” Library & Information Science Research 14, no. 2 (1992): 141–56. USABILITY TEST RESULTS FOR A DISCOVERY TOOL IN AN ACADEMIC LIBRARY | FAGAN ET AL 110 APPENDIX A Task Pre–Test 1: Please indicate your JMU status (1st Year, 2nd Year, 3rd Year, 4th Year, Graduate Student, Faculty, Other) Pre–Test 2: Please list your major(s) or area of teaching (open ended) Pre–Test 3: How often do you use the library website? (Less than once a month, 1–3 visits per month, 4–6 visits per month, more than 7 visits per month) Pre–Test 4: What are some of the most common things you currently do on the library website? (open ended) Pre–Test 5: How much of the library’s resources do you think the Quick Search will search? 
(Less than a third, Less than half, Half, Most, All) Pre–Test 6: Have you used LEO? (show screenshot on printout) (Yes, No, Not Sure) Pre–Test 7: Have you used EBSCO? (show screenshot on printout) (Yes, No, Not Sure) Pre–Test 8 (Student participants only): How often have you used library web resources for course assignments in your major? (Rarely/Never, Sometimes, Often, Very Often) Pre–Test 9 (Student participants only): How often have you used library resources for course assignments outside of your major? (Rarely/Never, Sometimes, Often, Very Often) Pre–Test 10 (Student participants only): Has a librarian spoken to a class you've attended about library research? (Yes, No, Not Sure) Pre–Test 11 (Faculty participants only): How often do you give assignments that require the use of library resources? (Rarely/Never, Sometimes, Often, Very Often) Pre–Test 12 (Faculty participants only): How often have you had a librarian visit one of your classes to teach your students about library research? (Rarely/Never, Sometimes, Often, Very Often) Post–Test 1: When would you use this search tool? Post–Test 2: When would you not use this search tool? Post–Test 3: What would you say are the major advantages of Quick Search? INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 111 Post–Test 4: What would you say are the major problems with Quick Search? Post–Test 5: If you were unable to find an item using Quick Search/EBSCO Discovery Service what would your next steps be? Post–Test 6: Do you think the name “Quick Search” is fitting for this search tool? If not, what would you call it? Post–Test 7 (Faculty participants only): If you knew students would use this tool to complete assignments would you alter how you structure assignments and how? APPENDIX B Task Purpose • Practice Task: Use Quick Search to search a topic relating to your major / discipline or another topic of interest to you. If you were writing a paper on this topic how satisfied would you be with these results? Help users get comfortable with the usability testing software. Also, since the first time someone uses a piece of software involves behaviors unique to that case, we wanted participants’ first use of EDS to be with a practice task. 1. What was the last thing you searched for when doing a research assignment for class? Use Quick Search to re-search for this. Tell us how this compared to your previous experience. Having participants re-search a topic with which they had some experience and interest would motivate them to engage with results and provide a comparison point for their answer. We hoped to learn about their satisfaction with relevance, quality, and quantity of results. (user behavior, user satisfaction) 2. Using Quick Search find a video related to early childhood cognitive development. When you’ve found a suitable video recording, click ANSWER and copy and paste the title. This task aimed to determine whether participants could complete the task, as well as show us which features they used in their attempts. (usability, user behavior) 3. Search on speech pathology and find a way to limit your search results to audiology. Then, limit your search results to peer reviewed sources. How satisfied are you with the results? Since there are several ways to limit results in EDS, we designed this task to show us which limiters participants tried to use, and which limiters resulted in success. We also hoped to learn about whether they thought the limiters provided satisfactory results. 
(usability, user behavior, user satisfaction) USABILITY TEST RESULTS FOR A DISCOVERY TOOL IN AN ACADEMIC LIBRARY | FAGAN ET AL 112 4. You need more recent sources. Please limit these search results to the last 5 years, then select the most recent source available. Click Finished when you are done. Since there are several ways to limit by date in EDS, we designed this task to show us which limiters participants tried to use, and which limiters resulted in success. (usability, user behavior) 5. Find a way to ask a JMU librarian for help using this search tool. After you’ve found the correct web page, click FINISHED. We wanted to determine whether the user could complete this task, and which pathway they chose to do it. (usability, user behavior) 6. Locate the journal Yachting and Boating World. What are the coverage dates? Is this journal available in online full text? We wanted to determine whether the user could locate a journal by title. (usability) 7. You need to look up the sculpture Genius of Mirth. You have been told that the library database, Camio, would be the best place to search for this. Locate this database and find the sculpture. We wanted to know whether users who knew they needed to use a specific database could find that database from within the discovery tool. (usability, user behavior). 8. Use Quick Search to find 2 books and 2 recent peer reviewed articles (from the last 5 years) on rheumatoid arthritis. When you have found suitable source click ANSWER and copy and paste the titles. Click BACK TO WEBPAGE if you need to return to your search results. These two tasks were intended to show us how users completed a common, broad task with and without a discovery tool, whether they would be more successful with or without the tool, and what barriers existed with and without the tool (usability, user behavior) 9. Without using Quick Search, find 2 books and 2 recent peer reviewed articles (from the last 5 years) on rheumatoid arthritis. When you have found suitable sources click ANSWER and copy and paste the titles. Click BACK TO WEBPAGE if you need to return to your search results. 1859 ---- Copyright: Regulation Out of Line with Our Digital Reality? Abigail J. McDermott INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 7 ABSTRACT This paper provides a brief overview of the current state of copyright law in the United States, focusing on the negative impacts of these policies on libraries and patrons. The article discusses four challenges current copyright law presents to libraries and the public in general, highlighting three concrete ways intellectual property law interferes with digital library services and systems. Finally, the author suggests that a greater emphasis on copyright literacy and a commitment among the library community to advocate for fairer policies is vital to correcting the imbalance between the interests of the public and those of copyright holders. INTRODUCTION In July 2010, the library community applauded when Librarian of Congress James H. Billington announced new exemptions to the Digital Millennium Copyright Act (DMCA). Those with visual disabilities and the librarians who serve them can now circumvent digital rights management (DRM) software on e-books to activate a read-aloud function.1 In addition, higher education faculty in departments other than film and media studies can now break through DRM software to include high-resolution film clips in class materials and lectures. 
However, their students cannot, since only those who are pursuing a degree in film can legally do the same.2 That means that English students who want to legally include high-resolution clips from the critically acclaimed film Sense and Sensibility in their final projects on Jane Austen’s novel will have to wait another three years, when the Librarian of Congress will again review the DMCA. The fact that these new exemptions to the DMCA were a cause for celebration is one indicator of the imbalanced state of the copyright regulations that control creative intellectual property in this country. As the consumer-advocacy group Public Knowledge asserted, “We continue to be disappointed that the Copyright Office under the Digital Millennium Copyright Act can grant extremely limited exemptions and only every three years. This state of affairs is an indication that the law needs to be changed.”3 This paper provides a brief overview of the current state of U.S. copyright law, especially developments during the past fifteen years, with a focus on the negative impact these policies have had and will continue to have on libraries, librarians, and the patrons they serve. This paper does not provide a comprehensive and impartial primer on copyright law, a complex and convoluted topic, instead identifying concerns about the effects an out-of-balance intellectual property system is having on the library profession, library services, and creative expression in our digital age. As with any area of public policy, the battles over intellectual property issues create an ever-fluctuating copyright environment, and therefore, this article is written to be current with policy developments as of October 2011. Finally, this paper recommends that librarians seek to better educate themselves about copyright law, and some innovative responses to an overly restrictive system, so that we can effectively advocate on our own behalf, and better serve our patrons. Abigail J. McDermott (ajmcderm@umd.edu) is Graduate Research Associate, The Information Policy and Access Center (iPAC), and Masters Candidate in Library Science, University of Maryland, College Park. THE STATE OF U.S. COPYRIGHT LAW Copyright law is a response to what is known as the “progress clause” of the Constitution, which charges Congress with the responsibility “to promote the Progress of Science and the useful Arts . . . to this end, copyright assures authors the right to their original expression, but encourages others to build freely upon the ideas and information conveyed by a work.”4 Fair use, a statutory exception to U.S. copyright law, is a complex subject, but a brief examination of the principle gets to the heart of copyright law itself. When determining fair use, courts consider 1. the purpose and character of the use; 2. the nature of the copyrighted work; 3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and 4. the effect of the use upon the potential market for the copyrighted work.5 While fair use is an “affirmative defense” to copyright infringement,6 invoking fair use is not the same as admitting to copyright infringement.
Teaching, scholarship, and research, as well as instances in which the use is not-for-profit and noncommercial, are all legitimate examples of fair use, even if fair use is determined on a case-by-case basis.7 Despite the byzantine nature of copyright law, there are four key issues that present the greatest challenges and obstacles to librarians and people in general: the effect of the DMCA on the principle of fair use; the dramatic extension of copyright terms codified by the Sonny Bono Copyright Term Extension Act; the disappearance of the registration requirement for copyright holders; and the problem of orphan works. The Digital Millennium Copyright Act (DMCA) The DMCA has been controversial since its passage in 1998. Title I of the DMCA implements two 1996 World Intellectual Property Organization (WIPO) treaties that obligate member states to enforce laws that make tampering with DRM software illegal. The DMCA added chapter 12 to the U.S. Copyright Act (17 U.S.C. §§ 1201–1205), and it criminalized the trafficking of “technologies designed to circumvent access control devices protecting copyrighted material from unauthorized copying or use.”8 While film studios, e-book publishers, and record producers have the right to protect their intellectual property from illegal pirating, the DMCA struck a serious blow to the principle of fair use, placing librarians and others who could likely claim fair use when copying a DVD or PDF file in a Catch-22 scenario. While the act of copying the file may be legal according to fair use, breaking through any DRM technology that prevents that copying is now illegal.9 The Sonny Bono Copyright Term Extension Act While the Copyright Act of 1790 only provided authors and publishers with twenty-eight years of copyright protection, the Sonny Bono Copyright Term Extension Act of 1998 increased the copyright terms of all copyrighted works that were eligible for renewal in 1998 to ninety-five years after the year of the creator’s death. In addition, all works copyrighted on or after January 1, 1978, now receive copyright protection for the life of the creator plus seventy years (or ninety-five years from the date of publication for works produced by multiple creators).10 Jack Valenti, former president of the Motion Picture Association of America, was not successful in pushing copyright law past the bounds of the Constitution, which mandates that copyright be limited, although he did try to circumvent this Constitutional requirement by suggesting that copyright terms last forever less one day.11 The Era of Automatic Copyright Registration Perhaps the most problematic facet of modern U.S. copyright law appears at first glance to be the most innocuous. The Copyright Act of 1976 did away with the registration requirement established by the Copyright Act of 1790.12 That means that any creative work “fixed in any tangible medium of expression” is automatically copyrighted at the moment of its creation.13 That includes family vacation photos stored on a computer hard drive; they are copyrighted and your permission is required to use them.
The previous requirement of registration meant authors and creators had to actively register their works, so anything that was not registered entered the public domain, replenishing that important cultural realm.14 Now that copyright attaches at the moment an idea is expressed through a cocktail napkin doodle or an outline, virtually nothing new enters the public domain until its copyright term expires—at least seventy years later. In fact, nothing new will enter the public domain through copyright expiration until 2019. Until then, the public domain is essentially frozen in the year 1922.15 The Problem of Orphan Works In addition, the incredibly long copyright terms that apply to all books, photographs, and sound recordings have created the problem of orphan works. Orphan works are those works that are under copyright protection, but whose owners are difficult or impossible to locate, often due to death.16 These publications are problematic for researchers, librarians, and the public in general: Orphan works are perceived to be inaccessible because of the risk of infringement liability that a user might incur if and when a copyright owner subsequently appears. Consequently, many works that are, COPYRIGHT: REGULATION OUT OF LINE WITH OUR DIGITAL REALITY | MCDERMOTT 10 in fact, abandoned by owners are withheld from public view and circulation because of uncertainty about the owner and the risk of liability.17 If copyright expired with the death of the author, or if there were a clause that would allow these works to pass into the public domain if the copyright holder’s heirs did not actively renew copyright for another term, then these materials would be far less likely to fall into legal limbo. Currently, many are protected despite the fact that acquiring permission to use them is all but impossible. A study of orphan works in the collections of United Kingdom public sector institutions found that these works are likely to have little commercial value, but high “academic and cultural significance,” and when contacted, these difficult-to-trace rights holders often grant permission for reproduction without asking for compensation.18 Put another way, orphan works are essentially “locking up culture and other public sector content and preventing organizations from serving the public interest.”19 The row that arose in September 2011 between the HathiTrust institutions and the Authors Guild over the University of Michigan’s orphan works digitization project, with J. R. Salamanca’s long- out-of-print 1958 novel The Lost Country serving as the pivot point in the dispute, is an example of the orphan works problem. The fact that University of Michigan Associate University Librarian John Price Wilkin was forced to assure the public that “no copyrighted books were made accessible to any students” illustrates the absurdity in arguing over whether it’s right to digitize books that are no longer even accessible in their printed form.20 LIBRARIES, DIGITIZATION, AND COPYRIGHT LAW: THE QUIET CRISIS While one can debate if U.S. copyright law is still oriented toward the public good, the more relevant question in this context is the effect copyright law has on the library profession. DRM technology can get in the way of serving library patrons with visual disabilities and every library needs to place a copyright disclaimer on the photocopiers, but how much more of a stumbling block is intellectual property law to librarians in general, and the advance of library systems and technology in particular? 
The answer is undeniably that current U.S. copyright legislation places obstacles in the way of librarians working in all types of libraries. While there are many ways that copyright law affects library services and collections in this digital era, three challenges are particularly pressing: the problem of ownership and licensing of digital content or collections; the librarian as de facto copyright expert; and copyright law as it relates to library digitization programs generally, and the Google Book settlement in particular. Digital Collections: Licenses Replace Ownership In the past, people bought a book, and they owned that copy. There was little they could accidentally or unknowingly do to infringe on the copyright holder’s rights. Likewise, when physical collections were their only concern, librarians could rely on Sections 108 and 109 of the copyright law to protect them from liability when they copied a book or other work and when they loaned materials in their collections to patrons.21 Today, we live partly in the physical world and partly in the digital world, reaching out and connecting to each other across fiber optic lines in the same way we once did around the water cooler. Likewise, the digital means of production are widely distributed. In a multimedia world, where sharing an informative or entertaining video clip is as easy as embedding a link onto someone’s Facebook wall, the temptation to infringe on rights by distributing, reproducing, or displaying a creative work is all too common, and all too easy.22 Many librarians believe that disclaimers on public-access computer terminals will protect them from lawsuits, but they do not often consider placing such disclaimers on their CD or DVD collections. Yet a copyright holder would not have to prove the library is aware of piracy to accuse the library of vicarious infringement of copyright. The copyright holder may even be able to argue that the library sees some financial gain from this piracy if the existence of the material that is being pirated serves as the primary reason a patron visits the library.23 Even the physical CD collection in the public library can place the institution in danger of copyright infringement; yet the copyright challenges raised by cutting-edge digital resources, like e-books, are undoubtedly more complicated. E-books are replacing traditional books in many contexts. Like most digital works today, e-books are licensed, not purchased outright. The problem licensing presents to libraries is that licensed works are not sold, they are granted through contracts, and contracts can change suddenly and negate fair-use provisions of U.S.
copyright law.24 While libraries are now adept at negotiating contracts with subscription database providers, e-books are in many ways even more difficult to manage, with many vendors requiring that patrons delete or destroy the licensed content on their personal e-readers at the end of the lending period.25 The entire library community was rocked by HarperCollins’s February 2011 decision to limit licenses on e-books offered through library e- book vendors like OverDrive to twenty-six circulations, with many librarians questioning the publisher’s assertion that this seemingly arbitrary limitation is related to the average lifespan of a single print copy.26 License holders have an easy time arguing that any use of their content without paying fees is a violation of their copyright. That is not the case when a fair use argument is justified, and while many in the library community may acquiesce to these arguments, “in recent cases, courts have found the use of a work to be fair despite the existence of a licensing market.”27 When license agreements are paired with DRM technology, libraries may find themselves managing thousands of micropayments to allow their users to view, copy, move, print, or embed, for example, the PDF of a scholarly journal article.28 In the current climate of reduced staff and shrinking budgets, managing these complex licensing agreements has the potential to cripple many libraries. The Librarian as Accidental Copyright Czar During a Special Libraries Association (SLA) Q&A session on copyright law in the digital age, the questions submitted to the panel came from librarians working in hospitals, public libraries, academic libraries, and even law libraries. Librarians are being thrust into the position of de facto copyright expert. One of the speakers mentioned that she must constantly remind the lawyers at COPYRIGHT: REGULATION OUT OF LINE WITH OUR DIGITAL REALITY | MCDERMOTT 12 the firm she works for that they should not copy and paste the full text of news or law review journal articles into their e-mails, and instead, they should send a link. The basis of her argument is the third factor of fair use mentioned earlier: the amount or substantiality of the portion of the copyrighted work being used.29 Since fair use is not a “bright line” principle, the more factors you have on your side the better when you are using a copyrighted work without the owners express permission.30 Librarians working in any institution must seek express permission from copyright holders for any video they wish to post, or embed, on library-managed websites. E-reserves and streaming video, mainstays of many educators and librarians seeking to capture the attention of this digital generation, have become bright red targets for litigious copyright holders who want to shrink the territory claimed under the fair-use banner even further. Many in the library community are aware of the Georgia State University e-reserves lawsuit, Cambridge University Press et al. v. Patton, in which a group of academic publishers have accused the school of turning its e-reserves system into a vehicle for intentional piracy.31 University librarians are implicated for not providing sufficient oversight. It has come to light that the Association of American Publishers (AAP) approached other schools, including Cornell, Hofstra, Syracuse, and Marquette, before filing a suit against Georgia State. 
Generally, the letters come from AAP’s outside counsel and are accompanied by “the draft of a federal court legal complaint that alleges copyright infringement.”32 The AAP believes that e-reserves are by nature an infringement of copyright law, so they demand these universities work with their association to draft guidelines for electronic content that support AAP’s “cost-per-click theory of contemporary copyright: no pay equals no click.”33 It seems that Georgia State was not willing to quietly concede to AAP’s view on the matter, and they are now facing the association in court.34 A decision in this case was pending at the time this article went to press. The case brought by the Association for Information and Media Equipment (AIME) against UCLA is similar, except it focuses on the posting of videos so they can be streamed by students on password-protected university websites that do not allow the copying or retention of the videos.35 UCLA argued that the video streaming services for students are protected by the Technology Education and Copyright Harmonization (TEACH) Act of 2002, which is the same act that allows all libraries to offer patrons online access to electronic subscription databases off-site through a user-authentication system.36 In addition, UCLA argued that it is simply allowing its students to “time shift” these videos, a practice deemed not to infringe on copyright law by the Supreme Court in its landmark Sony Corp. v. Universal City Studios, Inc. decision of 1984.37 The American Library Association (ALA), Association of Research Libraries (ARL), and the Association of College and Research Libraries (ACRL) jointly published an opinion supporting UCLA in this case. Many in the wider library community sympathized with UCLA’s library administrators, who cite budget cuts that reduced hours at the school’s media laboratory as one reason they must now offer students a video-streaming option.38 In the end, the case was dismissed, mostly due to the lack of standing AIME had to bring the suit against UCLA, a state agency, in federal court. While the judge did not expressly rule on the fair-use argument UCLA made, the ruling did confirm that streaming is not a form of video distribution and that the public-performance argument UCLA made regarding the videos was not invalidated by the fact that they made copies of the videos in question.39 Digitization Programs and the Google Book Settlement Librarians looking to digitize print collections, either for preservation or to facilitate online access, are also grappling with the copyright monopoly. Librarians who do not have the time or resources to seek permission from publishers and authors before scanning a book in their collection cannot touch anything published after 1922. LibraryLaw.com provides a helpful chart directed at librarians considering digitization projects, but the overwhelming fine print below the chart speaks to the labyrinthine nature of copyright.40 The Google Book settlement continues to loom large over both the library profession and the publishing industry.
At the heart of debate is Google’s Library Project, which is part of Google Book Search, originally named Google Print.41 The Library Project allows users to search for books using Google’s algorithms to provide at its most basic a “snippet view” of the text from a relevant publication. Authors and publishers could also grant their permission to allow a view of select sample pages, and of course if the book is in the public domain, then Google can make the entire work visible online.42 In all cases, the user will see a “buy this book” link so that he or she could purchase the publication from online vendors on unrelated sites.43 Google hoped to sidestep the copyright permission quandary for a digitization project of this scale, announcing that it would proceed with the digitization of cooperative library collections and that it would be the responsibility of publishers and authors to actively opt out or vocalize their objection to seeing their works digitized and posted online.44 Google attempted to turn the copyright permissions process on its head, which was the basis of the class action lawsuit Authors Guild v. Google Inc.45 Before the settlement was reached, Google pointed to Kelly v. Arriba Soft Corp as proof that the indexing functions of an Internet search engine constitute fair use. In that 2002 case, the Ninth Circuit Court of Appeals found that a website’s posting of thumbnail images, or “imprecise copies of low resolution, scaled down images,” constitutes fair use, and Google argued its “snippet view” function is equivalent to a thumbnail image.46 However, Judge Denny Chin rejected the Google Book settlement in March 2011, citing the fact that Google would in essence be “exploiting books without the permission of copyright owners” and could also establish a monopoly over the digitized books market. The decision did in the end hinge on the fact that Google wanted to follow an opt-out program for copyright holders rather than an affirmative opt-in system.47 The Google Book settlement was dismissed without prejudice, leaving the door open to further negotiations between the parties concerned. Going forward, the library community should be concerned with how Google will handle orphan works and how its index of digitized works will be made available to libraries and the public. The 2008 settlement granted Google the nonexclusive right to digitize all books published before January 5, 2009, and in exchange, Google would have https://exch.mail.umd.edu/owa/WebReadyViewBody.aspx?t=att&id=RgAAAADXsLSgBeEwTJ9q0yHNKIt2BwBoUJgPO3tVSoU0x%2bkwIYfQALrqJtSLAABoUJgPO3tVSoU0x%2bkwIYfQAPIULedYAAAJ&attid0=EACjse6ZzPHuQ6QbFqVhBhu8&attcnt=1&pn=1#footnote40#footnote40 COPYRIGHT: REGULATION OUT OF LINE WITH OUR DIGITAL REALITY | MCDERMOTT 14 “paid 70% of the net revenue earned from uses of Google Book Search in the United States to rights holders.”48 In addition, Google would have established the Book Rights Registry to negotiate with Google and others seeking to “digitize, index or display” those works on behalf of the rights holders.49 Approval of the settlement would have allowed Google to move forward with plans to expand Google Book Search and “to sell subscriptions to institutions and electronic versions of books to individuals.”50 The concern that Judge Denny Chin expressed over a potential Google Book monopoly was widespread among the library community. 
While the settlement would not have given Google exclusive rights to digitize and display these copyrighted works, Google planned to ensure via the settlement that it would have received the same terms the Book Rights Registry negotiated with any third-party digital library, while also inoculating itself against the risk of any copyright infringement lawsuits that could be filed against a competitor.51 That would have left libraries vulnerable to any subscription price increases for the Google Books service.52 Libraries should carefully watch the negotiations around any future Google Books settlement, paying attention to a few key issues.53 There was considerable concern that under the terms of the 2008 settlement, even libraries participating in the Google Books Library Project would need to subscribe to the service to have access to digitized copies of the books in their own collections.53 Many librarians also vocalized their disappointment in Google’s abandonment of its fair-use argument when it agreed to the 2008 settlement, which, if it succeeded, would have been a boon to nonprofit, library-driven digitization programs.54 Finally, many librarians were concerned that Google’s Book Rights Registry was likely to become the default rights holder for the orphan works in the Google Books library, and that claims that Google Books is an altruistic effort to establish a world library conceals the less admirable aim of the project—to monetize out-of-print and orphan works.55 Librarians as Free Culture Advocates: Implications and Recommendations Our digital nation has turned copyright law into a minefield for both librarians and the public at large. Intellectual property scholar Lawrence Lessig failed in his attempt to argue before the Supreme Court that the Sonny Bono Copyright Term Extension Act was an attempt to regulate free speech and therefore violated the First Amendment.56 But many believe that our restrictive copyright laws at least violate the intent of the progress clause of the Constitution, if not the First Amendment: “unconstrained access to past works helps determine the richness of future works. Inversely, when past works are inaccessible except to a privileged minority, future works are impoverished.”57 While technological advances have placed the digital means of production into the hands of the masses, intellectual property law is leading us down a path to self-censorship.58 As the profession “at the heart of both the knowledge economy and a healthy democracy,”59 it is in our best interest as librarians to recognize the important role we have to play in restoring the balance to copyright law. To engage in the debate over copyright law in the digital age, the library community needs to educate itself and advocate for our own self-interests, focusing on three key areas: https://exch.mail.umd.edu/owa/WebReadyViewBody.aspx?t=att&id=RgAAAADXsLSgBeEwTJ9q0yHNKIt2BwBoUJgPO3tVSoU0x%2bkwIYfQALrqJtSLAABoUJgPO3tVSoU0x%2bkwIYfQAPIULedYAAAJ&attid0=EACjse6ZzPHuQ6QbFqVhBhu8&attcnt=1&pn=1#footnote50#footnote50 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 15 1. Copyright law in the classroom and at the conference. We must educate new and seasoned librarians on the nature of copyright law, and the impact it has on library practice and systems. Library schools must step up to the plate and include a thorough overview of copyright law in their library science curriculum. While including copyright law in a larger legal-issues class is acceptable, the complexity of current U.S. 
copyright law demonstrates that this is not a subject that can be glossed over in a single lecture. Furthermore, there needs to be a stronger emphasis on continuing education and training on copyright law within the library profession. The SLA offers a copyright certificate program, but the reach of such programs is not wide enough. Copyright law, and the impacts current policy has on the library profession, must be prominently featured at library conferences. The University of Maryland University College’s Center for Intellectual Property offers an online community forum for discussing copyright issues and policies, but it is unclear how many librarians are members.60 2. Librarians as standard-bearers for the free culture movement. While the Library Copyright Alliance, to which the ALA, ARL, and ACRL all belong, files amicus briefs in support of balanced copyright law and submits comments to WIPO, the wider library community must also advocate for copyright reform, since this is an issue that affects all librarians, everywhere. As a profession, we need to throw our collective weight behind legislative measures that address the copyright monopoly. There have been a number of unfortunate failures in recent years. S. 1621, or the Consumers, Schools, and Libraries Digital Management Awareness Act of 2003, attempted to address a number of DRM issues, including a requirement that access controlled digital media and electronics include disclosures on the nature of the DRM technology in use.61 H.R. 107, the Digital Media Consumers Rights Act of 2003, would have amended the DMCA to allow those researching the technology to circumvent DRM software while also eliminating the Catch-22 that makes circumventing DRM software for fair-use purposes illegal. The BALANCE Act of 2003 (H.R. 1066) included provisions to expand fair use to the act of transmitting, accepting, and saving a copyrighted digital work for personal use. All of this legislation died in committee, as did H.R. 5889 (Orphan Works Act of 2008) and S. 2913 (Shawn Bentley Orphan Works Act of 2008). Both bills would have addressed the orphan works dilemma, clearly spelling out the steps one must take to use an orphan work with no express permission from the copyright holder, without fear of a future lawsuit. Could a show of support from the library community have saved these bills? It is impossible to know, but it is in our best interest to follow these legislative battles in the future and make sure our voice is heard. 3. Libraries and the Creative Commons phenomenon. In addition, librarians need to take part in the Creative Commons (CC) movement by actively directing patrons towards this world of digital works that have clear, simple use and attribution requirements. Creative Commons was founded in 2001 with the support of the Center for the Study of the Public Domain at Duke University School of Law.62 The movement is essentially about free culture, and the idea that many people want to share their creative works and allow others to use or build off of their efforts easily and without seeking their permission. 
It is not intended to supplant copyright law, and Lawrence Lessig, one of the founders of Creative Commons, has said many times that he believes intellectual property law is necessary and that piracy is inexcusable.63 Instead, a CC license states in clear terms exactly what rights the creator reserves, and conversely, what rights are granted to everyone else.64 As Lawrence Lessig explains, “You go to the Creative Commons Website (http://creativecomms.org); you pick the opportunity to select a license: do you want to permit commercial uses or not? Do you want to allow modifications or not? If you allow modifications, do you want to require a kind of copyleft idea that other people release the modifications under a similarly free license? That is the core, and that produces a license.”65 There are currently six CC licenses, and they include some combination of the four license conditions defined by Creative Commons: attribution (by), share alike (sa), noncommercial (nc), and no derivatives (nd).66 Each of the four conditions is designated by a clever symbol, and the six licenses display these symbols after the Creative Commons trademark itself, two small c’s inside a circle.67 There are “hundreds of millions of CC licensed works” that can be searched through Google and Yahoo, and some notable organizations that rely on CC licenses include Flickr, the Public Library of Science, Wikipedia, and now Whitehouse.gov.68 All librarians not already familiar with this approach need to educate themselves on CC licenses and how to find CC licensed works.69 While librarians must still inform their patrons about the realities of copyright law, it is just as important to direct patrons, students, and colleagues to CC licensed materials, so that they can create the mash-ups, videos, and podcasts that are the creative products of our Web 2.0 world.70 The Creative Commons system is not perfect, and “Creative Commons gives the unskilled an opportunity to fail at many junctures.”71 Yet that only speaks to the necessity of educating the library community about the “some rights reserved” movement, so that librarians, who are already called upon to understand traditional copyright law, are also educating our society about how individuals can protect their intellectual property while preserving and strengthening the public domain. CONCLUSION The library community can no longer afford to consider intellectual property law as a foreign topic appropriate for law schools but not library schools. Those who are behind the slow extermination of the public domain rely on the complexity of copyright law, and the misunderstanding of the principle of fair use, to make their arguments easier and to browbeat libraries and the public into handing over the rights the Constitution bestows on everyone. Librarians need to engage in the debate over copyright law to retain control over their collections, and to better serve their patrons. In the past, the library community has not hesitated to stand up for the freedom of speech and self-expression, whether it means taking a stand against banning books from school libraries or fighting to repeal clauses of the USA PATRIOT Act. Today’s library patrons are not just information consumers—they are also information producers. Therefore it is just as critical for librarians to advocate for their creative rights as it is for them to defend their freedom to read.
The Internet has become such a strong incubator of creative expression and innovation that the innovators are looking for a way to shirk the very laws that were designed to protect their interests. In the end, the desire to create and innovate seems to be more innate than those writing our intellectual property laws expected. Perhaps financial gain is less of a motivator than the pleasure of sharing a piece of ourselves and our worldview with the rest of society. Whether that’s the case or not, what is clear is that if we do not roll back legislation like the Sonny Bono Copyright Term Extension Act and the DMCA so as to save the public domain, the pressure to create outside the bounds of the law is going to turn more inventors and artists into anarchists, threatening the interests of reasonable copyright holders. As librarians, we must curate and defend the creative property of the established, while fostering the innovative spirit of the next generation. As information, literature, and other creative works move out of the physical world, and off the shelves, into the digital realm, librarians need to do their part to ensure legislation is aligned with this new reality. If we do not, our profession may suffer first, but it will not be the last casualty of the copyright wars.
REFERENCES
1. Beverly Goldberg, “LG Unlocks Doors for Creators, Consumers with DMCA Exceptions,” American Libraries 41, no. 9 (Summer 2010): 14.
2. Ibid.
3. Goldberg, “LG Unlocks Doors.”
4. Christopher Alan Jennings, Fair Use on the Internet, prepared by the Congressional Research Service (Washington, DC: Library of Congress, 2002), 2.
5. Ibid., 1.
6. Ibid.
7. Brandon Butler, “Urban Copyright Legends,” Research Library Issues 270 (June 2010): 18.
8. Robin Jeweler, “Digital Rights” and Fair Use in Copyright Law, prepared by the Congressional Research Service (Washington, DC: Library of Congress, 2003), 5.
9. Rachel Bridgewater, “Tipping the Scales: How Free Culture Helps Restore Balance in the Age of Copyright Maximalism,” Oregon Library Association Quarterly 16, no. 3 (Fall 2010): 19.
10. Charles W. Bailey Jr., “Strong Copyright + DRM + Weak Net Neutrality = Digital Dystopia?” Information Technology & Libraries 25, no. 3 (Summer 2006): 117; U.S. Copyright Office, “Copyright Law of the United States,” under “Chapter 3: Duration of Copyright,” http://www.copyright.gov/title17 (accessed December 8, 2010).
11. Dan Hunter, “Culture War,” Texas Law Review 83, no. 4 (2005): 1130.
12. Bailey, “Strong Copyright,” 118.
13. U.S. Copyright Office, “Copyright Law of the United States,” under “Chapter 1: Subject Matter and Scope of Copyright,” http://www.copyright.gov/title17 (accessed December 8, 2010).
14. Bailey, “Strong Copyright,” 118.
15. Mary Minnow, “Library Digitization Table,” http://www.librarylaw.com/DigitizationTable.htm (accessed December 8, 2010).
16. Brian T. Yeh, “Orphan Works” in Copyright Law, prepared by the Congressional Research Service (Washington, DC: Library of Congress, 2002), summary.
17. Ibid.
18. JISC, In from the Cold: An Assessment of the Scope of “Orphan Works” and its Impact on the Delivery of Services to the Public (Cambridge, UK: JISC, 2009), 6.
19. Ibid.
20. Andrew Albanese, “HathiTrust Suspends its Orphan Works Release,” Publishers Weekly, September 16, 2011, http://www.publishersweekly.com/pw/by-topic/digital/copyright/article/48722-hathitrust-suspends-its-orphan-works-release-.html (accessed October 13, 2011).
21. U.S. Copyright Office, “Copyright Law of the United States,” under “Chapter 1.”
22. U.S. Copyright Office, Copyright Basics (Washington, DC: U.S. Copyright Office, 2000), www.copyright.gov/circs/circl/html (accessed December 8, 2010).
23. Mary Minnow, California Library Association, “Library Copyright Liability and Pirating Patrons,” http://www.cla-net.org/resources/articles/minow_pirating.php (accessed December 10, 2010).
24. Bailey, “Strong Copyright,” 118.
25. Overdrive, “Copyright,” http://www.overdrive.com/copyright.asp (accessed December 13, 2010).
26. Josh Hadro, “HarperCollins Puts 26 Loan Cap on EBook Circulations,” Library Journal (February 25, 2011), http://www.libraryjournal.com/lj/home/889452-264/harpercollins_puts_26_loan_cap.html.csp (accessed October 13, 2011).
27. Butler, “Urban Copyright Legends,” 18.
28. Bailey, “Strong Copyright,” 118.
29. Library of Congress, Fair Use on the Internet, 3.
30. Ibid., summary.
31. Matthew K. Dames, “Education Use in the Digital Age,” Information Today 27, no. 4 (April 2010): 18.
32. Ibid.
33. Dames, “Education Use in the Digital Age,” 18.
34. Matthew K. Dames, “Making a Case for Copyright Officers,” Information Today 25, no. 7 (July 2010): 16.
35. William C. Dougherty, “The Copyright Quagmire,” Journal of Academic Librarianship 36, no. 4 (July 2010): 351.
36. Ibid.
37. Library of Congress, “Digital Rights” and Fair Use in Copyright Law, 9.
38. Dougherty, “The Copyright Quagmire,” 351.
39. Kevin Smith, “Streaming Video Case Dismissed,” Scholarly Communications @ Duke, October 4, 2011, http://blogs.library.duke.edu/scholcomm/2011/10/04/streaming-video-case-dismissed/ (accessed October 13, 2011).
40. Dougherty, “The Copyright Quagmire,” 351.
41. LibraryLaw.com, “Library Digitization Table.”
42. Kate M. Manuel, The Google Library Project: Is Digitization for Purposes of Online Indexing Fair Use Under Copyright Law, prepared by the Congressional Research Service (Washington, DC: Library of Congress, 2009), 1–2.
43. Jeweler, “Digital Rights” and Fair Use in Copyright Law, 2.
44. Ibid.
45. Ibid.
46. Manuel, The Google Library Project, 2.
47. Amir Efrati and Jeffrey A. Trachtenberg, “Judge Rejects Google Books Settlement,” Wall Street Journal, March 23, 2011, http://online.wsj.com/article/SB10001424052748704461304576216923562033348.html (accessed October 13, 2011).
48. Jennings, Fair Use on the Internet, 7.
49. Manuel, The Google Library Project, 2.
50. Ibid., 9–10.
51. Ibid.
52. Ibid.
53. Pamela Samuelson, “Google Books is Not a Library,” Huffington Post, October 13, 2009, http://www.huffingtonpost.com/pamela-samuelson/google-books-is-not-a-lib_b_317518.html (accessed December 10, 2009).
54. Ivy Anderson, “Hurtling Toward the Finish Line: Should the Google Book Settlement be Approved?” Against the Grain 22, no. 3 (June 2010): 18.
55. Samuelson, “Google Books is Not a Library.”
56. Jeweler, “Digital Rights” and Fair Use in Copyright Law, 3.
57. Bailey, “Strong Copyright,” 116.
58. Cushla Kapitzke, “Rethinking Copyrights for the Library through Creative Commons Licensing,” Library Trends 58, no. 1 (Summer 2009): 106.
59. Ibid.
60. University of Maryland University College, “Member Community,” Center for Intellectual Property, http://cipcommunity.org/s/1039/start.aspx (accessed February 21, 2011).
61. Robin Jeweler, Copyright Law: Digital Rights Management Legislation, prepared by the Congressional Research Service (Washington, DC: Library of Congress, 2004), summary.
62. Creative Commons, “History,” http://creativecommons.org/about/history/ (accessed December 8, 2010).
63. Lawrence Lessig, “The Vision for the Creative Commons? What are We and Where are We Headed? Free Culture,” in Open Content Licensing: Cultivating the Creative Commons, ed. Brian Fitzgerald (Sydney: Sydney University Press, 2007), 42.
64. Steven J. Melamut, “Free Creativity: Understanding the Creative Commons Licenses,” American Association of Law Libraries 14, no. 6 (April 2010): 22.
65. Lessig, “The Vision for the Creative Commons?” 45.
66. Creative Commons, “About,” http://creativecommons.org/about/ (accessed December 8, 2010).
67. Ibid.
68. Ibid.
69. Bridgewater, “Tipping the Scales,” 21.
70. Ibid.
71. Woody Evans, “Commons and Creativity,” Searcher 17, no. 9 (October 2009): 34.
1861 ---- Batch Ingesting into EPrints Digital Repository Software Tomasz Neugebauer and Bin Han INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 113 ABSTRACT This paper describes the batch importing strategy and workflow used for the import of theses metadata and PDF documents into the EPrints digital repository software. A two-step strategy of importing metadata in MARC format followed by attachment of PDF documents is described in detail, including Perl source code for scripts used. The processes described were used in the ingestion of metadata and PDFs for 6,000 theses into an EPrints institutional repository. INTRODUCTION Tutorials have been published about batch ingestion of ProQuest metadata and electronic theses and dissertations (ETDs),1 as well as an EndNote library,2 into the Digital Commons platform. The procedures for bulk importing of ETDs using DSpace have also been reported.3 However, bulk importing into the EPrints digital repository software has not been exhaustively addressed in the literature.4 A recent article by Walsh provides a literature review of batch importing into institutional repositories.5 The only published report on batch importing into the EPrints platform describes Perl scripts for metadata-only records import from Thomson Reuters Reference Manager.6 Bulk importing is often one of the first tasks after launching a repository, so it is unsurprising that requests for reports and documentation on EPrints-specific workflow have been a recurring topic on the EPrints Tech List.7 A recently published review of EPrints identifies “the absence of a bulk uploading feature” as its most significant weakness.8 Although EPrints’ graphical user interface for bulk importing is limited to the use of the installed import plugins, the software does have a versatile infrastructure for this purpose. Leveraging EPrints’ import functionality requires some Perl scripting, structuring the data for import, and using the command line interface. In 2009, when Concordia University launched Spectrum,9 its research repository, the first task was a batch ingest of approximately 6,000 theses dated from 1967 to 2003. The source of the metadata for this import consisted of MARC records from an integrated library system powered by Innovative Interfaces and ProQuest PDF documents. This paper is a report on the strategy and workflow adopted for batch ingestion of this content into the EPrints digital repository software.
Import Strategy EPrints has a documented import command line utility located in the /bin folder.10 Documents can also be imported through EPrints’ graphical interface. Using the command line utility for importing is recommended because it is easier to monitor the operation in real time by adding progress information output to the import plugin code. Tomasz Neugebauer (tomasz.neugebauer@concordia.ca) is Digital Projects and Systems Development Librarian and Bin Han (bin.han@concordia.ca) is Digital Repository Developer, Concordia University Libraries, Montreal, Quebec, Canada. The task of batch importing can be split into the following subtasks: (1) import of the metadata of each item, and (2) import of associated documents, such as full-text PDF files. The strategy adopted was to first import the metadata for all of the new items into the inbox of an editor’s account. After this first step was completed, a script was used to loop through the newly imported eprints and attach the corresponding full-text documents. Although documents can be imported from the local file system or via HTTP, import of the files from the local file system was used. The batch import procedure varies depending on the format of the metadata and documents to be imported. Metadata import requires a mapping of the source schema fields to the default or custom fields in EPrints. The source metadata must also be converted into one of the formats supported by EPrints’ import plugins, or a custom plugin must be created. Import plugins are available for many popular formats, including BibTeX, DOI, EndNote, and PubMedXML. In addition, community-contributed import plugins such as MARC and ArXiv are available at EPrints Files.11 Since most repositories use custom metadata fields, some customization of the import plugins is usually necessary. MARC Plugin for EPrints In EPrints, the import and export plugins ensure interoperability of the repository with other systems. Import plugins read metadata from one schema and load it into the EPrints system through a mapping of the fields into the EPrints schema. Loading MARC-encoded files into EPrints requires the installation of the import/export plugin developed by Romero and Miguel.12 The installation of this plugin requires the following two CPAN modules: MARC::Record and MARC::File::USMARC. The MARC plugin was then subclassed to create an import plugin named “Concordia Theses,” which is customized for thesis MARC records. Concordia Theses MARC Plugin The MARC plugin features a central configuration file (see appendix A) in which each MARC field is paired with a corresponding mapping to an EPrints field. Most of the fields were configured through this configuration file (see table 1). The source MARC records from the Innovative Interfaces Integrated Library System (ILS) encode the physical description of each item using the Anglo-American Cataloguing Rules, as in the following example: “ix, 133 leaves : ill. ; 29 cm.” Since the default EPrints field for number of pages is of the type integer and does not allow multipart physical descriptions from the MARC 300 field, a custom text field for these physical descriptions (pages_aacr) had to be added. The marc.pl configuration file cannot be used to map compound fields, such as author names—the fields need custom mapping implementation in Perl.
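The flat tag/subfield pairs from the configuration file, by contrast, can be applied mechanically using the same MARC::Record and MARC::File::USMARC modules that the plugin already requires. The short sketch below illustrates that general pattern only; it is not code taken from the plugin, and the file name and the reduced mapping hash are assumptions made for the example. Compound fields such as author names are deliberately left out of this sketch, since they require the custom handling described next.

#!/usr/bin/perl
# Sketch: apply flat MARC-to-EPrints mappings while reading a USMARC file.
# Illustrative only; the mapping subset and file name are assumptions.
use strict;
use warnings;
use MARC::File::USMARC;

my %marc2ep = (    # a few of the flat mappings from the configuration file
    '245a' => 'title',
    '260b' => 'publisher',
    '260c' => 'date',
    '520a' => 'abstract',
);

my $file = MARC::File::USMARC->in('Theses-utf8.mrc')
    or die 'cannot open MARC file';
while ( my $record = $file->next() ) {
    my %epdata;
    foreach my $key ( keys %marc2ep ) {
        my ( $tag, $code ) = ( substr( $key, 0, 3 ), substr( $key, 3, 1 ) );
        my $field = $record->field($tag) or next;    # skip mappings whose tag is absent
        my $value = $field->subfield($code);
        $epdata{ $marc2ep{$key} } = $value if defined $value;
    }
    # a real import plugin would build an EPrints eprint object from %epdata
    print( ( $epdata{title} // '(no title)' ), "\n" );
}
$file->close();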
For instance, the MARC 100 and 700 fields are transferred into the EPrints author compound field (in MARC.pm). Similarly, MARC 599 is mapped into a custom thesis advisor compound field.

MARC field    EPrints field
020a          isbn
020z          isbn
022a          issn
245a          title
250a          edition
260a          place_of_pub
260b          publisher
260c          date
300a          pages_aacr
362a          volume
440a          series
440c          volume
440x          issn
520a          abstract
730a          publication

Table 1. Mapping Table from MARC to EPrints

Helge Knüttel’s refinements to the MARC plugin shared on the EPrints Tech List were employed in the implementation of a new subclass of MARC import for the Concordia Theses MARC records. In the implementation of the Concordia Theses plugin, ConcordiaTheses.pm inherits from MARC.pm. (See figure 1.)13 Knüttel added two methods that make it easier to subclass the general MARC plugin and add unique mappings: handle_marc_specialities and post_process_eprint. The post_process_eprint function was not used to attach the full-text documents to each eprint. Instead, the strategy to import the full-text documents using a separate attach_documents script was used (see “Theses Document File Attachment” below). Import of all of the specialized fields, such as thesis type (mapped from MARC 710t), program, department, and proquest id, was implemented in the function handle_marc_specialities of ConcordiaTheses.pm. For instance, 502a in the MARC record contains the department information, whereas an EPrints system like Spectrum stores department hierarchy as subject objects in a tree. Therefore importing the department information based on the value of 502a required regular expression searches of this MARC field to find the mapping into a corresponding subject id. This was implemented in the handle_marc_specialities function.
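As a rough illustration of what this kind of custom handling can look like, the sketch below builds a compound creators field from the 100 and 700 fields and chooses a department subject id by running regular expressions over the 502 note. It is a sketch under assumptions rather than the plugin’s actual handle_marc_specialities code: the EPrints field names, the wording matched in the note, and the subject ids are hypothetical.

# Sketch of compound and repository-specific mappings; not the actual
# ConcordiaTheses.pm code. Field names and subject ids are hypothetical.
use strict;
use warnings;
use MARC::Record;

sub map_theses_specialities {
    my ( $record, $epdata ) = @_;

    # 100/700 personal names become entries of a compound creators field
    foreach my $field ( $record->field('100'), $record->field('700') ) {
        my $name = $field->subfield('a') // '';
        next unless $name;
        my ( $family, $given ) = split /\s*,\s*/, $name, 2;
        push @{ $epdata->{creators} },
            { name => { family => $family, given => $given } };
    }

    # 502a is searched with regular expressions to pick a subject id
    # for the department tree
    my $f502 = $record->field('502');
    my $note = $f502 ? ( $f502->subfield('a') // '' ) : '';
    if ( $note =~ /History/i ) {
        push @{ $epdata->{divisions} }, 'dept_history';    # hypothetical id
    }
    elsif ( $note =~ /Computer Science/i ) {
        push @{ $epdata->{divisions} }, 'dept_compsci';    # hypothetical id
    }
    return $epdata;
}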
Figure 1. Concordia Theses Class Diagram, created with the Perl module UML::Class::Simple
Execution of the Theses Metadata Import The depositing user’s name is displayed along with the metadata for each eprint. A batchimporter user with the corporate name “Concordia University Libraries” was created to carry out the import. As a result, the public display of the imported items shows the following as a part of the metadata: “Deposited By: Concordia University Libraries.” The MARC plugin requires the encoding of the source MARC files to be UTF-8, whereas the records are exported from the ILS with MARC-8 encoding. Therefore MarcEdit software developed by Reese was used to convert the MARC file to UTF-8.14 To activate the import, the main MARC import plugin and its subclass, ConcordiaTheses.pm, have to be placed in the plugin folder /perl_lib/EPrints/Plugin/Import/MARC/. The configuration file (see appendix A) must also be placed with the rest of the configurable files in /archives/REPOSITORYID/cfg/cfg.d. The plugin can then be activated from the command line using the import script in the /bin folder. A detailed description of this script and its usage is documented on the EPrints Wiki. The following EPrints command from the /bin folder was used to launch the import:
import REPOSITORYID --verbose --user batchimporter eprint MARC::ConcordiaTheses Theses-utf8.mrc
Following the aforementioned steps, all the theses metadata was imported into the EPrints software. The new items were imported with their statuses set to inbox. A status set to inbox means that the imported items are in the work area of the batchimporter user and will need to be moved to live public access by switching their status to archive.
Theses Document File Attachment After the process of importing the metadata of each thesis is complete, the corresponding document files need to be attached. The proquest id was used to link the full-text PDF documents to the metadata records. All of the MARC records contained the proquest id, while the PDF files, received from ProQuest, were delivered with the corresponding proquest id as the filename. The PDFs were uploaded to a folder on the repository web server using FTP. The attach_documents script (see appendix B for source code) was then used to attach the documents to each of the imported eprints in the batchimporter’s inbox and to move the imported eprints to the live archive. Several variables need to be set at the beginning of the attach_documents operation (see table 2).
$root_dir = 'bin/import-data/proquest'
    This is the root folder where all the associated documents are uploaded by FTP.
$depositor = 'batchimporter'
    Only the items deposited by a defined depositor, in this case batchimporter, will be moved from inbox to live archive.
$dataset_id = 'inbox'
    Limit the dataset to those eprints with status set to inbox.
$repositoryid = 'library'
    The internal EPrints identifier of the repository.
Table 2. Variables to be Set in the attach_documents Script
The following command is used to proceed with file attachment, while the output log is redirected and saved in the file ATTACHMENT:
/bin/attach_documents.pl > ./ATTACHMENT 2>&1
The thesis metadata record was made live even if it did not contain a corresponding document file. A list of eprint ids of theses without a corresponding full-text PDF document is included at the end of the log file, along with the count of the number of theses that were made live. After the import operation is complete, all the abstract pages need to be regenerated with the following command:
/bin/generate_abstracts REPOSITORYID
CONCLUSIONS This paper is a detailed report on batch importing into the EPrints system. The authors believe that this paper and its accompanying source code are a useful contribution to the literature on batch importing into digital repository systems. In particular, it should be useful to institutions that are adopting the EPrints digital repository software. Batch importing of content is a basic and fundamental function of a repository system, which is why the topic has come up repeatedly on the EPrints Tech List and in a repository software review. The methods that we describe for carrying out batch importing in EPrints make use of the command line and require Perl scripting. More robust administrative graphical user interface support for batch import functions would be a useful feature to develop in the platform.
ACKNOWLEDGEMENTS The authors would like to thank Mia Massicotte for exporting the metadata records from the integrated library system. We would also like to thank Alexandros Nitsiou, Raquel Horlick, Adam Field, and the reviewers at Information Technology and Libraries for their useful comments and suggestions.
REFERENCES
1. Shawn Averkamp and Joanna Lee, “Repurposing ProQuest Metadata for Batch Ingesting ETDs into an Institutional Repository,” code{4}lib journal 7 (2009), http://journal.code4lib.org/articles/1647 (accessed June 27, 2011).
2. Michael Witt and Mark P. Newton, "Preparing Batch Deposits for Digital Commons Repositories," 2008, http://docs.lib.purdue.edu/lib_research/96/ (accessed June 20, 2011).

3. Randall Floyd, "Automated Electronic Thesis and Dissertations Ingest," 2009, https://wiki.dlib.indiana.edu/display/IUSW/Automated+Electronic+Thesis+and+Dissertations+Ingest (accessed May 26, 2011).

4. EPrints Digital Repository Software, University of Southampton, UK, http://www.eprints.org/ (accessed June 27, 2011).

5. Maureen P. Walsh, "Batch Loading Collections into DSpace: Using Perl Scripts for Automation and Quality Control," Information Technology & Libraries 29, no. 3 (2010): 117–27, http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=52871761&site=ehost-live (accessed June 26, 2011).

6. Lesley Drysdale, "Importing Records from Reference Manager into GNU EPrints," 2004, http://hdl.handle.net/1905/175 (accessed June 27, 2011).

7. EPrints Tech List, University of Southampton, UK, http://www.eprints.org/tech.php/ (accessed June 27, 2011).

8. Mike Beazly, "Eprints Institutional Repository Software: A Review," Partnership: the Canadian Journal of Library & Information Practice & Research 5, no. 2 (2010), http://journal.lib.uoguelph.ca/index.php/perj/article/viewArticle/1234 (accessed June 27, 2011).

9. Concordia University Libraries, "Spectrum: Concordia University Research Repository," http://spectrum.library.concordia.ca (accessed June 27, 2011).

10. EPrints Wiki, "API:bin/import," University of Southampton, UK, http://wiki.eprints.org/w/API:bin/import (accessed June 23, 2011).

11. EPrints Files, University of Southampton, UK, http://files.eprints.org/ (accessed June 24, 2011).

12. Parella Romero and Jose Miguel, "MARC Import/Export Plugins for GNU EPrints3," EPrints Files, 2008, http://files.eprints.org/323/ (accessed May 31, 2011).

13. Agent Zhang and Maxim Zenin, "UML::Class::Simple," CPAN, http://search.cpan.org/~agent/UML-Class-Simple-0.18/lib/UML/Class/Simple.pm (accessed September 20, 2011).

14. Terry Reese, "MarcEdit: Downloads," Oregon State University, http://people.oregonstate.edu/~reeset/marcedit/html/downloads.html (accessed June 27, 2011).

Appendix A. marc.pl Configuration File

#
# Plugin EPrints::Plugin::Import::MARC
#
# MARC tofro EPrints Mappings
# Do _not_ add compound mappings here.
$c->{marc}->{marc2ep} = {
    # MARC to EPrints
    '020a' => 'isbn',
    '020z' => 'isbn',
    '022a' => 'issn',
    '245a' => 'title',
    '245b' => 'subtitle',
    '250a' => 'edition',
    '260a' => 'place_of_pub',
    '260b' => 'publisher',
    '260c' => 'date',
    '362a' => 'volume',
    '440a' => 'series',
    '440c' => 'volume',
    '440x' => 'issn',
    '520a' => 'abstract',
    '730a' => 'publication',
};

$c->{marc}->{marc2ep}->{constants} = { };

######################################################################
#
# Plugin-specific settings.
#
# Any non empty hash set for a specific plugin will override the
# general one above!
#
######################################################################
#
# Plugin EPrints::Plugin::Import::MARC::ConcordiaTheses
#
$c->{marc}->{'EPrints::Plugin::Import::MARC::ConcordiaTheses'}->{marc2ep} = {
    '020a' => 'isbn',
    '020z' => 'isbn',
    '022a' => 'issn',
    '250a' => 'edition',
    '260a' => 'place_of_pub',
    '260b' => 'publisher',
    '260c' => 'date',
    '300a' => 'pages_aacr',
    '362a' => 'volume',
    '440a' => 'series',
    '440c' => 'volume',
    '440x' => 'issn',
    '520a' => 'abstract',
    '730a' => 'publication',
};

$c->{marc}->{'EPrints::Plugin::Import::MARC::ConcordiaTheses'}->{constants} = {
    # MARC to EPrints constants
    'type'        => 'thesis',
    'institution' => 'Concordia University',
    'date_type'   => 'submitted',
};

Appendix B. attach_documents.pl

#!/usr/bin/perl -I/opt/eprints3/perl_lib

=head1 DESCRIPTION

This script allows you to attach a file to an eprint object by proquest id.

=head1 COPYRIGHT AND LICENSE

2009 Adam Field, Tomasz Neugebauer
2011 Bin Han

This module is free software under the same terms of Perl.
Compatible with EPrints 3.2.4 (Victoria Sponge).

=cut

use strict;
use warnings;

use EPrints;

my $repositoryid = 'library';
my $root_dir = '/opt/eprints3/bin/import-data/proquest'; # location of PDF files
my $dataset_id = 'inbox';        # change to 'eprint' if you want to run it over everything.
my $depositor = 'batchimporter'; # limit import to $depositor's Inbox

# global variables for log purposes
my $int_live = 0;     # count of eprints moved to live archive with a document
my $int_doc = 0;      # count of eprints that already have document attached
my @array_doc;        # ids of eprints that already have documents
my $int_no_doc = 0;   # count of eprints moved to live with no document attached
my @array_no_doc;     # ids of eprints that have no documents
my $int_no_proid = 0; # count of eprints with no proquest id
my @array_no_proid;   # ids of eprints with no proquest id

my $session = EPrints::Session->new(1, $repositoryid);
die "couldn't create session for $repositoryid\n" unless defined $session;

# the hash contains all the files that need to be uploaded
# the hash contains key-value pairs: (pq_id => filename)
my $filemap = {};
load_filemap($root_dir);

# get all eprints in inbox dataset
my $dataset = $session->get_repository->get_dataset($dataset_id);

# run attach_file on each eprint object
$dataset->map($session, \&attach_file);

# output log for attachment
print "#### $int_doc eprints already have document attached, skip ####\n @array_doc\n";
print "#### $int_no_proid eprints doesn't have proquest id, skip ####\n @array_no_proid\n";
print "#### $int_no_doc eprints doesn't have associated document, moved to live ####\n @array_no_doc\n";

# total number of eprints that were made live: those with and without documents.
my $int_total_live = $int_live + $int_no_doc;
print "#### Intotal: $int_total_live eprints moved to live ####\n";

# attach file to corresponding eprint object
sub attach_file
{
    my ($session, $ds, $eprint) = @_;

    # skip if eprint already has a document attached
    my $full_text_status = $eprint->get_value( "full_text_status" );
    if ($full_text_status ne "none")
    {
        print "EPrint ".$eprint->get_id." already has a document, skipping\n";
        $int_doc ++;
        push ( @array_doc, $eprint->get_id );
        return;
    }

    # retrieve username/userid associated with current eprint
    my $user = new EPrints::DataObj::User( $eprint->{ session }, $eprint->get_value( "userid" ) );
    my $username;

    # exit in case of failure to retrieve associated user, just in case.
    return unless defined $user;
    $username = $user->get_value( "username" );

    # $dataset includes all eprints in Inbox, so we limit to $depositor's items only
    return if( $username ne $depositor );

    # skip if no proquest id is associated with the current eprint
    my $pq_id = $eprint->get_value('pq_id');
    if (not defined $pq_id)
    {
        print "EPrint ".$eprint->get_id." doesn't have a proquest id, skipping\n";
        $int_no_proid ++;
        push ( @array_no_proid, $eprint->get_id );
        return;
    }

    # remove space from proquest id
    $pq_id =~ s/\s//g;

    # attach the PDF to eprint objects and move to live archive
    if ($filemap->{$pq_id} and -e $filemap->{$pq_id} ) # if the file exists
    {
        # create document object, add pdf files to document, attach to eprint object,
        # and move to live archive
        my $doc = EPrints::DataObj::Document::create( $session, $eprint );
        $doc->add_file( $filemap->{$pq_id}, $pq_id . '.pdf' );
        $doc->set_value( "format", "application/pdf" );
        $doc->commit();
        print "Adding Document to EPrint ", $eprint->get_id, "\n";

        $eprint->move_to_archive;
        print "Eprint ".$eprint->get_id." moved to archive.\n";
        $int_live ++;
    }
    else
    {
        # move the metadata-only eprints to live as well
        print "Proquest ID \\$pq_id\\ (EPrint ", $eprint->get_id, ") does not have a file associated with it\n";
        $eprint->move_to_archive;
        print "Eprint ".$eprint->get_id." moved to archive without document attached.\n";
        $int_no_doc ++;
        push ( @array_no_doc, $eprint->get_id );
    }
}

# Recursively traverse the directory, find all PDF files.
sub load_filemap
{
    my ($directory) = @_;

    foreach my $filename (<$directory/*>)
    {
        if (-d $filename)
        {
            load_filemap($filename);
        }
        # catch the file name ending in .pdf
        elsif ($filename =~ m/([^\/]*)\.pdf$/i)
        {
            my $pq_id = $1;
            # add pq_id => filename pair to filemap hash table
            $filemap->{$pq_id} = $filename;
        }
    }
}

1918 ---- Guest Editorial

Clifford Lynch

Congratulations, LITA and Information Technology and Libraries. Since the early days of the Internet, I've been continually struck by the incredible opportunities that it offers organizations concerned with the creation, organization, and dissemination of knowledge to advance their core missions in new and more effective ways. Libraries and librarians were consistently early and aggressive in recognizing, seizing, and advocating for these opportunities, though they've faced—and continue to face—enormous obstacles ranging from copyright laws to the amazing inertia of academic traditions in scholarly communication.
Yet the library profession has been slow to open up access to the publications of its own professional societies, to take advantage of the greater reach and impact that such policies can offer. Making these changes is not easy: there are real financial implications that suddenly seem very serious when you are a member of a board of directors, charged with a fiduciary duty to your association, and you have to push through plans to realign its finances, organizational mission, and goals in the new world of networked information. So, as a long-time LITA member, I find it a great pleasure to see LITA finally reach this milestone with Information Technology and Libraries (ITAL) moving to fully open-access electronic distribution, and I congratulate the LITA leadership for the persistence and courage to make this happen. It’s a decision that will, I believe, make the journal much more visible, and a more attractive venue for authors; it will also make it easier to use in educational settings, and to further the interactions between librarians, information scientists, computer scientists, and members of other disciplines. On a broader ALA-wide level, ITAL now joins ACRL’s College & Research Libraries as part of the American Library Association’s portfolio of open-access journals. Supporting ITAL as an open-access journal is a very good reason indeed to be a member of LITA. Clifford Lynch (clifford@cni.org) is Executive Director, Coalition for Networked Information. mailto:clifford@cni.org 1927 ---- President’s Message: Open Access/Open Data Colleen Cuddy INFORMATION TECHNOLOGIES AND LIBRARIES | MARCH 2012 1 I am very excited to write this column. This issue of Information Technology and Libraries (ITAL) marks the beginning of a new era for the journal. ITAL is now an open-access, electronic-only journal. There are many people to thank for this transition. The LITA Publications Committee led by Kristen Antelman did a thorough analysis of publishing options and presented a thoughtful proposal to the LITA Board; the LITA Board had the foresight to push for an open-access journal even if it might mean a temporary revenue loss for the division; Bob Gerrity, ITAL editor, has enthusiastically supported this transition and did the heavy lifting to make it happen; and the LITA office staff worked tirelessly for the past year to help shepherd this project. I am proud to be leading the organization during this time. To see ITAL go open access in my presidential year is extremely gratifying. As Cliff Lynch notes in his editorial, “the library profession has been slow to open up access to the publications of its own professional societies, to take advantage of the greater reach and impact that such policies can offer.” As librarians challenge publishers to pursue open-access venues, myself included, I am relieved to no longer be a hypocrite. By supporting open access we are sending a strong message to the community that we believe in the benefits of open access and we encourage other library organizations to do the same. ITAL will now reach a much broader and larger audience. This will benefit our authors, the organization, and the scholarship of our profession. I understand that while our members embrace open access, not everyone is pleased with an online-only journal. The number of new journals being offered electronically only is growing and I believe we are beginning to see a decline in the dual publishing model of publishers and societies offering both print and online journals. 
My library has been cutting back consistently on print copies of journals and this year will get only a handful of journals in print. Personally, I have embraced the electronic publishing world. In fact, I held off on subscribing to The New Yorker until it had an iPad subscription model! I estimate that I read 95 percent of my books and all of my professional journals electronically. The revolution has happened for me and for many others. I know that our membership will adapt and transition their ITAL reading habits to our new electronic edition, and I look forward to seeing this column and the entire journal in its new format.

Colleen Cuddy (colleen.cuddy@med.cornell.edu) is LITA President 2011-12 and Director of the Samuel J. Wood Library and C. V. Starr Biomedical Information Center at Weill Cornell Medical College, New York, New York.

Earlier this week, the Research Works Act died. Librarians and researchers across the country celebrated this victory as we preserved an important open-access mandate requiring the deposition of research articles funded by the National Institutes of Health into PubMed Central. This act threatened not just research but the availability of health information to patients and their families. As librarians, we still need to be vigilant about preserving open access and supporting open-access initiatives.

I would like to draw your attention to the Federal Research Public Access Act (FRPAA, HR 4004). This act was recently introduced in the House, with a companion bill in the Senate. As described by the Association of Research Libraries, FRPAA would ensure free, timely, online access to the published results of research funded by eleven U.S. federal agencies. The bill gives individual agencies flexibility in choosing the location of the digital repository to house this content, as long as the repositories meet conditions for interoperability and public accessibility, and have provisions for long-term archiving. The legislation would extend and expand access to federally funded research resources and, importantly, spur and accelerate scientific discovery. Notably, this bill does not take anything away from publishers. No publisher will be forced to publish research under the bill's provisions; any publisher can simply decline to publish the material if it feels the terms are too onerous. I encourage the library community to contact their representatives to support this bill.

Open access and open data are the keystones of e-science and its goals of accelerating scientific discovery. I hope that many of you will join me at the LITA President's Program on June 24, 2012, in Anaheim. Tony Hey, Corporate Vice President of Microsoft Research Connections and former director of the U.K.'s e-Science Initiative, and Clifford Lynch, Executive Director of the Coalition for Networked Information, will discuss data-intensive scientific discovery and its implications for libraries, drawing from the seminal work The Fourth Paradigm. Librarians are beginning to explore our role in this new paradigm of providing access to and helping to manage data in addition to bibliographic resources. It is a timely topic and one in which librarians, due to our skill set, are poised to take a leadership role. Reading The Fourth Paradigm was a real game changer for me. It is still extremely relevant. You might consider reading a chapter or two prior to the program.
It is an open-access e-book available for download from Microsoft Research (http://research.microsoft.com/en-us/collaboration/fourthparadigm/). I keep a copy on my iPad, right there with downloaded ITAL article PDFs. http://www.arl.org/pp/access/frpaa-2012.shtml http://research.microsoft.com/en-us/collaboration/fourthparadigm/ 1916 ---- Investigations into Library Web-Scale Discovery Services Jason Vaughan INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 32 ABSTRACT Web-scale discovery services for libraries provide deep discovery to a library’s local and licensed content and represent an evolution—perhaps a revolution—for end-user information discovery as pertains to library collections. This article frames the topic of web-scale discovery and begins by illuminating web-scale discovery from an academic library’s perspective—that is, the internal perspective seeking widespread staff participation in the discovery conversation. This included the creation of the Discovery Task Force, a group that educated library staff, conducted internal staff surveys, and gathered observations from early adopters. The article next addresses the substantial research conducted with library vendors that have developed these services. Such work included drafting of multiple comprehensive question lists distributed to the vendors, onsite vendor visits, and continual tracking of service enhancements. Together, feedback gained from library staff, insights arrived at by the Discovery Task Force, and information gathered from vendors collectively informed the recommendation of a service for the UNLV Libraries. INTRODUCTION Web-scale discovery services, combining vast repositories of content with accessible, intuitive interfaces, hold the potential to greatly facilitate the research process. While the technologies underlying such services are not new, commercial vendors releasing such services, and their work and agreements with publishers and aggregators to preindex content, is very new. This article in particular frames the topic of web-scale discovery and helps illuminate some of the concerns and commendations related to web-scale discovery from one library’s staff perspective—that is, the internal perspective. The second part focuses on detailed dialog with the commercial vendors, enabling the library to gain a better understanding of these services. In this sense, the second half is focused externally. Given that web-scale discovery is new for the library environment, the author was unable to find any substantive published work detailing identification, research, evaluation, and recommendation related to library web-scale discovery services. It’s hoped that this article will serve as the ideal primer for other libraries exploring or contemplating exploration of these groundbreaking services. Web-scale discovery services are able to index a variety of content, whether hosted locally or remotely. Such content can include library ILS records, digital collections, institutional repository content, and content from locally developed and hosted databases. Such capabilities existed, to varying degrees, in next-generation library catalogs that debuted in the mid 2000s. In addition, web-scale discovery services pre–index remotely hosted content, whether purchased or licensed by the library. 
This latter set of content—hundreds of millions of items—can include items such as e-books, publisher or aggregator content for tens of thousands of full-text journals, content from abstracting and indexing databases, and materials housed in open-access repositories. For purposes of this article, web-scale discovery services are flexible services which Jason Vaughan (jason.vaughan@unlv.edu) is Director, Library Technologies, University of Nevada, Las Vegas. INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 33 provide quick and seamless discovery, delivery, and relevancy-ranking capabilities across a huge repository of content. Commercial web-scale discovery vendors have brokered agreements with content providers (publishers and aggregators), allowing them to pre–index item metadata and full-text content (unlike the traditional federated search model). This approach lends itself to extremely rapid search and return of results ranked by relevancy, which can then be sorted in various ways according to the researcher’s whim (publication date, item type, full text only, etc.). By default, an intuitive, simple, Google-like search box is provided (along with advanced search capabilities for those wishing this approach). The interface includes design cues expected by today’s researchers (such as faceted browsing) and, for libraries wishing to extend and customize the service, embraces an open architecture in comparison to traditional ILS systems. Why Web-scale Discovery? As illustrated by research dating back primarily to the 1990s, library discovery systems within the networked online environment have evolved, yet continue to struggle to serve users. As a result, the library (or systems supported and maintained by the library) is often not the first stop for research—or worse, not a stop at all. Users accustomed to a quick, easy, “must have it now” environment have defected, and research continues to illustrate this fact. Rather than weave these research findings into a paragraph or page, below are some illustrative quotes to convey this challenge. The quotations below were chosen because they succinctly capture findings from research involving dozens, hundreds, and in some cases thousands of participants or respondents: People do not just use information that is easy to find; they even use information that they know to be of poor quality and less reliable—so long as it requires little effort to find—rather than using information they know to be of high quality and reliable, though harder to find.1 * * * Today, there are numerous alternative avenues for discovery, and libraries are challenged to determine what role they should appropriately play. Basic scholarly information use practices have shifted rapidly in recent years, and as a result the academic library is increasingly being disintermediated from the discovery process, risking irrelevance in one of its core functional areas [that of the library serving as a starting point or gateway for locating research information] . . . we have seen faculty members steadily shifting towards reliance on network- level electronic resources, and a corresponding decline in interest in using locally provided tools for discovery.2 * * * A seamless, easy flow from discovery through delivery is critical to end users. 
This point may seem obvious, but it is important to remember that for many end users, without the delivery of something he or she wants or needs, discovery alone is a waste of time.3 * * * End users’ expectations of data quality arise largely from their experiences of how information is organized on popular Web sites. . . 4 * * * [User] expectations are increasingly driven by their experiences with search engines like Google and online bookstores like Amazon. When end users conduct a search in a library INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 34 catalog, they expect their searches to find materials on exactly what they are looking for; they want relevant results.5 * * * Users don’t understand the difference in scope between the catalog and A&I services (or the catalog, databases, digitized collections, and free scholarly content).6 * * * It is our responsibility to assist our users in finding what they need without demanding that they acquire specialized knowledge or select among an array of “silo” systems whose distinctions seem arbitrary . . . the continuing proliferation of formats, tools, services, and technologies has upended how we arrange, retrieve, and present our holdings. Our users expect simplicity and immediate reward and Amazon, Google, and iTunes are the standards against which we are judged. Our current systems pale beside them.7 * * * Q: If you could provide one piece of advice to your library, what would it be? A: Just remember that students are less informed about the resources of the library than ever before because they are competing heavily with the Internet.8 Additional factors sell the idea of web-scale discovery. Obviously, something must be discoverable for it to be used (and of value) to a researcher; ideally, content should be easily discoverable. Since these new services index content that previously was housed in dozens or hundreds of individual silos, they can greatly facilitate the search process for many research purposes. Libraries often spend large sums of money to license and purchase content, sums that often increase annually. Any tool that holds the potential to significantly increase the discovery and use of such content should cause libraries to take notice. At time of writing, early research is beginning to indicate that these tools can increase discovery. Doug Way compared link-resolver-database and full-text statistics prior to and after Grand Valley State University’s implementation of the Summon web- scale discovery service.9 His research suggests that the service was both broadly adopted by the University’s community and that it has led to an increase in their library’s electronic resource discovery and use. Willamette University implemented WorldCat Local, and Bill Kelm presented results that showed an increase in both ILL requests as well as use of the library’s electronic resources.10 From another angle, information-literacy efforts focus on connecting users to “legitimate” content and providing researchers the skills to identify content quality and legitimacy. Given that these web-scale discovery services include or even primarily focus on indexing a large amount of scholarly research, such services can serve as another tool in the library’s arsenal. Results retrieved from these services—largely content licensed or purchased by libraries—is accurate, relevant, and vetted, compared to the questionable or opinionated content that may often be returned through a web search engine query. 
Several of the services currently allow a user to refine results to just categorized as peer-reviewed or scholarly. The Internal Academic Library Perspective: Genesis of the UNLV Libraries Discovery Task Force The following sections of this article begin with a focus on the internal UNLV Library perspective—from early discussions focused on the broad topic of discovery to establishing a task INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 35 force charged to identify, research, evaluate, and recommend a potential service for purchase. Throughout this process, and as detailed below, communication with and feedback from the variety of library staff was essential in ensuring success. Given the increasing vitality of content in electronic format, and the fact that such content was increasingly spread across multiple access points or discovery systems, in late 2008 the University of Nevada Las Vegas (UNLV) Libraries began an effort to engage library staff in information discovery and how such discovery would ideally occur in the future. Related to the exponential growth of content in electronic format, traditional technical-services functions of cataloging and acquisitions were changing or would soon change, not just at UNLV, but throughout the academic library community. Coinciding with this, the Libraries were working on drafting their 2009–11 strategic plan and wanted to have a section highlighting the importance of information discovery and delivery with action items focused on improving this critical responsibility of libraries. In spring 2009, library staff were given the opportunity to share with colleagues a product or idea, related to some aspect of discovery, which they felt was worthy of further consideration. This event, open to UNLV Libraries staff and other Nevada colleagues, was titled the Discovery Mini-Summit, and more than a dozen participants shared their ideas, most in a poster-session format. One of the posters focused on Serial Solutions Summon, an early entrant into the vendor web-scale discovery service landscape. At the time, it was a few months from public release. Other posters included topics such as the Flickr Commons (cultural heritage and academic institutions exposing their digital collections through this popular platform), and a working prototype of a homegrown, open-source federated search approach searching across various subscribed databases. In August 2009, the dean of the UNLV University Libraries charged a ten-person task force to investigate and evaluate web-scale discovery services with the ultimate goal of providing a final recommendation for potential purchase. Representation on the task force included three directors and a broad cross section of staff from across the functional areas of the library, including back-of-the-house and public-service operations. The director of Library Technologies, and author of this article, was tasked with drafting a charge and chairing the committee; once charged, the Discovery Task Force worked over the next fifteen months to research, evaluate, and ultimately provide a recommendation regarding a web-scale discovery service. To help illustrate some of the events described, a graphical timeline of activities is presented as appendix A; the original charge appears as appendix B. In retrospect, the initial target date of early 2010 to make a recommendation was naive, as three of the five products ultimately identified and evaluated by the task force weren’t publicly released until 2010. 
Several boundaries were provided within the charge, including the fact that the task force was not investigating and evaluating traditional federated search products. The Libraries had had a very poor experience with federated search a few years earlier, and the shortcomings of the traditional federated search approach—regardless of vendor—are well known. The remainder of this article discusses the various steps taken by the Discovery Task Force in evaluating and researching web-scale discovery services. While many libraries have begun to implement the web- scale discovery services evaluated by this group, many more are currently at the learning and evaluation stage, or have not yet begun. Many libraries that have already implemented a commercial service likely went through an evaluation process, but perhaps not at the scale conducted by the UNLV Libraries, if for no other reason than the majority of commercial services are extremely new. Even in early 2010, there was less competition, fewer services to evaluate, INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 36 fewer vendors to contact, and fewer early adopters from whom to seek references. Fortunately, the initial target date of early 2010 for a recommendation was a soft target, and the Discovery Task Force was given ample time to evaluate the products. Based on presentations given by the author in 2010, it can’t be presumed that an understanding of web-scale discovery—or the awareness of the commercial services now available—is necessarily widespread. In that sense, it’s the author’s hope and intent that information contained in this article can serve as a primer, or a recipe, for those libraries wishing to learn more about web-scale discovery and perhaps begin an evaluation process of their own. While research exists on federated search technologies within the library environment, the author was unable to find any peer-reviewed published research on the evaluation model and investigations for vendor produced web-scale discovery services as described in this paper. However, some reports are available on the open web, providing some insights into web-scale discovery evaluations led by other libraries, such as two reports provided by Oregon State University. The first, dated March 2009, describes a task force whose activities included “scrutinize WCL [WorldCat Local], investigate other vendors’ products, specifically Serials Solutions’ Summon, the recently announced federated index discovery system; EBSCO’s Integrated Search; and Innovative Interfaces’ Encore product, so that a more detailed comparison can be done,” and “by March 2010, communicate . . . whether WCL or another discovery service is the optimal purchase for OSU Libraries.”11 Note that in 2009, Encore existed as a next-generation discovery layer, and it had an optional add on called “Encore Harvester,” which allows for the harvesting of digital local collections. The report cites the University of Michigan’s evaluation of WCL, and adds their additional observations. The March 2009 report provides a features comparison matrix for WorldCat Local, Encore, Summon, and LibraryFind (an open-source search tool developed at OSU that provides federated searching for selected resources). Feature sets include the areas of search and retrieval, content, and added features (e.g., book covers, user tagging, etc.). The report also describes some usability testing involving WCL and integration with other local library services. 
A second set of investigations followed “in order to provide the task force with an opportunity to more thoroughly investigate other products” and is described in a second report provided at the end of 2009.12 At the time of both phases of this evaluation (and drafted reports) three of the web-scale discovery products had yet to enter public release. The December 2009 report focused on the two released products, Serials Solutions Summon and WorldCat Local, and includes a feature matrix like the earlier report, with the added feature set of “other,” which included the features of “clarity of display,” “icons/images,” and “speed.” The latter report briefly describes how they obtained subject librarian feedback and the pros and cons observed by the librarians in looking at Summon. It also mentions obtaining feedback from two early adopters of the Summon product, as well as obtaining feedback from librarians whose library had implemented WorldCat Local. Apart from the Oregon reports, some other reports on evaluations (or selection) of a particular service, or a set of particular services, are available, such as the University of Michigan’s Article Discovery Working Group, which submitted a final report in January 2010.13 Activity: Understanding Web-scale The first activity of the Discovery Task Force was to educate the members, and later, other library colleagues, on web-scale discovery. Terms such as “federated search,” “metasearch,” “next INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 37 generation catalogs,” and “discovery layers” had all come before, and “web-scale” was a rather new concept that wasn’t widely understood. The Discovery Mini Summit served as a springboard that perhaps more by chance than design introduced to UNLV Library staff what would later become more commonly known as web-scale discovery, though even we weren’t familiar with the term back in Spring 2009. In Fall 2009, the Discovery Task Force identified reports from entities such as OCLC, Ithaka, and reports prepared for the Library of Congress highlighting changing user behavior and expectations; these reports helped form a solid foundation for understanding the “whys” related to web-scale discovery. Additional registration and participation in sponsored web-scale discovery webcasts and meeting with vendors at library conferences helped further the understanding of web-scale discovery. After the Discovery Task Force had a firm understanding of web-scale discovery, the group hosted a forum for all library staff to help explain the concept of web-scale discovery and the role of the Discovery Task Force. Specifically, this first forum outlined some key components of a web-scale discovery service, discussed research the task force had completed to date, and outlined some future research and evaluation steps. A summary of these steps appears in the timeline in appendix A. Time was allowed for questions and answers, and then the task force broadcast several minutes of a (then recent) webcast talking about web-scale discovery. As part of its education role, the Discovery Task Force set up an internal wiki-based webpage in August 2009 upon formation of the group, regularly added content, and notified staff when new content was added. A goal of the task force was to keep the evaluative process transparent, and over time the wiki became quite substantial. Links to “live” services were provided on the wiki. 
Given that some services had yet to be released, some links were to demo sites or sites of the closest approximation available, i.e., some services yet to be released were built on an existing discovery layer already in general release, and thus the look, feel, and functionality of such services was basically available for staff review. The wiki also provided links to published research and webcasts on Web-scale discovery. Such content grew over time as additional web- scale discovery products entered general release. In addition to materials on particular services, links were provided to important background documents and reports on topics related to the user discovery experience and user expectations for search, discovery, and delivery. Discovery Task Force meeting notes and staff survey results were posted to the wiki, as were evaluative materials such as information on the content-overlap analysis conducted for each service. Announcements to relevant vendor programs at the American Library Association’s Annual Conference were also posted to the wiki. Activity: Initial Staff Survey As noted above, when the task force began its work, only two products (out of five ultimately evaluated) were in general release. As more products entered public release, a next step was to invite vendors onsite to show their publicly released product, or a working, developed prototype nearing initial public release. To capture a sense of the library staff ahead of these vendor visits, the Discovery Task Force conducted the first of two staff surveys. The 21-question survey consisted of a mix of “rank on a scale” questions, multiple-choice questions, and free-text response questions. Both the initial and subsequent surveys were administered through the online SurveyMonkey tool. Respondents were allowed to skip any question they wished. The survey was broken into three broad topical areas: “local library customization capabilities,” “end user aspect: INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 38 features and functionality,” and “content.” The survey had an average response rate of 47 staff, or 47% of the library’s 100-strong workforce. The survey questions appear in appendix C. In hindsight, some of the questions could have benefitted from more careful construction. That said, there was a conscious juxtaposition of differing concepts within the same question—the task force did not want to receive a set of responses in which all library staff felt it was important for a service to do everything—in short, to be all things to all people. Forcing staff to rate varied concepts within a question could provide insights into what they felt was really important. A brief summary of some key questions for each section follows. As an introduction, one question in the survey asked staff to rate the relative importance of each overarching aspect related to a discovery service (customization, end user interface, and content). Staff felt content was the most critical aspect of a discovery service, followed by the end-user interface, followed by the ability to heavily customize the service. A snapshot of some of the capabilities library staff thought were important (or not) is provided in table 1. 
Web-scale Capabilities                                          SA      A       N       D       SD
Physical item status information                                81.6%   18.4%   -       -       -
Publication date sort capability                                75.5%   24.5%   -       -       -
Display library-specified links in the interface                69.4%   30.6%   -       -       -
One-click retrieval of full-text items                          61.2%   36.7%   -       -       2%
Ability to place ILL / consortial catalog requests              59.2%   36.7%   4.1%    -       -
Display the library's logo                                      59.2%   36.7%   4.1%    -       -
To be embedded within various library website pages             58%     42%     -       -       -
Full-text items first sort capability                           58.3%   31.3%   8.3%    2.1%    -
Shopping cart for batch printing, emailing, saving              55.1%   44.9%   -       -       -
Faceted searching                                               48.9%   42.6%   8.5%    -       -
Media type sort capability                                      47.9%   43.8%   4.2%    4.2%    -
Author name sort capability                                     41.7%   37.5%   18.8%   2.1%    -
Have a search algorithm that can be tweaked by library staff    38%     36%     20%     4%      2%
User account for saved searches and marked items                36.7%   44.9%   14.3%   4.1%    -
Book cover images                                               25%     39.6%   20.8%   10.4%   4.2%
Have a customizable color scheme                                24%     58%     16%     2%      -
Google Books preview button for book items                      18.4%   53.1%   24.5%   4.1%    -
Tag cloud                                                       12.5%   52.1%   31.3%   4.2%    -
User authored ratings                                           6.4%    27.7%   44.7%   12.8%   8.5%
User authored reviews                                           6.3%    20.8%   50%     12.5%   10.4%
User authored tags                                              4.2%    33.3%   39.6%   10.4%   12.5%

SA = Strongly Agree; A = Agree; N = Neither Agree nor Disagree; D = Disagree; SD = Strongly Disagree

Table 1. Web-scale Discovery Service Capabilities

None of the results was surprising, other than perhaps the low interest in, or indifference to, several Web 2.0 community features, such as the ability for users to provide ratings, reviews, or tags for items, and even a tag cloud. The UNLV Libraries already had a next-generation catalog offering these features, and they have not been heavily used. Even if these features had seen appreciable adoption by end users in the next-generation catalog, they are perhaps less applicable to a web-scale discovery service: users are probably less inclined to post reviews and ratings for an article than for a monograph, and article-level content vastly outnumbers book-level content in web-scale discovery services.

The final survey section focused on content. One question asked about the incorporation of ten different information types (sources) and asked staff to rank how important it was that a service include such content. Results are provided in table 2. A bit surprisingly, inclusion of catalog records was seen as most important. Not surprisingly, full-text and A&I content from subscription resources were ranked very highly. It should also be noted that at the time of the survey, the institutional repository was in its infancy with only a few sample records, and awareness of this resource was low among library staff. Another question listed a dozen existing publishers (e.g., Springer, Elsevier, etc.) deemed important to the libraries and asked staff to rank, on a four-point scale from "essential" to "not important," the importance that a discovery service index items from these publishers. Results showed that all publishers were ranked as either essential or important. Related to content, 83.8 percent of staff felt that it was preferable for a service to de-dupe records such that an item appears only once in the returned list of results; 14.6 percent preferred that the service not de-dupe results.
Information Source                                                                              Rating Average (lower = more important)
ILS catalog records                                                                             1.69
Majority of full-text articles / other research contained in vendor-licensed online resources  2.54
Majority of citation records for non-full-text vendor-licensed A&I databases                   4.95
Consortial catalog records                                                                      5.03
Electronic reserves records                                                                     5.44
Records within locally created and hosted databases                                             5.64
Digital collection records                                                                      5.77
WorldCat records                                                                                6.21
ILS authority control records                                                                   6.5
Institutional repository records                                                                6.68

Table 2. Importance of Content Indexed in Discovery Service

After the first staff survey was concluded, the Discovery Task Force hosted another library forum to introduce and "test drive" the five vendor services in front of library staff. This session was scheduled just a few weeks ahead of the onsite vendor visits to help serve as a primer to engage library staff and get them actively thinking about questions to ask the vendors. The task force distributed notecards at the forum and asked attendees to record any specific questions they had about a particular service. After the forum, 28 product-specific questions were collected; they helped inform future research on those questions for which the task force did not yet have an answer. Questions ran the gamut and collectively touched on all three areas of evaluation.

Activity: Second Staff Survey

Within a month after the five vendor onsite visits, a content analysis of the overlap between UNLV-licensed content and content indexed by the discovery services was conducted. After these steps, a second staff survey was administered. This second staff survey had questions focused on the same three functional areas as the first staff survey: local library customization features, end-user features and functionality, and content. Since the vendor visits had taken place and respondents could now understand the questions in the context of the products, questions were asked from the perspective of each product, e.g., "Please rate on a five point Likert scale whether each discovery service appears to adequately cover a majority of the critical publisher titles (WorldCat Local, Summon, EDS, Encore Synergy, Primo Central)." In addition, there were free-text questions focused on each individual product, allowing colleagues to share additional, detailed thoughts. The second survey totaled 25 questions and had an average response rate of 18 respondents, or about 18 percent of library staff. Several staff conducted a series of sample searches in each of the services and provided feedback on their findings. Though this was a small response rate, two of the five products rose to the top, a third was a strong contender, and two were seen as less desirable. The lower response rate is perhaps indicative of several things. First, not all staff had attended the onsite vendor demonstrations or had taken the time to test drive the services via the links provided on the Discovery Task Force wiki site. Second, some questions were more appropriately answered by a subset of staff. For example, the content questions might best be matched to those with reference, collection development, or curriculum and program liaison duties. Finally, intricate details emerged once a thorough analysis of the vendor services commenced.
The first survey was focused more on the philosophy of what was desirable; the second survey took this a step further and asked how well each product matched such wishes. Discovery services are changing rapidly with respect to interface updates, customization options, and scope of content. As such, and also reflective of the lower response rate, the author is not providing response information nor analysis for this second survey within this article. However, results may be provided upon specific request to the author. The questions themselves for the second staff survey are significant, and they could help serve as a model for other libraries evaluating existing services on the market. As such, questions appear in appendix D. Activity: Early Adopter References One of the latter steps in the evaluation process from the internal academic library perspective was to obtain early adopter references from other academic library customers. A preliminary shortlist was compiled through a straw vote of the Discovery Task Force—and the results of the vote showed a consensus. This vote narrowed down the Discovery Task Force’s list of services still in contention for a potential purchase. This shortlist was based on the growing mass of research conducted by the Discovery Task Force and informed by the staff surveys and feedback to date. Three live customers were identified for each service that had made the shortlist, and the task INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 41 force successfully obtained two references for each service. Reference requests were intensive and involved a set of two dozen questions that references either responded to in writing or answered during scheduled conference calls. To help libraries conducting or interested in conducting their own evaluation and analysis of these services, this list of questions appears in appendix E. The services are so new that the live references weren’t able to comprehensively answer all the questions—they simply hadn’t had sufficient time to fully assess the service they’d chosen to implement. Still, some important insights were gained about the specific products and, at the larger level, discovery services as a whole. As noted earlier, discovery services are changing rapidly in the sense of interface updates, customization options, and scope of content. As such, the author is not providing product specific response information or analysis of responses for each specific product—such investigations and interpretations are the job of each individual library seriously wishing to evaluate the services to help decide which product seems most appropriate for its particular environment. Several broad insights merit notice, and they are shared below. Regarding a question on implementation (though some challenges were mentioned with a few responders), nothing reached the threshold of serious concern. All respondents indicated the new discovery service is already the default or primary search box on their website. One section of the early adopter questions focused on content. The questions in this area seemed a bit challenging for the respondents to provide lots of detail. 
In terms of “adequately covering a majority of the important library titles,” respondents varied from “too early to tell,” “it covers many areas but there are some big names missing,” to two of the respondents answering simply, “yes.” Several respondents also clearly indicated that the web-scale discovery service is not the “beginning and ending” for discovery, a fact that even some of the discovery vendors openly note. For example, one respondent indicated that web-scale discovery doesn’t replace remote federated searching. A majority (not all) of the discovery vendors also have a federated search product that can, to varying degrees, be integrated with their preharvested, centralized, index-based discovery service. This allows additional content to be searched because such databases may include content not indexed within the web-scale discovery service. However, many are familiar with the limitations of federated search technologies: slow speed, poor relevancy ranking of results, and the need to configure and maintain sources and targets. Such problems remain with federated search products integrated with web-scale discovery services. Another respondent indicated they were targeting their discovery service at undergraduate research needs. Another responded, “As a general rule, I would say the discovery service does an excellent job covering all disciplines. If you start really in-depth research in a specific discipline, it starts to break down. General searches are great . . . dive deeper into any discipline and it falls apart. For example, for a computer science person, at some point they will want to go to ACM or IEEE directly for deep searches.” Related to this, “the catalog is still important, if you want to do a very specific search for a book record, the catalog is better. The discovery service does not replace the catalog.” In terms of satisfaction with content type (newspapers, articles, proceedings, etc.), respondents seemed generally happy with the content mix. A range of responses were received, such as “doesn’t appear to be a leaning one way or another, it’s a mix. Some of these things depend on how you set the system up, as there is quite a bit of flexibility; the library has to make a decision on what they want searched.” Another example was that “the vendor has been working very hard to balance content types and I’ve seen a lot of improvement,” “no imbalance, results seem pretty well rounded.” Another responded, “A common complaint is that newspapers and book reviews dominate the search results, but that is much more a function of search algorithms then the amount of content in the index.” INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 42 When asked about positive or critical faculty feedback to the service, several respondents indicated they hadn’t had a lot of feedback yet. One indicated they had anecdotal feedback. Another indicated they’d received backlash from some users who were used to other search services (but also added that it was no greater than backlash from any other service they’d implemented in the past—and so the backlash wasn’t a surprise). One indicated “not a lot of feedback from faculty, the tendency is to go to databases directly, librarians need to instruct them in the discovery service.” For student feedback, one indicated, “We have received a few positive comments and see increased usage.” Another indicated, “Reviews are mixed. We have had a lot of feedback thanking us for providing a search that covers articles and books. 
They like the ability to do one search and get a mix of resources without the search taking a long time. Other feedback usually centers around a bug or a feature not working as it should, or as they understand it should. In general, however, the feedback has been positive.” Another replied, “Comments we receive are generally positive, but we’ve not collected them systematically.” Some respondents indicated they had done some initial usability testing on the initial interface, but not the most recent one now in use. Others indicated they had not yet conducted usability testing, but it was planned for later in 2010 or 2011. In terms of their fellow library staff and their initial satisfaction, one respondent indicated, “Somewhere between satisfied and very satisfied . . . it has been increasing with each interface upgrade . . . our instruction librarians are not planning to use the discovery service this fall [in instruction efforts] because they need more experience with it . . . they have been overall intrigued and impressed by it . . . I would say our organization is grappling more with the implications of a discovery tools as a phenomenon than with our particular discovery service in particular. There seems to be general agreement that it is a good search tool for the unmediated searcher.” Another indicated some concerns with the initial interface provided: “If librarians couldn’t figure it out, users can’t figure it out.” Another responded, it was “a big struggle with librarians getting on board with the system and promoting the service to students. They continually compare it against the catalog. At one point, they weren’t even teaching the discovery service in bib instruction. The only way to improve things it with librarian feedback; it’s getting better, it has been hard. Librarians have a hard time replacing the catalog and changing things that they are used to.” In terms of local customization, responses varied; some libraries had done basically no customization to the out-of-the-box interface, others had done extensive customization. One indicated they had tweaked sort options and added widgets to the interface. Another indicated they had done extensive changes to the CSS. One indicated they had customized the colors, added a logo, tweaked the headers and footers, and created “canned” or preconfigured search boxes searching a subset of the index. Another indicated they couldn’t customize the header and footer to the degree they would have liked, but were able to customize these elements to a degree. One respondent indicated they’d done a lot of customization to an earlier version of the interface, which had been rather painstaking, and that much of this broke when they upgraded to the latest version. That said, they also indicated the latest version was much better than the previous version. One respondent indicated it would be nice if the service could have multiple sources for INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 43 enriched record content so that better coverage could be achieved. One respondent indicated they were working on a complete custom interface from scratch, which would be partially populated with results from the discovery service index (as well as other data sources). A few questions asked about relevancy as a search concept and how well the respondents felt about the quality of returned results for queries. 
A few questions asked about relevancy as a search concept and how satisfied respondents were with the quality of results returned for their queries. One respondent indicated, “we have been able to tweak the ranking and are satisfied at this point.” Another indicated, “overall, the relevance is good – and it has improved a lot.” Another noted, “known item title searching has been a problem . . . the issues here are very predictable – one word titles are more likely to be a problem, as well as titles with stopwords,” and noted the vendor was aware of the issue and was improving this. One noted, “we would like to be able to experiment with the discovery service more,” and pointed to the lack of “relevancy algorithm control.” Another indicated they planned to investigate relevance more closely once usability studies commenced, and noted they had worked with the vendor to make some code changes to the default search mechanism. One noted that they’d like to be able to specify some additional fields that would be part of the algorithm associated with relevancy. Another optimistically noted, “as an early adopter, it has been amazing to see how relevance has improved. It is not perfect, but it is constantly evolving and improving.”
A final question asked simply, “Overall, do you feel your selection of this vendor’s product was a good one? Do you sense that your users – students and faculty – have positively received the product?” Across the majority of responses, there was general agreement from the early adopters that they felt they’d made the right choice. One noted that it was still early and the evaluation was still a work in progress, but felt the service had been positively received. The majority were more certain: “yes, I strongly feel that this was the right decision . . . as more users find it, I believe we will receive additional positive feedback,” “yes, we strongly believe in this product and feel it has been adopted and widely accepted by our users,” “I do feel it was a good selection.”
The External Perspective: Dialog with Web-scale Discovery Vendors
The preceding sections focused on an academic library’s perspective on web-scale discovery services—the thoughts, opinions, preferences, and vetting activities involving library staff. The following sections focus on the extensive dialog and interaction with the vendors themselves, apart from the internal library perspective, and highlight the thorough, meticulous research activities conducted on five vendor services. The Discovery Task Force sought to learn as much about each service as possible, a challenging proposition given that at the start of the investigation only two of the five services had been released and, unsurprisingly, very little research existed. As such, it was critical to work with vendors to understand their services and how each compared to others in the marketplace. Broadly summarized, these efforts included identification of services, drafting of multiple comprehensive question lists distributed to the vendors, onsite vendor visits, and continual tracking of service enhancements.
Activity: Vendor Identification
Over the course of a year’s work, the Discovery Task Force executed several steps to systematically understand the vendor marketplace—the capabilities, content considerations, development cycles, and future roadmaps associated with five vendor offerings. Given that the task force began its work when only two of these services were in public release, there was no manual, recipe, or substantial published research to rely on.
The beginning, for the UNLV Libraries, lay in identification of the services—one must first know the services to be evaluated before evaluation can commence. As mentioned previously, the Discovery Mini-Summit held at the UNLV Libraries highlighted one product—Serials Solutions Summon; the only released product at the time of the Mini-Summit was WorldCat Local. While no published peer-reviewed research highlighting these new web-scale discovery services existed, press and news releases did exist for the three to-be-released services. Such releases shed light on the landscape of services that the task force would review—a total of five services, from the first-to-market, WorldCat Local, to the most recent entrant, Primo Central. OCLC WorldCat Local, released in November 2007, can be considered the first web-scale discovery service as defined in this research; the experience of an early pilot partner (the University of Washington) is profiled in a 2008 issue of Library Technology Reports.14 In the UW pilot, approximately 30 million article-level items were included with the WorldCat database. Another product, Serials Solutions Summon, was released in July 2009, and together these two services were the only ones publicly released when the Discovery Task Force began its work. The task force identified three additional vendors, each working on its own version of a web-scale discovery service; each of these services would enter initial general release as the task force continued its research: EBSCO EDS in January 2010, Innovative Interfaces Encore Synergy around May 2010, and Ex Libris Primo Central in June 2010. While each of these three was new in terms of web-scale discovery capabilities, each was built, at least in part, on earlier systems from the vendors. EDS draws heavily from the EBSCOhost interface (the original version of which dates back to the 1990s), while the base Encore and base Primo systems were next-generation catalog systems that debuted in 2007.
Activity: Vendor Investigations
After identifying existing and under-development discovery services, the next step in UNLV’s detailed vendor investigations was the creation of a uniform, comprehensive question list sent to each of the five vendors. The Discovery Task Force ultimately developed a list of 71 questions divided into nine functional areas, as follows, each with an example question:
Section 1: Background. “When did product development begin (month, year)?”
Section 2: Locally Hosted Systems and Associated Metadata. “With what metadata schemas does your discovery platform work? (e.g., MARC, Dublin Core, EAD, etc.)”
Section 3: Publisher/Aggregator Coverage (Full Text and Citation Content). “With approximately how many publishers/aggregators have you forged content agreements?”
Section 4: Records Maintenance and Rights Management. “How is your system initialized with the correct set of rights management information when a new library customer subscribes to your product?”
Section 5: Seamlessness & Interoperability with Existing Content Repositories. “For ILS records related to physical holdings, is status information provided directly within the discovery service results list?”
Section 6: Usability Philosophy. “Describe how your product incorporates published, established best practices in terms of a customer focused, usable interface.”
Section 7: Local “Look & Feel” Customization Options. “Which of the following can the library control: Color Scheme; Logo / Branding; Facet Categories and placement; etc.”
Section 8: User Experience (Presentation, Search Functionality, and What the User Can Do With the Results). “At what point does a user leave the context and confines of the discovery interface and enter the interface of a different system, whether remote or local?”
Section 9: Administration Module & Statistics. “Describe in detail the statistics reporting capabilities offered by your system. Does your system provide the following sets of statistics . . .”
All vendors were given 2–3 weeks to respond, and all vendors responded. It was evident from the uneven level of responses that the vendors were at different developmental stages with their products. Some vendors were still 6–9 months away from initial public release; some were not even firm on when their service would enter release. It was also observed that some vendors were less explicit in the level of detail provided, sometimes reflecting their development stage and in other cases perhaps in spite of it. A refined subset of the original 71 questions appears as a list of 40 questions in appendix F.
Apart from the detailed question list, various sets of free and licensed information on these discovery services are available online, and the task force sought to identify and digest this information. The Charleston Advisor has conducted interviews with several of the library web-scale discovery vendors on their products, including EBSCO,15 Serials Solutions,16 and Ex Libris.17 These interviews, each around a dozen questions, ask the vendors to describe their product and how it differs from other products in the marketplace, and include important questions on metadata and content. An article by Ronda Rowe reviews Summon, EDS, and WorldCat Local, and provides some analysis of each product on the basis of content, user interface and searchability, pricing, and contract options.18 It also provides a comparison of 24 product features across these three services, such as “search box can be embedded in any webpage,” “local branding possible,” and “supports social networking.” A wide variety of archived webcasts, many provided by Library Journal, are available through free registration, and new webcasts are being offered at the time of writing; these presentations to some degree touch on discussions with the discovery vendors, and are often moderated by or include company representatives as part of the discussion group.19 Several libraries have authored reports and presentations that, at least partially, discuss information on particular services gained through their evaluations, which included dialog with the vendors.20 The vendors themselves each have a section on their corporate website devoted to their service. Information provided on these websites ranges from extremely brief to, in the case of WorldCat Local, very detailed and informative. In addition, much can be gained by “test-driving” live implementations. As such, a listing of vendor website addresses providing more information, as well as a list of sample live implementations, is provided in appendix G.
Activities: Vendor Visits and Content Overlap Analysis
Each of the five vendors visited the UNLV Libraries in spring 2010.
Vendor visits all occurred within a nine-day span; visits were intentionally scheduled close together to keep the products fresh in the minds of library staff and to aid comparison. Each visit lasted approximately half a day and often included the field or regional sales representative as well as a product manager or technical expert. Vendor visits included a demonstration and Q&A for all library staff as well as invited colleagues from other southern Nevada libraries, a meeting with the Discovery Task Force, and a meeting with technical staff at UNLV responsible for website design and application development and customization. Vendors were each given a uniform set of fourteen questions on topics to address during their visit; these appear in appendix H. Questions were divided into the broad topical areas of content coverage, end user interface and functionality, and staff “control” over the end user interface. On average, approximately 30–40 percent of the library staff attended the open vendor demo and Q&A session.
Shortly after the vendor visits, a content-overlap analysis comparing UNLV serials holdings with indexed content in the discovery service was sought from each vendor. Given that the amount of content indexed by each discovery service was growing (and continues to grow) extremely rapidly as new publisher and aggregator content agreements are signed, this content-overlap analysis was intentionally not sought at an earlier date. Some vendors were able to provide detailed coverage information against our existing journal titles (UNLV currently subscribes to approximately 20,000 e-journals and provides access to another 7,000+ open-access titles). For others, this was more difficult. Recognizing this, the head of Collection Development was asked to provide a list of the “top 100” journal titles for UNLV based on such factors as usage statistics and whether the title was a core title for part of the UNLV curriculum. The remaining vendors were able to provide content coverage information against this critical title list. Four of the five products had quite comprehensive coverage (more than 80 percent) of the UNLV Libraries’ titles. While outside the scope of this article, “coverage” can mean different things for different services. Driven by the publisher agreements they are able to secure, some discovery services may have extensive coverage for particular titles (such as the full text, abstracts, author-supplied keywords, subject headings, etc.), whereas other services, while covering the same title, may have “thinner” metadata, such as basic citation information (article title, publication title, author, publication date, etc.). More discussion on this topic is present in the January 2011 Library Technology Reports on library web-scale discovery services.21
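For libraries attempting a similar comparison on their own, the mechanics reduce to matching a local title list against a vendor-supplied coverage list, typically on ISSN. The sketch below is a minimal illustration only, assuming both lists are available as CSV exports with a hypothetical "issn" column; actual knowledge-base exports and vendor coverage reports will use their own formats.

```python
import csv

def load_issns(path, issn_column="issn"):
    """Read a CSV export and return the set of normalized ISSNs it contains."""
    issns = set()
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            issn = row.get(issn_column, "").replace("-", "").strip().upper()
            if issn:
                issns.add(issn)
    return issns

# Hypothetical file names; substitute a real knowledge-base export and a
# vendor-supplied coverage report.
local_titles = load_issns("library_ejournal_holdings.csv")
vendor_index = load_issns("vendor_coverage_report.csv")

covered = local_titles & vendor_index
pct = 100 * len(covered) / len(local_titles) if local_titles else 0.0
print(f"{len(covered)} of {len(local_titles)} local titles covered ({pct:.1f}%)")

# Titles absent from the vendor list usually warrant manual follow-up, since
# they may simply be listed under a different (print vs. online) ISSN.
missing = sorted(local_titles - vendor_index)
```

As noted above, a raw title match says nothing about the depth of metadata behind it, so percentages of this kind are a starting point for discussion rather than a verdict.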
Activity: Product Development Tracking
One aspect of web-scale discovery services, and of the next-generation discovery layers that preceded them, is a rapid enhancement cycle, especially when juxtaposed against the turnkey-style ILS systems that dominated library automation for many years. As an example, Serials Solutions provides minor enhancements to Summon approximately every three to four weeks, EBSCO to EBSCO Discovery Service approximately every three months, and Ex Libris to Primo/Primo Central approximately every three months. Many vendors unveil updates coinciding with annual library conferences, and 2010 was no exception. In late summer/early fall 2010, the Discovery Task Force had conference calls or onsite visits with several of the vendors to focus on new enhancements and changes to the services, and to obtain answers to questions that had arisen since the previous visit several months earlier. Since the vendor visits in spring 2010, each service had changed, and two services had unveiled significantly different and improved interfaces. The Discovery Task Force’s understanding of web-scale discovery services had also expanded greatly since the start of its work. Coordinated with this second series of vendor visits and discussions, an additional list of more than two dozen questions, reflecting this refined understanding, was sent to the majority of the vendors. A portion of these questions is provided as part of the refined list presented in appendix F. This second set of questions dealt with complex discussions of metadata quality, such as what level of content publishers and aggregators were providing for indexing purposes (e.g., full text, abstracts, tables of contents, author-supplied keywords or subject headings, or particular citation and record fields), and also the vendor’s stance on content neutrality (i.e., whether they were entering into exclusive agreements with publishers and aggregators and, if the discovery service vendor is owned by a company involved with content, whether that content is promoted or weighted more heavily in result sets). Other questions dealt with such topics as current install-base counts and technical clarifications about how each service worked. In particular, the questions related to content were tricky for many (not all) of the vendors to address. Still, the Discovery Task Force was able to get a better understanding of how things worked in the evolving discovery environment. Combined with the internal library perspective and the early adopter references, information gathered from the vendors provided the necessary data set to submit a recommendation with confidence.
Activity: Recommendation
By mid-fall 2010, the Discovery Task Force had conducted, and had at its disposal, a tremendous amount of research. Recognizing how quickly these services change and the fact that a cyclical evaluation could occur, the task force members felt they had met their charge. If all things failed during the next phase—implementation—at least no one would be able to question the thoroughness of the task force’s efforts. Unlike the hasty decision that in part led to a less-than-stellar experience with federated search a few years earlier, the evaluation process to recommend a new web-scale discovery service was deliberate, thorough, transparent, and vetted with library stakeholders. Given that the Discovery Task Force was entering its final phase, official price quotes were sought from each vendor. Each task force member was asked to develop a pro/con list for all five identified products based on the knowledge gained. These lists were anonymized and consolidated into a single, extensive pro/con list for each service. Some of the pros and cons were subjective (such as interface aesthetics), and some were objective (such as a particular discovery service not offering a desired feature).
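Consolidating the anonymized lists is a simple bookkeeping exercise that could be done by hand or with a few lines of code. The sketch below shows one possible approach; the product names and comments are invented placeholders for illustration, not actual task force findings.

```python
from collections import defaultdict

# Each entry: (product, "pro" or "con", comment). Member names are omitted
# so the consolidated list stays anonymous. These entries are invented.
raw_entries = [
    ("Product A", "pro", "Clean default interface"),
    ("Product A", "con", "No control over relevancy weighting"),
    ("Product B", "pro", "Strong coverage of critical titles"),
    ("Product B", "con", "Dated out-of-the-box design"),
    ("Product B", "pro", "Strong coverage of critical titles"),  # duplicate note
]

def consolidate(entries):
    """Merge individual notes into one deduplicated pro/con list per product."""
    merged = defaultdict(lambda: {"pro": set(), "con": set()})
    for product, kind, comment in entries:
        merged[product][kind].add(comment)
    return merged

for product, notes in sorted(consolidate(raw_entries).items()):
    print(product)
    for kind in ("pro", "con"):
        for comment in sorted(notes[kind]):
            print(f"  {kind}: {comment}")
```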
At one of the final meetings of the task force, members reaffirmed the three top contenders, indicated that the other two were no longer under consideration, and were then asked to rank their first, second, and third choices among the remaining services. While complete consensus wasn’t achieved, there was a resounding first, second, and third choice. The task force presented a summary of findings at a meeting open to all library staff. This meeting summarized the research and evaluation steps the task force had conducted over the past year, framed each of the three shortlisted services by discussing strengths and weaknesses observed by the task force, and sought to answer questions from the library at large. Prior to drafting the final report and making the recommendation to the dean of Libraries, several task force members led a discussion and a final question-and-answer session at a meeting of the Libraries’ cabinet, one of the high-level administrative groups at the UNLV Libraries. Vetting by this body was the last step in the Discovery Task Force’s investigation, evaluation, and recommendation for purchase of a library web-scale discovery service. The recommendation was broadly accepted by the Libraries’ cabinet, and shortly afterward the Discovery Task Force was officially disbanded, having met its goal of investigating, evaluating, and making a recommendation for purchase of a library web-scale discovery service.
Next Steps
The dialog above describes the research, evaluation, and recommendation model used by the UNLV Libraries to select a web-scale discovery service. Such a model and the associated appendixes could serve, perhaps with some adaptations, as a framework for other libraries considering the evaluation and purchase of a web-scale discovery service. Together, the Discovery Task Force’s internal and external research and evaluation provided a substantive base of knowledge on which to make a recommendation. After its recommendation, the project progressed from a research and recommendation phase to an implementation phase. The Libraries’ cabinet brainstormed a list of more than a dozen concise implementation bullet points—steps that would need to be addressed—including the harvesting and metadata mapping of local library resources, local branding and some level of customization work, and integration of the web-scale discovery search box in the appropriate locations on the Libraries’ website. Project implementation co-managers were assigned (the director of Technical Services and the Web Technical Support manager), as well as key library personnel who would aid in one or more implementation steps. In January 2011, the implementation commenced, with a public launch of the new service planned for mid-2011. The success of a web-scale discovery service at the UNLV Libraries is a story yet to be written, but one full of promise.
Acknowledgements
The author wishes to thank the other members of the UNLV Libraries’ Discovery Task Force for their work in the research and evaluation of library web-scale discovery services: Darcy Del Bosque, Alex Dolski, Tamera Hanken, Cory Lampert, Peter Michel, Vicki Nozero, Kathy Rankin, Michael Yunkin, and Anne Zald.
REFERENCES
1. Marcia J. Bates, Improving User Access to Library Catalog and Portal Information, final report, version 3 (Washington, DC: Library of Congress, 2003), 4, http://www.loc.gov/catdir/bibcontrol/2.3BatesReport6-03.doc.pdf (accessed September 10, 2010).
2. Roger C. Schonfeld and Ross Housewright, Faculty Survey 2009: Key Strategic Insights for Libraries, Publishers, and Societies (New York: Ithaka S+R, 2010), 4, http://www.ithaka.org/ithaka-s-r/research/faculty-surveys-2000-2009/Faculty%20Study%202009.pdf (accessed September 10, 2010).
3. OCLC, Online Catalogs: What Users and Librarians Want (Dublin, OH: OCLC, 2009), 20, http://www.oclc.org/reports/onlinecatalogs/fullreport.pdf (accessed September 10, 2010).
4. Ibid., vi.
5. Ibid., 14.
6. Karen Calhoun, The Changing Nature of the Catalog and Its Integration with Other Discovery Tools: Final Report (Washington, DC: Library of Congress, 2006), 35, http://www.loc.gov/catdir/calhoun-report-final.pdf (accessed September 10, 2010).
7. Bibliographic Services Task Force, Rethinking How We Provide Bibliographic Services for the University of California: Final Report ([Pub location?] University of California Libraries, 2005), 2, http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf (accessed September 10, 2010).
8. OCLC, College Students’ Perceptions of Libraries and Information Resources (Dublin, OH: OCLC, 2006), part 1, page 4, http://www.oclc.org/reports/pdfs/studentperceptions.pdf (accessed September 10, 2010).
9. Doug Way, “The Impact of Web-Scale Discovery on the Use of a Library Collection,” Serials Review, in press.
10. Bill Kelm, “WorldCat Local Effects at Willamette University,” presentation, Prezi, July 21, 2010, http://prezi.com/u84pzunpb0fa/worldcat-local-effects-at-wu/ (accessed September 10, 2010).
11. Michael Boock, Faye Chadwell, and Terry Reese, “WorldCat Local Task Force Report to LAMP,” March 27, 2009, http://hdl.handle.net/1957/11167 (accessed February 12, 2012).
12. Michael Boock et al., “Discovery Services Task Force Recommendation to University Librarian,” http://hdl.handle.net/1957/13817 (accessed February 12, 2012).
13. Ken Varnum et al., “University of Michigan Library Article Discovery Working Group Final Report,” Umich, January 29, 2010, http://www.lib.umich.edu/files/adwg/final-report.pdf. [Access date?]
14. Jennifer Ward, Pam Mofjeld, and Steve Shadle, “WorldCat Local at the University of Washington Libraries,” Library Technology Reports 44, no. 6 (August/September 2008).
15. Dennis Brunning and George Machovec, “An Interview with Sam Brooks and Michael Gorrell on the EBSCOhost Integrated Search and EBSCO Discovery Service,” Charleston Advisor 11, no. 3 (January 2010): 62–65.
16. Dennis Brunning and George Machovec, “Interview About Summon with Jane Burke, Vice President of Serials Solutions,” Charleston Advisor 11, no. 4 (April 2010): 60–62.
17. Dennis Brunning and George Machovec, “An Interview with Nancy Dushkin, VP Discovery and Delivery Solutions at Ex Libris, Regarding Primo Central,” Charleston Advisor 12, no. 2 (October 2010): 58–59.
18. Ronda Rowe, “Web-Scale Discovery: A Review of Summon, EBSCO Discovery Service, and WorldCat Local,” Charleston Advisor 12, no. 1 (October 2010): 5–10.
19. Library Journal archived webcasts are available at http://www.libraryjournal.com/csp/cms/sites/LJ/Tools/Webcast/index.csp (accessed September 10, 2010).
20. Boock, Chadwell, and Reese, “WorldCat Local Task Force Report to LAMP”; Boock et al., “Discovery Services Task Force Recommendation to University Librarian”; Ken Varnum et al., “University of Michigan Library Article Discovery Working Group Final Report.”
21. Jason Vaughan, “Library Web-scale Discovery Services,” Library Technology Reports 47, no. 1 (January 2011).
Note: Appendices A–H available as supplemental files.
Investigations into Library Web-Scale Discovery Services: Appendices A–H
Jason Vaughan
Appendices
Appendix A. Discovery Task Force Timeline
Appendix B. Discovery Task Force Charge
Appendix C. Discovery Task Force: Staff Survey 1 Questions
Appendix D. Discovery Task Force: Staff Survey 2 Questions
Appendix E. Discovery Task Force: Early Adopter Questions
Appendix F. Discovery Task Force: Initial Vendor Investigation Questions
Appendix G. Vendor Websites and Example Implementations
Appendix H. Vendor Visit Questions
Appendix A. Discovery Task Force Timeline
[Timeline graphic not reproduced in text.]
Appendix B. Discovery Task Force Charge
Informed through various efforts and research at the local and broader levels, and as expressed in the Libraries 2010/12 strategic plan, the UNLV Libraries have the desire to enable and maximize the discovery of library resources for our patrons. Specifically, the UNLV Libraries seek a unified solution which ideally could meet these guiding principles:
• Creates a unified search interface for users pulling together information from the library catalog as well as other resources (e.g. journal articles, images, archival materials).
• Enhances discoverability of as broad a spectrum of library resources as possible
• Intuitive: minimizes the skills, time, and effort needed by our users to discover resources
• Supports a high level of local customization (such as accommodation of branding and usability considerations)
• Supports a high level of interoperability (easily connecting and exchanging data with other systems that are part of our information infrastructure)
• Demonstrates commitment to sustainability and future enhancements
• Informed by preferred starting points
As such, the Discovery Task Force advises Libraries Administration on a solution that appears to best meet the goal of enabling and maximizing the discovery of library resources. The bulk of the work will entail a marketplace survey and evaluation of vendor offerings.
Charge
Specific deliverables for this work include: 1.
Identify vendor next generation discovery platforms, whether established and currently on the market, or those publicized and at an advanced stage of development, with an expectation of availability within a year’s time. Identify & create a representative list of other academic libraries which have implemented or purchased currently available products. 2. Create a checklist / criteria of functional requirements / desires for a next generation discovery platform. 3. Create lists of questions to distribute to potential vendors and existing customers of next generation discovery platforms. Questions will focus on broad categories such as the following: a. Seek to understand how content hosted in our current online systems (III catalog, CONTENTdm, locally created databases, vendor databases, etc.) could/would (or not be able INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 54 to) be incorporated or searchable within the discovery platform. Apart from our existing online systems as we know them today, the task force will explore, in general terms, how new information resources could be incorporated into the discovery platform. More explicitly, the task force will seek an understanding of what types of existing records are discoverable within the vendor’s next generation discovery platform, and seek an understanding of what basic metadata must exist for an item to be discoverable. b. Seek to understand whether the solution relies on federated search, the creation of a central site index via metadata harvesting, or both, to enable discovery of items. c. Additional questions, such as pricing, maintenance, install base, etc. 4. Evaluate gathered information and seek feedback from library staff. 5. Provide to the Dean’s Directs a final report which summarizes the task force findings. This report will include a recommended product(s) and a broad, as opposed to detailed, summary of workload implications related to implementation and ongoing maintenance. The final report should be provided to the Dean’s Directs by February 15, 2010. Boundaries The work of the task force does not include: • Detailing the contents of “hidden collections” within the Libraries and seeking to make a concrete determination that such hidden collections, in their current form, would be discoverable via the new system. • Conducting an inventory, recommending, or prioritizing collections or items which should be cataloged or otherwise enriched with metadata to make them discoverable. • Coordination with other southern Nevada NSHE entities. • An ILS marketplace survey. The underlying Innovative Millennium System is not being reviewed for potential replacement. • Implementation of a selected product. [the charge concluded with a list of members for the Task Force] INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 55 Appendix C. Discovery Task Force: Staff Survey 1 Questions “RANK” means the SurveyMonkey question will be set up such that each option can only be chosen once, and will be placed on a scale that corresponds to the number of choices overall. “RATE” means there will be a 5 point Likert scale ranging from strongly disagree to strongly agree. Section 1: Customization. The “Staff Side” of the House 1. Customization. It is important for the Library to be able to control/tweak/influence the following design element [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree]  General color scheme  Ability to include a UNLV logo somewhere on the page. 
 Ability to add other branding elements to the page.  Ability to add one or more library specified links prominently in the interface (example: a link to the Libraries’ home page)  Able to customize the name of the product (meaning, the vendor’s name for the product doesn’t need to be used nor appear within the interface)  Ability to embed the search box associated with the discovery platform elsewhere into the library website, such as the homepage (i.e. the user could start a search w/o having to directly go to the discovery platform 2. Customization. Are there any other design customization capabilities that are significantly important? Please list, and please indicate if this is a high, low, or medium priority in terms of importance to you. (freetext box ) 3. Search Algorithms. It is important for the Library to be able to change or tweak the platform’s native search algorithm to be able to promote desired items such that they appear higher in the returned list of [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree] [e.g. The Library, at its option, could tweak one or more search algorithms to more heavily weight resources it wants to promote. For example, if a user searches for “Hoover Dam” the library could set a rule that would heavily weight and promote UNLV digital collection images for Hoover Dam – those results would appear on the first page of results]. 4. Statistics. The following statistic is important to have for the discovery platform [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree]  Number of searches, by customizable timeframe Number of item or article level records accessed (that is, a user clicks on something in the returned list of results)  Number of searches generating 0 results INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 56  Number of items accessed by type  Number of items accessed by provider of content (that is, number of articles from particular database/fulltext vendor 5. Statistics. What other statistics would you like to see a discovery platform provide and how important is this to you? (freetext box) 6. Staff Summary. Please RANK on a 1-3 scale how important the following elements are, with a “1” being most important, a “2” being 2nd most important, and a 3 being 3rd most important.  Heavy customization capabilities as described in questions 1 & 2 above  Ability to tweak search algorithms as described in question 3  Ability for the system to natively provide detailed search stats such as described in question 4, 5. Section 2. The “End User” Side of the House 7. Searching. Which of the following search options is preferable when a user begins their search [choose one]  The system has a “Google-like” simple search box  The system has a “Google-like” simple search box, but also has an advanced search capability (user can refine the search to certain categories: author, journal, etc.)  No opinion 8. Zero Hit Searches. For a search that retrieves no actual results: [choose one]  The system should suggest something else or ask, “Did you mean?”  Retrieving precise results is more important and the system should not suggest something else or ask “Did you mean?”  no opinion 9. De-duplication of similar items. 
Which of the following is preferable [choose one]  The system automatically de-dupes records (the item only appears once in the returned list)  The system does not de-dupe records (the same item could appear more than once in the returned list, such as when we have overlapping coverage of a particular journal from multiple subscription vendors)  No opinion INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 57 10. Sorting of Returned Results. It is important for the user to be able to sort or reorder a list of returned results by . . [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree]  Publication Date  Alphabetical by Author Name  Alphabetical by Title  Full Text Items First  By Media Type (examples: journal, book, image, etc) 11. Web 2.0 Functionality on Returned Results. The following items are important for a discovery platform to have . . [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree] (note, if necessary, please conduct a search in the Libraries’ Encore system to help illustrate / remember some of the features/jargon mentioned below. In Encore, “Facets” appear on the left hand side of the screen; the results with book covers, “add to cart,” and “export” features appear in the middle; and a tag cloud to the right. Note: this question is asking about having the particular feature regardless of which vendor, and not how well or how poorly you think the feature works for the Encore system)  A tag cloud  Faceted searching  Ability to add user-generated tags to materials (“folksonomies”)  Ability for users to write and post a review of an item • Other (please specify) 12. Enriched Record Information on Returned Results. The following items are important to have in the discovery system . . . [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree]  Book covers for items held by the Libraries  A Google Books preview button for print items held by the Libraries  Displays item status information for print items held by the Libraries (example: available, checked out) 13. What the User Can do With the Results. The following functionality is important to have in the discovery system . . [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree]  Retrieve the fulltext of an item with only a single click on the item from the initial list of returned results  Ability to add items to a cart for easy export (print, email, save, export to Refworks) INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 58  Ability to place an InterLibrary Loan / LINK+ Request for an item  System has a login/user account feature which can store user search information for later. In other words, a user could potentially log in to retrieve saved searches, previously stored items, or create alerts when new materials become available. 14. Miscellaneous. The following feature/attribute is important to have in the discovery system . . . [Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree]  The vendor has an existing mobile version of their discovery tool for use by smartphones or other small internet-enabled devices.  The vendor has designed the product such that it can be incorporated into other sites used by students, such as WebCampus and/or social networking sites. 
Such “designs” may include the use of persistent URLs to embed hyperlinks, the ability to place the search box in another website, or specifically designed widgets developed by the vendor  Indexing and availability of newly published items occurs within a matter of days as opposed to a week or perhaps a month.  Library catalog authority record information is used to help return proper results and/or populate a tag cloud. 15. End User Summary. Please RANK on a 1-8 scale how important the following elements are; a “1” means you think it is the most important, a “2” second most important, etc.  System offers a “Google-like” simple search box only, as detailed in question 7 above  System offers a “did you mean?” or alternate suggestions for all searches retrieving 0 results as detailed in question 8 above (obviously, if you value precision of results over “did you mean” functionality, you would rank this toward the lower end of the spectrum).  System de-dupes similar items as detailed in question 9 above(if you believe the system should not de- dupe similar items, you would rate this toward the lower end of the spectrum)  System provides multiple sort options of returned results as detailed in question 10 above  System offers a variety of Web 2.0 features as detailed in question 11 above  System offer enriched record information as detailed in question 12 above  System offers flexible options for what a user can do with the results, as detailed in question 13 above  System has one or more miscellaneous features as detailed in question 14 above. Section 3: Content 16. Incorporation of Different Information Types. In an ideal world, a discovery platform would incorporate ALL of our electronic resources, whether locally produced or licensed/purchased from vendors. Below is a listing of different information types. Please RANK on a scale of 1-10 how vital it is INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 59 that a discovery platform accommodate these information types (“1” is the most important item in your mind, a “2” is second most important, etc). a. Innopac Millennium records for UNLV print & electronic holdings b. LINK+ records for print holdings held within the LINK+ consortium c. Innopac authority control records d. Records within OCLC WorldCat e. CONTENTdm records for digital collection materials f. bePRESS Digital Commons Institutional Repository materials g. Locally created Web accessible database records (e.g. the Special Collections & Architecture databases) h. Electronic Reserves materials hosted in ERES i. A majority of the citation records from non fulltext, vendor licensed online index/abstract/citation databases (e.g. The “Agricola” database) j. A majority of the fulltext articles or other research contained in many of our vendor licensed online resources (e.g. “Academic Search Premier” which contains a lot of full text content, and the other fulltext resource packages / journal titles we subscribe to) 17. LOCAL Content. Related to item (g) in the question immediately above, please list any locally produced collections that are currently available either on the website, or in electronic format as a word document, excel spreadsheet or access database (and not currently available on the website) that you would like the discovery platform to incorporate. (freetext box) 18. Particular Sets of Licensed Resources, What’s Important? 
Please rank which of the licensed (full text or primarily full text) existing publishers below are most important for a discovery platform to accommodate. Elsevier Sage Wiley Springer American Chemical Society Taylor & Francis (Informaworld) IEEE American Institute of Physics Oxford Ovid Nature Emerald INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 60 Section 4: Survey Summary 19. Overarching Survey Question. The questions above were roughly categorized into three areas. Given that no discovery platform will be everything to everybody, please RANK on a 1-3 scale what the most important aspects of a discovery system are to you (1 is most critical, 2 is second in importance overall, etc.)  The platform is highly customizable by staff (types of things in area 1 of the survey)  The platform is highly flexible from the end-user standpoint (type of things in area 2 of the survey)  The platform encompasses a large variety of our licensed and local resources (type of things in area 3 of the survey) 20. Additional Input. The survey above is roughly drawn from a larger list of 71 questions sent to the Discovery Task Force vendors. What other things do you think are REALLY important when thinking about a next-generation discovery platform? (freetext input, you may write a sentence or a book) 21. Demographic. What Library division do you belong to? Library Administration Library Technologies Research & Education Special Collections Technical Services User Services INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 61 Appendix D. Discovery Task Force: Staff Survey 2 Question For the comparison questions, products are listed by order of vendor presentation. Please mark an answer for each product. PART I. Licensed Publisher CONTENT (e.g. fulltext journal articles; citations / abstracts) SA = Strongly Agree; A = Agree; N= Neither Agree nor Disagree; D = Disagree; SD = Strongly Disagree 1. “The Discovery Platform appears to ADEQUATELY cover a MAJORITY of the CRITICAL publisher titles.” SA A N D SD I don’t know enough about the content coverage for this product to comment Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 2. “The Discovery Platform appears to ADEQUATELY cover a MAJORITY of the SECOND-TIER or SOMEWHAT LESS CRITICAL publisher titles.” SA A N D SD I don’t know enough about the content coverage for this product to comment Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 3. Overall, from the CONTENT COVERAGE point of view, please rank each platform from best to worst. Worst 2nd Worst Middle 2nd Best Best Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 4. Regardless of a best to worst ranking, please indicate if the products were, overall, ACCEPTABLE or UNACCEPTABLE to you from the CONTENT COVERAGE standpoint. Unacceptable Acceptable Ex Libris Primo Central INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 62 OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon PART II. END-USER FUNCTIONALITY & EASE OF USE 5. From the USER perspective, how functional do you think the discovery platform is? Are the facets and/or other methods that one can use to limit or refine a search appropriate? Were you satisfied with the export options offered by the system (email, export into Refworks, print, etc.)? 
If you think Web 2.0 technologies are important (tag cloud, etc.), were one or more of these present (and well executed) in this product? The platform appears to be SEVERELY limited in major aspects of end user functionality The platform appears to have some level of useful functionality, but perhaps not as much or as well executed as some competing products. Yes, the platform seems quite rich in terms of end user functionality, and such functions are well executed. I can’t comment on this particular product because I didn’t see the vendor demo, haven’t visited any of the live implementations linked on the discovery wiki page, or otherwise don’t have enough information. Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 6. From the USER perspective, for a full-text pdf journal article, how EASY is it to retrieve the full-text? Does it take many clicks? Are there confusing choices? It’s very cumbersome trying to retrieve the full text of an item, there are many clicks, and/or it’s simply confusing when going through the steps to retrieve the full text. It’s somewhat straightforward to retrieve a full text item, but perhaps it’s not as easy or as well executed as some of the competing products It’s quite easy to retrieve a full text item using this platform, as good as or better than the competition, and I don’t feel it would be a barrier to a majority of our users. I can’t comment on this particular product because I didn’t see the vendor demo, haven’t visited any of the live implementations linked on the discovery wiki page, or otherwise don’t have enough information. Ex Libris Primo Central INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 63 OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 7. How satisfied were you with the platform’s handling of “dead end” or “zero hit” searches? Did the platform offer “did you mean” spelling suggestions? Did the platform offer you the option to request the item via doc delivery / LINK+? Is the vendor’s implementation of such features well executed, or were they difficult, confusing, or otherwise lacking? The platform appears to be severely limited in or otherwise poorly executes how it responds to a dead end or zero hit search. The platform handled dead end or zero hit results, but perhaps not as seamlessly or as well executed as some of the competing products. I was happy with how the platform handled “dead end” searches, and such functionality appears to be well executed, as good as or better than the competition. I can’t comment on this particular product because I didn’t see the vendor demo, haven’t visited any of the live implementations linked on the discovery wiki page, otherwise don’t have enough information. Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 8. How satisfied were you with the platform’s integration with the OPAC? Were important things such as call numbers, item status information, and enriched content immediately available and easily viewable from within the discovery platform interface, or did it require an extra click or two into the OPAC – and did you find this cumbersome or confusing? 
The platform provides minimal OPAC item information, and a user The platform appeared to integrate ok with the OPAC in I was happy with how the platform integrated with the I can’t comment on this particular product because I didn’t see the INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 64 would have to click through to the OPAC to get the information they might really need; and/or it took multiple clicks or was otherwise cumbersome to get the relevant item level information terms of providing some level of relevant item level information, but perhaps not as much or as well executed as competing products. OPAC. A majority of the OPAC information was available in the discovery platform, and/or their connection to the OPAC was quite elegant. vendor demo, haven’t visited any of the live implementations linked on the discovery wiki page, or otherwise don’t have enough information. Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 9. Overall, from an END USER FUNCTIONALITY / EASE OF USE standpoint – how a user can refine a search, export results, easily retrieve the fulltext, easily see information from the OPAC record – please rank each platform from best to worst. Worst 2nd Worst Middle 2nd Best Best Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 10. Regardless of a best to worst ranking, please indicate if the products were, overall, ACCEPTABLE or UNACCEPTABLE to you from the USER FUNCTIONALITY / EASE OF USE standpoint. Unacceptable Acceptable Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon PART III. STAFF CUSTOMIZATION INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 65 11. The “out of the box” design demo’ed at the presentation (or linked to the discovery wiki page – whichever particular implementation you liked best for that product) was . . Seriously lacking and I feel would need major design changes and customization by library Web technical staff. Middle of the road – some things I liked, some things I didn’t. The interface design was better than some competing products, worse than others. Appeared very professional, clean, well organized, and usable; the appearance was better than most/all of the others products. I can’t comment on this particular product because I didn’t see the vendor demo, haven’t visited any of the live implementations linked on the discovery wiki page, or otherwise don’t have enough information. Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 12. All products offer some level of customization options that allow at least SOME changes to the “out of the box” platform. Based on what the vendors indicated about the level of customization possible with the platform (e.g. look and feel, ability to add library links, ability to embed the search box on a homepage) do you feel there is enough flexibility with this platform for our needs? The platform appears to be severely limited in the degree or types of customization that can occur at the local level. We appear “stuck” with what the vendor gives us – for better or worse. The platform appeared to have some level of customization, but perhaps not as much as some competing products. 
Yes, the platform seems quite rich in terms of customization options under our local control; more so than the majority or all of the other products. I can’t comment on this particular product because I didn’t see the vendor demo, don’t have enough information, and/or would prefer to leave this question to technical staff to weigh in on. Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 66 Synergy Serials Solutions Summon 13. Overall, from a STAFF CUSTOMIZATION standpoint – the ability to change the interface, embed links, define facet categories, define labels, place the searchbox in a different webpage, etc., please rank each platform from best to worst. Worst 2nd Worst Middle 2nd Best Best Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon 14. Regardless of a best to worst ranking, please indicate if the products were, overall, ACCEPTABLE or UNACCEPTABLE to you from the STAFF CUSTOMIZATION standpoint. Unacceptable Acceptable Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon PART IV. SUMMARY QUESTIONS 15. Overall, from a content coverage, user functionality, AND staff customization standpoint, please rank each product from best to worst. Worst 2nd Worst Middle 2nd Best Best Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 67 16. Regardless of a best to worst ranking, please indicate if the products were, overall, ACCEPTABLE or UNACCEPTABLE to you from the overall standpoint of content coverage, user functionality, AND staff customization standpoint. Unacceptable Acceptable Ex Libris Primo Central OCLC WorldCat Local Ebsco Discovery Services Innovative Encore Synergy Serials Solutions Summon PART V. ADDITIONAL THOUGHTS 17. Please share any additional thoughts you have on Ex Libris Primo Central. (freetext box) 18. Please share any additional thoughts you have on OCLC WorldCat Local. (freetext box) 19. Please share any additional thoughts you have on Ebsco Discovery Services. (freetext box) 20. Please share any additional thoughts you have on Innovative Encore Synergy. (freetext box) 21. Please share any additional thoughts you have on Serials Solutions Summon. (freetext box) INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 68 Appendix E. Discovery Task Force: Early Adopter Reference Questions Author’s note: Appendix E originally appeared in the January 2011 Library Technology Reports: Web Scale Discovery Services as chapter 7, “Questions to Consider.” Part 1 BACKGROUND 1. How long have you had your discovery service available to your end users? (what month and year did it become generally available to your primary user population, and linked to your public library website). 2. After you had selected a discovery service, approximately how long was the implementation period – how long did it take to “bring it up” for your end‐users and make it available (even if in ‘beta’ form) on your library website? 3. What have you named your discovery service, and is it the ‘default’ search service on your website at this point? In other words, regardless of other discovery systems (ILS, Digital Collection Management System, IR, etc.), has the new discovery service become the default or primary search box on your website? 
Part 2 CONTENT: Article Level Content Coverage & Scope “Article Level Content” = articles from academic journals, articles from mainstream journals, newspaper content, conference proceedings, open access content 4. In terms of article level content, do you feel the preindexed, preharvested central index of the discovery platform adequately covers a majority of the titles important to your library’s collection and focus? 5. Have you observed any particular strengths in terms of subject content in any of the three major overarching areas -- humanities, social sciences, sciences? 6. Have you observed any big, or appreciable, gaps in any of the three major overarching areas – humanities, social sciences, sciences? 7. Have you observed that the discovery service leans toward one or a few particular content types (e.g. peer reviewed academic journal content; mainstream journal content; newspaper article content; conference proceedings content; academic open access content)? 8. Are there particular publishers whose content is either not incorporated, (or not adequately incorporated), into the central index, that you’d like to see included (e.g. Elsevier journal content)? 9. Have you received any feedback, positive or negative, from your institution’s faculty, related to the content coverage within the discovery service? 10. Taking all of the above questions into consideration, are you happy, satisfied, or dissatisfied with the scope of subject content, and formats covered, in the discovery platform’s central index? 11. In general, are you happy with the level of article level metadata associated with the returned INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 69 citation level results (that is, before one retrieves the complete full text). In other words, the product may incorporate basic citation level metadata (e.g. title, author, publication info), or it may include additional enrichment content, such as abstracts, author supplied keywords, etc. Overall, how happy do you sense your library staff is with the quality and amount of metadata provided for a “majority” of the article level content indexed in the system? Part 3 CONTENT: Your Local Library Resources 12. It’s presumed that your local library ILS bib records have been harvested into the discovery solution. Do you have any other local “homegrown” collections – hosted by other systems at your library or institution – whose content has been harvested into the discovery solution? Examples would include digital collection content, institutional repository content, library subject guide content, or other specialized, homegrown local database content. If so, please briefly describe the content – focus of collection, type of content (images, articles, etc.), and a ballpark number of items. If no local collections other than ILS bib record content have been harvested, please skip to question 15. 13. [For local collections other than ILS Bib Records]. Did you use existing, vendor provided ingestors to harvest the local record content (i.e. ingestors to transfer the record content, apply any transformations and normalizations to migrate the local content to the underlying discovery platform schema)? Or did you develop your own ingestors from scratch, or using a toolkit or application profile template provided by the vendor? 14. [For local collections other than ILS Bib Records]. Did you need extensive assistance from the discovery platform vendor to help harvest any of your local collections into the discovery index? 
If so, regardless of whether the vendor offered this assistance for free or charged a fee, were you happy with the level of service received from the vendor?

15. Do you feel your local content (including ILS bib records) is adequately "exposed" during a majority of searches? In other words, if your local harvested content equaled a million records, and the overall size of the discovery platform index was a hundred million records, do you feel your local content is "lost" for a majority of end-user searches, or adequately exposed?

Part 4 INTERFACE: General Satisfaction Level

16. Overall, how satisfied are you and your local library colleagues with the discovery service's interface?
17. Do you have any sense of how satisfied faculty at your institution are with the discovery service's interface? Have you received any positive or negative comments from faculty related to the interface?
18. Do you have any sense of how satisfied your (non-faculty) end users are with the discovery service's interface? Have you received any positive or negative comments from users related to the interface?
19. Have you conducted any end-user usability testing related to the discovery service? If so, can you provide the results, or otherwise some general comments on the results of these tests?
20. Related to searching, are you happy with the relevance of results returned by the discovery service? Have you noticed any consistent "goofiness" or surprises with the returned results? If you could make a change in the relevancy arena, what would it be, if anything?

Part 5 INTERFACE: Local Customization

21. Has your library performed what you might consider any "major customization" to the product? Or has it primarily been customizations such as naming the service and defining hyperlinks and the color scheme? If you've done more extensive customization, could you please briefly describe it, and was the product architecture flexible enough to allow you to do what you wanted to do? (Also see question 22 below, which is related.)
22. Is there any particular feature or function that is missing or non-configurable within the discovery service that you wish were available?
23. In general, are you happy with the "openness" or "flexibility" of the system in terms of how customizable it is by your library staff?

Part 6: FINAL THOUGHTS

24. Overall, do you feel your selection of this vendor's product was a good one? Do you sense that your users – students and faculty – have positively received the product?
25. Have you conducted any statistics review or analysis (through the discovery service statistics, link resolver statistics, etc.) that would indicate or at least suggest that the discovery service has improved the discoverability of some of your materials (whether local library materials or remotely hosted publisher content)?
26. If you have some sense of the competition in the vendor discovery marketplace, do you feel this product offers something above and beyond the other competitors in the marketplace? If so, what attracted you to this particular product; what made it stand out?

Appendix F. Discovery Task Force: Initial Vendor Investigation Questions

Section 1: General / Background Questions

1. Customer Install Base. How many current customers have implemented the product at their institution?
(the tool is currently available to users / researchers at that institution) How many additional customers have committed to the product? How many of these customers fall within our library type (e.g. higher ed academic, public, K-12)? 2. References Can you provide website addresses for live implementations which you feel serve as a representative model matching our library type? Can you provide references – the name and contact information for the lead individuals you worked with at several representative customer sites which match our library type? 3. Pricing Model, Optional Products Describe your pricing model for a library type such as ours, including initial upfront costs and ongoing costs related to the subscription and technical support. What optional add-on services or modules (federated search, recommender services, enrichment services) do you market which we should be aware of, related to and able to be integrated with your web scale discovery solution? 4. Technical Support and Troubleshooting Briefly describe options customers have, and hours of availability, for reporting mission critical problems; and for reporting observed non mission-critical glitches. Briefly describe any consulting services you may provide above and beyond support services offered as part of the ongoing subscription. (e.g. consulting services related to harvesting of a unique library resource for which an ingest/transform/normalize routine does not already exist). Is there a process for suggesting enhancement requests for potential future incorporation into the product? 5. Size of the Centralized Index. How many periodical titles does your preharvested, centralized index encompass? How many indexed items? 6. Statistics. Please describe what you feel are some of the more significant use, management or content related statistics available out-of-the-box with your system. INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 72 Are the statistics COUNTER compliant? 7. Ongoing Maintenance Activities, Local Library Staff. For instances where the interface and discovery service is hosted on your end, please describe any ongoing local library maintenance activities associated with maintaining the service for the local library’s clientele (e.g. maintenance of the link resolver database; ongoing maintenance associated with periodic local resource harvest updates; etc.) Section 2: Local Library Resources 8. Metadata Requirements and Existing Ingestors. What mandatory record fields for a local resource has to exist for the content to be indexed and discoverable within your platform (title, date)? Please verify that your platform has existing connectors -- ingest/transform/normalize tools and transfer mechanisms and/or application profiles for the following schema used by local systems at our library (e.g. MARC 21 bibliographic records; Unqualified / Qualified Dublin Core, EAD, etc.) Please describe any standard tools your discovery platform may offer to assist local staff in crosswalking between the local library database schema and the underlying schema within your platform. Our Library uses the ABC digital collection management software. Do you have any existing customers who also utilize this platform, whose digital collections have been harvested and are now exposed in their instance of the discovery product? Our Library uses the ABC institutional repository software. 
Do you have any existing customers who also utilize this platform, whose digital collections have been harvested and are now exposed in their instance of the discovery product?

9. Resource Normalization. Is content from both local and remote sources normalized to a single schema? If so, please offer comments on how local and remote (publisher/aggregator) content is normalized to this single underlying schema. To what degree can collections from different sources have their own unique field information which is displayed and/or figures into the relevancy ranking algorithm for retrieval purposes?

10. Schedule. For records hosted in systems at the local library, how often do you harvest information to account for record updates, modifications, and deletions? Can the local library invoke a manual harvest of locally hosted resource records on a per-resource basis (e.g., from a selected resource)? For example, if the library launches a new digital collection and wants the records to be available in the new discovery platform shortly after they are available in our local digital collection management system, is there a mechanism to force a harvest prior to the next regularly scheduled harvest routine? After harvesting, how long does it typically take for such updates, additions, and deletions to be reflected in the searchable central index?

11. Policies / Procedures. Please describe any general policies and procedures not already addressed which the local library should be aware of as they relate to the harvesting of local resources.

12. Consortial Union Catalogs. Can your service harvest or provide access to items within a consortial or otherwise shared catalog (e.g., the INN-REACH catalog)? Please describe.

Section 3: Publisher and Aggregator Indexed Content

13. Publisher/Aggregator Agreements: General. With approximately how many publishers have you forged content agreements? Are these agreements indefinite, or do they have expiration dates? Have you entered into any exclusive agreements with any publishers/aggregators (i.e., the publisher/aggregator is disallowed from forging agreements with competing discovery platform vendors, or disallowed from providing the same deep level of metadata/full text for indexing purposes)?

14. Comments on Metadata Provided. Could you please provide some general comments on the level of data provided to you, for indexing purposes, by the "majority" of major publishers/aggregators with which you have forged agreements? Please describe to what degree the following elements play a role in your discovery service:
a. "Basic" bibliographic information (article title/journal title/author/publication information)
b. Subject descriptors
c. Keywords (author supplied?)
d. Abstracts (author supplied?)
e. Full text

15. Topical Content Strength. Is there a particular content area that you feel the service covers especially well or leans heavily toward (e.g., humanities, social sciences, sciences)? Is there a particular content type that you feel the service covers very well or leans heavily toward (scholarly journal content, mainstream journal content, newspapers, conference proceedings)? In what subject/content areas, if any, do you feel the service may be somewhat weak? Are there current efforts to mitigate these weaknesses (e.g., future publisher agreements on the horizon)?

16. Major Publisher Content Agreements. Are there major publisher agreements that you feel are especially significant for your service?
If so, which publishers, and why (e.g. other discovery platform vendors may not have such agreements with those particular providers; the amount of content was so great that it greatly augmented the size and scope of your service; etc.) INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 74 17. Content Considered Key by Local Library (by publisher). Following is a list of some major publishers whose content the library licenses which is considered “key.” Has your company forged agreements with these publishers to harvest their materials. If so please describe in general the scope of the agreement. How many titles are covered for each publisher? What level of metadata are they providing to you for indexing purposes (e.g. basic citation level metadata – title, author, publication date; abstracts; full text). A. ex. Elsevier B. ex. Sage C. ex. Taylor and Francis D. ex. Wiley / Blackwell 18. Content Considered Key by Local Library (by title). Following is a list of some major journal / newspaper titles whose content the library licenses which is considered “key.” Could you please indicate if your central index includes these titles, and if so, the level of indexing (e.g. basic citation level metadata – title, author, publication date; abstracts; full text). A. ex. Nature B. ex. American Historical Review C. ex. JAMA D. ex. Wall Street Journal 19. Google Books / Google Scholar. Do any agreements exist at this time to harvest the data associated with the Google Books or Google Scholar projects into your central index? If so, could you please describe the level of indexing (e.g. basic citation level metadata – title, author, publication date; abstracts; full text). 20. Worldcat Catalog. Does your service include the OCLC WorldCat catalog records? If so, what level of information is included? The complete record? Holdings information? 21. E-Book Vendors. Does your service include items from major e-book vendors? 22. Record Information. Given the fact that the same content (e.g. metadata for a unique article) can be provided by multiple sources (e.g. the original publisher of the journal itself, an open access repository, a database / aggregator, another database / aggregator, etc.), please provide some general comments on how records are built within your discovery service. For example: A. You have an agreement with a particular publisher/aggregator and they agree to provide you with rich metadata for their content, perhaps even provide you with indexing they’ve already done for their content, and may even provide you with the full text for you to be able to “deep index” their content. B. You’ve got an agreement with a particular publisher who happens to be the ONLY publisher/provider of that content. They may provide you rich info, or they may provide you rather weak info. In any case, you choose to incorporate this into your service, as they are the only provider/publisher of the info. Or, INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 75 alternately, they may not be the only publisher/provider of the info, but they are the only publisher/provider you’ve currently entered into an agreement with for that content. C. For some items appearing within your service, content for those items is provided by multiple different sources whom you’ve made agreements with. In short, there will be in some/many cases of overlap for unique items, such as a particular article title. 
In such cases, do you create a “merged/composite/super record” -- where your service utilizes particular metadata from each of the multiple sources, creating a “strong” single record built from these multiple resources. 23. Deduping. Related to the question immediately above, please describe your services’ approach (or not) to deduplicating items in your central index. If your service incorporates content for a same unique item from more than one content provider, does your index retrieve and display multiple instances of the same title? Or do you create a merged/composite/super record, and only this single record is displayed? Please describe. Section 4: Open Access Content 24. Open Access Content Sources. Does your service automatically include (out of the box, no additional charge) materials from open access repositories? If so, could you please list some of the major repositories included (e.g. arXiv E-prints; Hindawi Publishing Corporation; the Directory of Open Access Journals; Hathi Trust Materials; etc.). 25. Open Access Content Sources: Future Plans. In addition to the current open access repositories that may be included in your service, are there other repositories whose content you are planning to incorporate in the future? 26. Exposure to other Libraries’ Bibliographic / Digital Collection / IR Content. Are ILS bibliographic records from other customers using your discovery platform exposed for discoverability in the searchable discovery instance of another customer? Are digital collection records? Institutional repository records? Section 5: Relevancy Ranking 27. Relevancy Determination. Please describe some of the factors which comprise the determination of relevancy within your service. What elements play a role, and how heavily are they weighted for purposes of determining relevancy? 28. Currency. Please comment on how heavily currency of an item plays in relevancy determination. Does currency weigh more heavily for certain content types (e.g. newspapers)? 29. Local Library Influence. Does the local library have any influence or level of control over the relevancy algorithm? Can they choose to “bump up” particular items for a search? Please describe. 30. Local Collection Visibility. Could you please offer some comments on how local content (e.g. ILS bibliographic records; digital collections) remains visible and discoverable within the larger pool of content indexed by your service? For example, local content may measures a million items, and your centralized index may cover half a billion items. INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 76 31. Exposure of Items with Minimal Metadata. Some items likely have lesser metadata than other items. Could you please offer some comments on how your system ensures discoverability for items with lesser or minimal metadata. 32. Full Text Searching. Does your service offer the capability for the user to search the fulltext of materials in your service (i.e. are they searching a full text keyword index?) If so, approximately what percentage of items within your service are “deep indexed?” 33. Please describe how your system deals when no hits are retrieved for a search. Does your system enable “best-match” retrieval – that is, something will always be returned or recommended? What elements play into this determination; how is the user prevented from having a completely “dead-end” search? Section 6: Authentication and Rights Management 34. Open / Closed Nature of Your Discovery Solution. 
Does your system offer an unauthenticated view / access? Please describe and offer some comments on what materials will not be discoverable/visible for an unauthenticated user.
A. Licensed full text
B. Records specifically or solely sourced from abstracting and indexing databases
C. Full citation information (e.g., an unauthenticated user may see just a title; an authenticated user would see fuller citation information)
D. Enrichment information (such as book cover images, tables of contents, abstracts, etc.)
E. Other

35. Exposure of Non-licensed Resource Metadata. If one weren't to consider and take into account ANY e-journal/publisher package/database subscriptions and licenses the local library pays for, is there a base index of citation information that's exposed and available to all subscribers of your discovery service? This may include open access materials and/or bibliographic information for some publisher/aggregator content (which often requires a local library license to access the full text). Please describe. Would a user need to be authenticated to search (and retrieve results from) this "base index"? Approximately how large is this "base index" which all customers may search, regardless of local library publisher/aggregator subscriptions?

36. Rights Management. Please discuss how rights management is initialized and maintained in your system, for purposes of determining whether a local library user should have access to the full text (or otherwise "full resolution" if a library doesn't license the full text – such as resolution to a detailed citation/abstract). Our library uses the ABC link resolver. Our library uses the ABC A-Z journal listing service. Our library uses the ABC electronic resource management system. Is your discovery solution compatible with one/all of these systems for rights management purposes? Is one approach preferable to the other, or does your approach explicitly depend on one of these particular services?

Section 7: User Interface

37. Openness to Local Library Customization. Please describe how "open" your system is to local library customization. For example, please comment on the local library's ability to:
A. Rename the service
B. Customize the header and footer hyperlinks / color scheme
C. Choose which facet clusters appear
D. Define new facet clusters
E. Embed the search box in other venues
F. Create canned, pre-customized searches for an instance of the search box
G. Define and promote a collection, database, or item such that it appears at the top or on the first page of any search
I. Develop custom "widgets" offering extra functionality or download "widgets" from an existing user community (e.g., image retrieval widgets such as Flickr integration; library subject guide widgets such as LibGuides integration; etc.)
J. Incorporate links to external enriched content (e.g., Google Book Previews; Amazon.com item information)
K. Other

38. Web 2.0 Social Community Features. Please describe some current web 2.0 social features present in your discovery interface (e.g., user tagging, ratings, reviews, etc.). What, if any, plans do you have to offer or expand such functionality in future releases?

39. User Accounts. Does your system offer user accounts? If so, are these mandatory or optional? What services does this user account provide?
A. Save a list of results to return to at a later time?
B. Save canned queries for later searching?
C.
See a list of recently viewed items? D. Perform typical ILS functions such as viewing checked out items / renewals / holds? E. Create customized RSS feeds for a search 40. Mobile Interface. Please describe the mobile interfaces available for your product. Is it a browser based interface optimized for smallscreen devices? Is it a dedicated iPhone, Android, or Blackberry based executable application? 41. Usability Testing. Briefly describe how your product incorporates published, established “best practices” in terms of a customer focused, usable interface. What usability testing have your performed and/or do you conduct on an ongoing basis? Have any other customers that have gone live with your service completed usability testing that you’re aware of? INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 79 Appendix G: Vendor Websites and Example Implementations OCLC WorldCat Local www.oclc.org/us/en/worldcatlocal/default.htm Example Implementations: Lincoln Trails Library System www.lincolntrail.info/linc.html University of Delaware www.lib.udel.edu University of Washington www.lib.washington.edu Willamette University http://library.willamette.edu Serials Solutions Summon www.serialssolutions.com/summon Example Implementations: Dartmouth College www.dartmouth.edu/~library/home/find/summon Drexel University www.library.drexel.edu University of Calgary http://library.ucalgary.ca Western Michigan University http://wmich.summon.serialssolutions.com Ebsco Discovery Services www.ebscohost.com/discovery Example Implementations: James Madison University www.lib.jmu.edu Mississippi State University http://library.msstate.edu Northeastern University www.lib.neu.edu University of Oklahoma http://libraries.ou.edu INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 80 Innovative Interfaces Encore Synergy encoreforlibraries.com/tag/encore-synergy Example Implementations: University of Nebraska-Lincoln http://encore.unl.edu/iii/encore/home?lang=eng University of San Diego http://sallypro.sandiego.edu/iii/encore/home?lang=eng Scottsdale Public Library http://encore.scottsdaleaz.gov/iii/encore/home?lang=eng Sacramento Public Library http://find.saclibrarycatalog.org/iii/encore/home?lang=eng Ex Libris Primo Central www.exlibrisgroup.com/category/PrimoCentral Example Implementations: (Note: Example implementations are listed in alphabetical order. Some implementations are more open to search by an external audience, based on configuration decisions at the local library level.) Brigham Young University ScholarSearch www.lib.byu.edu (Note: Choose All-in-One Search) Northwestern University http://search.library.northwestern.edu Vanderbilt University DiscoverLibrary http://discoverlibrary.vanderbilt.edu (Note: Choose Books, Media, and More) Yonsei University (Korea) WiSearch: Articles + Library Holdings http://library.yonsei.ac.kr/main/main.do (Note: Choose the Articles + Library Holdings link. The interface is available in both Korean and English; to change to English, select English at the top right of the screen after you have conducted a search and are within the Primo Central interface) INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2012 81 Appendix H. Vendor Visit Questions Content 1. Please speak to how well you feel your product stacks up against the competition in terms of the LICENSED full-text / citation content covered by your product. 
Based on whatever marketplace or other competitive analysis you may have done, do you feel the agreements you’ve made with publishers equal, exceed, or trail the agreements other competitors have made? 2. From the perspective of an academic library serving undergraduate and graduate students as well as faculty, do you feel that there are particular licensed content areas your product covers very well (e.g. humanities, social sciences, sciences). Do you feel there are areas which you need to build up? 3. What’s your philosophy going forward in inking future agreements with publishers to cover more licensed content? Are there particular key publishers your index currently doesn’t include, but whom you are in active negotiations with? 4. We have several local content repositories, such as our digital collections in CONTENTdm, our growing IR repository housed in bePress, and locally developed, web-searchable mySQL databases. Given the fact that most discovery platforms are quite new, do you already have existing customers harvesting their local collections, such as the above, into the discovery platform? Have any particular, common problems surfaced in their attempts to get their local collections searchable and exposed in the discovery platform? 5. Let’s say the library subscribes to an ejournal title – Journal of Animal Studies -- that’s from a publisher with whom you don’t have an agreement for their metadata, and thus, supposedly, don’t index. If a student tried to search for an article in this journal – “Giraffe Behavior During the Drought Season,” what would happen? Is this content still somehow indexed in your tool? Would the discovery platform invoke our link resolver? Please describe. 6. Our focus is your next generation discovery platform, and NOT on your “traditional” federated search product which may be able to cover other resources not yet indexed in your next generation discovery platform. That said, please BRIEFLY describe the role of your federated search product vis a vis the next generation discovery platform. Do you see your federated search product “going away” once more and more content is eventually indexed in your next generation discovery platform? End User Interface & Functionality 7. Are there any particular or unique LOOK and FEEL aspects of your interface that you feel elevate your product above your competitors? If so, please describe. 8. Are there any particular or unique FUNCTIONALITY aspects of your product that you feel elevate it above the competition (e.g. presearch or postsearch refinement categories, export options, etc.) 9. Studies show that end users want very quick access to full text materials such as electronic journal articles and ebooks. What is your product’s philosophy in regards to this? Does your platform, in your opinion, provide seamless, quick access to full text materials, with a minimum of confusion? Please describe. INVESTIGATIONS INTO LIBRARY WEB-SCALE DISCOVERY SERVICES | VAUGHAN 82 Related to this, does your platform de-dupe results, or is the user presented with a list of choices for a single, particular journal article they are trying to retrieve? In addition, please describe a bit how your relevancy ranking works for returned results. What makes an item appear first or on the first page of results? 10. Please describe how “well” your product integrates with the library’s OPAC (in our case, Innovative’s Millennium OPAC). 
What information about OPAC holdings can be viewed directly in the discovery platform without clicking into the catalog and opening a new screen (e.g., call number, availability, enriched content such as table of contents or book covers)? In addition, our OPAC uses "scopes" which allow a user – if they choose – to limit at the outset (prior to a search being conducted) what collection they are searching. In other words, these scopes are location based, not media-type based. For our institution, we have a scope for the main library, one for each of our three branch libraries, and a scope for the entire UNLV collection. Would your system be able to incorporate or integrate these pre-existing scopes in an advanced search mode? And/or could these location-based scopes appear as facets which a user could use to drill down a results list?

11. What is your platform's philosophy in terms of "dead-end searches"? Does such a thing exist with your product? Please describe what happens if a user (a) misspells a word, or (b) searches for a book or journal title / article that our library doesn't own/license, but that we could acquire through interlibrary loan.

Staff "Control" over the End User Interface

12. How "open" is your platform to customization or interface design tweaks desired by the library? Are there any particular aspects that the library can customize with your product that you feel elevate it above your competitors (e.g., defining facet categories; completely redesigning the end-user interface with colors, links, logos; etc.)? What are the major things customizable by the library, and why do you think this is something important that your product offers?

13. How "open" is your platform to porting over to other access points? In other words, provided appropriate technical skills exist, can we easily embed the search box for your product into a different webpage? Could we create a "smaller," more streamlined version of your interface for smartphone access?

Overarching Question

14. In summary, what are some of the chief differentiators of your product from the competition? Why is your product the best and most worthy of serious consideration?

Editor's Comments
Bob Gerrity

Welcome to the first issue of Information Technology and Libraries (ITAL) as an open-access, e-only publication. As announced to LITA members in early January, this change in publishing model will help ensure the long-term viability of ITAL by making it more accessible, more current, more relevant, and more environmentally friendly.
ITAL will continue to feature high-quality articles that have undergone a rigorous peer-review process, but it will also begin expanding content to include more case studies, commentary, and information about topics and trends of interest to the LITA community and beyond. Look for a new scope statement for ITAL shortly. We’re pleased to include in this issue the winning paper from the 2011 LITA/Ex Libris Student Writing Award contest, Abigail McDermott’s overview on copyright law. We also have two lengthier-than-usual studies on library discovery services. The first, Jason Vaughan’s overview of his library’s investigations into web-scale discovery options, was accepted for publication more than a year ago, but due to its length did not make it into “print” until now, since we no longer face the constraints associated with the production of a print journal. The second study, by Jody Condit Fagan and colleagues at James Madison University, focuses on discovery-tool usability. Jimmy Ghaphery and Erin White provide a timely overview of the results of their surveys on the use and management of web-based research guides. Tomasz Neugebauer and Bin Han offer a strategy and workflow for batch importing electronic theses and dissertations (ETDs) into an EPrints repository. With the first open-access, e-only issue launched, our attention will be turned to updating and improving the ITAL website and expanding the back content available. Our goal is to have all of the back issues of both ITAL and its predecessor, Journal of Library Automation (JOLA), openly available from the ITAL site. We’ll also be exploring ways to better integrate the ITALica blog and the ITAL preprints site with the main site. Suggestions and feedback are welcome, at the e-mail address below. Bob Gerrity (robert.gerrity@bc.edu) is Associate University Librarian for Information Technology, Boston College Libraries, Chestnut Hill, Massachusetts. 3124 ---- 162 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 Within that goal are two strategies that lend them- selves to the topics including playing a role with the Office for Information Technology Policy (OITP) with regard to technology related public policy and actively participating in the creation and adoption of international standards within the library community. Colby Riggs (University of California–Irvine) rep- resents LITA on the Office for Information Technology Policy Advisory Committee. She also serves on the LITA Technology Access Committee, which addresses similar issues. The committee is chaired by Elena M. Soltau (Nova Southeastern University). The Standards Interest Group is chaired by Anne Liebst (Linda Hall Library of Science, Engineering, and Technology). Yan Han (University of Arizona) chairs the Standards Task Force, which was charged to explore and recommend strategies and initiatives LITA can implement to become more active in the creation and adoption of new technol- ogy standards that align with the library community. The task force will submit their final report before the 2011 ALA Midwinter Meeting. For ongoing information about LITA committees, interest groups, task forces, and activities being imple- mented on these and related topics, be sure to check out ALA Connect (http://connect.ala.org/) and the LITA website (http://www.lita.org). The LITA electronic dis- cussion list is there to pose questions you might have. 
LITA members have an opportunity to advocate and participate in a leadership role as the broadband initia- tive sets the infrastructure for the next ten to fifteen years. LITA members are encouraged to pursue these opportu- nities to ensure a place at the table for LITA, its members, and libraries. B y now, most LITA members have likely heard about the Broadband Technology Opportunities Program (BTOP) and the National Broadband Plan. The federal government is allocating grants to the states to develop their broadband infrastructure, and libraries are receiving funding to implement and expand computing in their local facilities. By September 30, 2010, the National Telecommunications and Information Administration (NTIA) will have made all BTOP awards. Information about these initiatives can be found at www2.ntia.doc.gov (BTOP), www.broadband.gov (National Broadband Plan), and www.ala.org/ala/aboutala/offices/oitp/index.cfm (ALA Office for Information Technology Policy). On September 21, 2010, a public forum was held in Silicon Valley to discuss E-Rate modernization and innovation in education. The conversation addressed the need to prepare schools and public libraries for broad- band. Information about the forum is archived at blog .broadband.gov. Established in 1996, the E-Rate program has provided funding for K–12 schools and public librar- ies for telecommunications and Internet access. The program was successful in a dial-up world. It is time to now address broadband access which is not ubiquitous on a national basis. While the social norm suggests that technology is everywhere and everyone has the skills to use it, there is still plenty of work left to do to ensure that people can use technology and compete in an increasingly digital and global world. How does LITA participate? The new strategic plan includes an advocacy and policy goal that calls for LITA to advocate for and participate in the adoption of legislation, policies, technologies, and standards that promote equitable access to information and technology. Karen J. starr (kstarr@nevadaculture.org) is liTa President 2010–11 and assistant administrator for library and Develop- ment Services, nevada State library and archives, carson city. Karen J. Starr President’s Message: BTOP, Broadband, E-Rate, and LITA 3125 ---- eDitOriAl | truitt 163 ■■ The Space in Between In my opinion, ITAL has an identity crisis. It seems to try in many ways to be scholarly like JASIST, but LITA simply isn’t as formal a group as ASIST. On the other end of the spectrum, Code4Lib is very dynamic, infor- mal and community-driven. ITAL kind of flops around awkwardly in the space in between. —comment by a respondent to ITAL’s reader survey, December 2009 Last December and January, you, the readers of Information Technology and Libraries were invited to participate in a survey aimed at helping us to learn your likes and dis- likes about ITAL, and where you’d like to see this journal go in terms of several important questions. The responses provide rich food for reflection about ITAL, its readers, what we do well and what we don’t, and our future directions. Indeed, we’re still digesting and discussing them, nearly a year after the survey. I’d like to use some of my editorial space in this issue to introduce, provide an overview, and highlight a few of the most interesting results. 
I strongly encourage you to access the full survey results, which I’ve posted to our weblog ITALica (http:// ital-ica.blogspot.com/); I further invite you to post your own thoughts there about the survey results and their meaning. We ran the survey from mid-December to mid-January. A few responses trickled in as late as mid-February. The survey invitation was sent to the 2,614 LITA personal mem- bers; nonmembers and ITAL subscribers (most of whom are institutions) were excluded. We ultimately received 320 responses—including two from individuals who con- fessed that they were not actually LITA members—for a response rate of 12.24 percent. Thus the findings reported below reflect the views of those who chose to respond to the survey. The response rate, while not optimal, is not far from the 15 percent that I understand LITA usually expects for its surveys. As you may guess, not all respondents answered all questions, which accounts for some small discrepancies in the numbers reported. Who are we? In analyzing the survey responses, one of the first things one notices is the range and diversity of ITAL’s reader base, and by extension, of LITA’s membership. The larg- est groups of subscribers identify themselves either as traditional systems librarians (58, or 18.2 percent) or web services/development librarians (31, or 9.7 percent), with a further cohort of 7.2 percent (23) composed of those working with electronic resources or digital projects. But more than 20 percent (71) come from the ranks of library directors and associate directors. Nearly 15 percent (47) identify their focus as being in the areas of reference, cataloguing, acquisitions, or collection development. See figure 1. The bottom line is that more than a third of our read- ers are coming from areas outside of library IT. A couple of other demographic items: ■■ While nearly six in ten respondents (182, or 57.6 percent) work in academic libraries, that still leaves a sizable number (134, or 42.3 percent) who don’t. More than 14 percent (45) of the total 316 respondents come from the public library sector. ■■ Nearly half (152, or 48.3 percent) of our readers indi- cated that they have been with LITA for five years or fewer. Note that this does not necessarily indicate the age or number of years of service of the respondents, but it’s probably a rough indicator. Still, I confess that this was something of a surprise to me, as I expected larger numbers of long-time members. And how do the numbers shake out for us old geezers? The 6–10 and greater-than-15-years cohorts each composed about 20 percent of those responding; interestingly, only 11.4 percent (36) answered that they’d been LITA members for between 11 and 15 years. Assuming that these numbers are an accurate reflection of LITA’s membership, I can’t help but wonder about the expla- nation for this anomaly.” See figure 2. How are we doing? Question 4 on the survey asked readers to respond to several statements: “it is important to me that articles in ITAL are peer- reviewed.” More than 75 percent (241, or 77.2 percent) answered that they either “agreed” or “strongly agreed.” “ITAL is timely.” More than seven in ten respondents (228, or 73.0 percent) either “agreed” or “strongly agreed” that ITAL is timely. Only 27 (8.7 percent) disagreed. 
As a technology-focused journal, where time-to-publication is always a sensitive issue, I expected more dissatisfaction on this question (and no, that doesn’t mean that I don’t worry about the nine percent who believe we’re too slow out of the gate). Marc Truitt Editorial: The Space in Between, or, Why ITAL Matters Marc truitt (marc.truitt@ualberta.ca) is associate university librarian, Bibliographic and information Technology Services, university of alberta libraries, Edmonton, alberta, canada, and Editor of ITAL. 164 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 would likely quit LITA, with narrative explanations that clearly underscore the belief that ITAL—especially a paper ITAL—is viewed by many as an important benefit of membership. The following comments are typical: ■■ “LITA membership would carry no benefits for me.” ■■ “Dues should decrease, though.” [from a respon- dent who indicated he or she would retain LITA “i use information from ITAL in my work and/ or i find it intellectually stimulating.” By a nearly identical margin to that regarding timeliness, ITAL readers (226, or 72.7 percent) either “agreed” or “strongly agreed” that they use ITAL in their work or find its contents stimulating. “ITAL is an important benefit of litA mem- bership.” An overwhelming majority (248, or 79.78 percent) of respondents either “agreed” or “strongly agreed” with this statement.1 This perception clearly emerges again in responses to the questions about whether readers would drop their LITA membership if we produced an electronic-only or open-access ITAL (see below). Where should we be going? Several questions sought your input about different options for ITAL as we move for- ward. Question 7, for example, asked you to rank how frequently you access ITAL content via several channels, with the choices being “print copy received via membership,” “print copy received by your institution/library,” “electronic copy from the ITAL website,” or “electronic copy accessed via an aggrega- tor service to which your institution/library subscribes (e.g., Ebsco).” The choice most fre- quently accessed was the print copy received via membership, at 81.1 percent (228). Question 8 asked about your preferences in terms of ITAL’s publication model. Of the 307 responses, 60.6 percent (186) indicated a preference for continuance of the present arrangement, whereby we publish both paper and electronic versions simultaneously. Four in ten respondents preferred that ITAL move to publication in electronic version only.2 Of those who favored continued availability of paper, the great majority (159, or 83.2 per- cent) indicated in question 9 that they simply preferred reading ITAL in paper. Those who advocate moving to electronic-only do so for more mixed reasons (question 10), the most popular being cost-effectiveness, timeliness, and the environmen- tal friendliness of electronic publication. A final question in this section asked that you respond to the statement “If ITAL were to become an electronic-only publication I would continue as a dues-paying member of LITA.” While a reassuring 89.8 percent (273) of you answered in the affirmative, 9.5 percent (29) indicated that you Figure 2. Years of LITA Membership Figure 1. Professional Position of LITA Members 18.2% (58) 0.3% (1) 0.6% (2) 0.6% (2) 0.9% (3) 2.2% (7) 2.5% (8) 3.1% (10) 4.1% (13) 4.4% (14) 6.3% (20) 7.9% (25) 9.4% (30) 9.7% (31) 12.9 % (41) 16.7% (53) 0% 5% 10% 15% 20% Systems Librarian (includes responsibility for ILS, servers, workstat... 
Other (please specify) Library Director Web Services/Development Librarian Deputy/Associate/Assistant Director Reference Services Librarian Cataloging Librarian Consortium/Network/Vendor Librarian Electronic Resources Librarian Digital Projects/Digitization Librarian Student Teaching Faculty Computing Professional (non-MLS) Resource Sharing Librarian Acquisitions/Collection Development Librarian Other Library Staff (non-MLS) 11.4% (36) 19.7% (62) 20.0% (63) 48.3% (152) 0% 10% 20% 30% 40% 5 years or less 11–15 years 6–10 years more than 15 years eDitOriAl | truitt 165 his lipstick-on-a-pig ILS. Somewhere else there’s a library blogger who fends off bouts of insomnia by reading “wonky” ITAL papers in the wee hours of the morning. And that ain’t the half of it, as they say. In short—in terms of readers, interests, and prefer- ences—“the space in between” is a pretty big niche for ITAL to serve. We celebrate it. And we’ll keep trying our best to serve it well. ■■ Departures As I write these lines in late-September, it’s been a sad few weeks for those of us in the ITAL family. In mid-August, former ITAL editor Jim Kopp passed away following a battle with cancer. Last week, Dan Marmion—Jim’s suc- cessor as editor (1999–2004)—and a dear friend to many of us on the current ITAL editorial board—also left us, the victim of a malignant brain tumor. I never met Jim, but LITA President Karen Starr eulogized him in a posting to LITA-L on August 16, 2010.3 I noted Dan’s retirement due to illness in this space in March.4 I first met Dan in the spring of 2000, when he arrived at Notre Dame as the new associate director for Information Systems and Digital Access (I think the position was dif- ferently titled then) and, incidentally, my new boss. Dan arrived only six weeks after my own start there. Things at Notre Dame were unsettled at the time: the Libraries had only the year before successfully implemented ExLibris’ Aleph500 ILS, the first North American site to do so. While ExLibris moved on to implementations at McGill and the University of Iowa, we at Notre Dame struggled with the challenges of supporting and upgrading a system then new to the North American market. It was not always easy or smooth, but throughout, Dan always maintained an unflappable and collegial manner with ExLibris staff and a quiet but supportive demeanor toward those of us who worked for him. I wish I could say that I understood and appreciated this better at the time, but I can’t. I still had some growing ahead of me—I’m sure that I still do. Dan was there for me again as an enthusiastic refer- ence when I moved on, first to the University of Houston in 2003 and then to the University of Alberta three years later. In these jobs I’d like to think I’ve come to under- stand a bit better the complex challenges faced by senior managers in large research libraries; in the process, I know I’ve come to appreciate Dan’s quiet, knowledge- able, and hands-off style with department managers. It is one I’ve tried (not always successfully) to cultivate. 
While I was still at Notre Dame, Dan invited me to join the editorial board of Information Technology and Libraries, a group which over the years has come to include many “Friends of Dan,” including Judith Carter (quite possibly the world’s finest managing editor), Andy Boze (ITAL’s membership] ■■ “ITAL is the major benefit to me as we don’t have funds for me to attend LITA meetings or training sessions.” ■■ “The paper journal is really the only membership benefit I use regularly.” ■■ “Actually my answer is more, ‘I don’t know.’ I really question the value of my LITA membership. ITAL is at least some tangible benefit I receive. Quite hon- estly, I don’t know that there really are other benefits of LITA membership.” Question 12 asked about whether ITAL should con- tinue with its current delayed open-access model (i.e., the latest two issues embargoed for non-LITA members), or go completely open-access. By a three-to-two margin, readers favored moving to an open-access model for all issues. In the following question that asked whether respondents would continue or terminate LITA mem- bership were ITAL to move to a completely open-access publication model, the results were remarkably similar to those for the question linking print availability to LITA membership, with the narrative comments again suggest- ing much the same underlying reasoning. In sum, the results suggest to me more satisfaction with ITAL than I might have anticipated; at the same time, I’ve only scratched the surface in my comments here. The narrative answers in particular—which I have touched on in only the most cursory fashion—have many things to say about ITAL’s “place,” suggestions for future articles, and a host of other worthy ideas. There is as well the whole area of crosstabbing: some of the questions, when analyzed with reference to the demographic answers in the beginning of the survey, may highlight entirely new aspects of the data. Who, for instance, favors continuance of a paper ITAL, and who prefers electronic-only? But to come back to that reader’s comment about ITAL and “the space in between” that I used to frame this discussion (indeed, this entire column): to me, the demographic responses—which clearly show ITAL has a substantial readership outside of library IT—suggest that that “space in between” is precisely where ITAL should be. We may or may not occupy that space “awkwardly,” and there is always room for improvement, although I hope we do better than “flop around”! The results make clear that ITAL’s readers—who would be you!—encompass the spectrum from the tech-savvy early-career reader of Code4Lib Journal (electronic-only, of course!) to the library administrator who satisfies her need for technol- ogy information by taking her paper copy of ITAL along when traveling. Elsewhere on that continuum, there are reference librarians and catalogers wondering what’s new in library technology, and a traditional systems librarian pondering whether there is an open-source discovery solution out there that might breathe some new life into 166 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 between membership and receiving the journal. Many of them appear to infer that a portion of their LITA dues, then, are ear- marked for the publication and mailing of ITAL. Sadly, this is not the case. In years past, ITAL’s income from advertising paid the bills and even generated additional revenue for LITA coffers. 
Today, the shoe is on the other foot because of declining advertis- ing revenue, but ITAL is still expected to pay its own way, which it has failed to do in recent years. But to those who reasonably believe that some portion of their dues is dedicated to the sup- port of ITAL, well, t’ain’t so. Bothered by this? Complain to the LITA board. 2. As a point of comparison, consider the following results from the 2000 ITAL reader survey. Respondents were asked to rank several publishing options on a scale of 1 to 3 (with 1 = most preferred option and 3 = least preferred option): ITAL should be published simultaneously as a print-on- paper journal and an electronic journal (N = 284): 1 = 169 (59.5%); 2 = 93 (32.7%); 3 = 22 (7.7%) ITAL should be published in an electronic form only (N = 293): 1 = 55 (18.8%); 2 = 61 (20.8%); 3 = 177 (60.4%) In other words, then as now, about 60% of readers preferred paper and electronic to electronic-only. 3. Karen Starr, “FW: [Libs-Or] Jim Kopp: Celebration of Life,” online posting, Aug. 16, 2010, LITA-L, http://lists.ala. org/sympa/arc/lita-l/2010-08/msg00079.html (accessed Sept. 29, 2010). 4. Marc Truitt, “Dan Marmion,” Information Technology & Libraries 29 (Mar. 2010): 4, http://www.ala.org/ala/mgrps/ divs/lita/ital/292010/2901mar/editorial_pdf.cfm (accessed Sept. 29, 2010). webmaster), and Mark Dehmlow. While Dan left ITAL in 2004, I think that he left the journal a wonderful and last- ing legacy in these extremely capable and dedicated folks. My fondest memories of Dan concern our shared pas- sion for model trains. I remember visiting a train show in South Bend with him a couple of times, and our last time together (at the ALA Midwinter Meeting in Denver two years ago) was capped by a snowy trek with ExLibris’ Carl Grant, another model train enthusiast, to the Mecca of model railroading, Caboose Hobbies. Three boys off to see their toys—oh, exquisite bliss! I don’t know whether ITAL or its predecessor JOLA have ever reprinted an editorial, but while searching the archives to find something that would honor both Jim and Dan, I found a piece that I hope speaks eloquently of their contributions and to ITAL’s reason for being. Dan’s edito- rial, “Why Is ITAL Important?” originally published in our June 2002 issue, appears again immediately following this column. I think its message and the views expressed therein by Jim and Dan remain as valid today as they were in 2002. They also may help to frame my comments concerning our reader survey in the previous section. Farewell, Jim and Dan. You will both be sorely missed. Notes and References 1. A number of narrative answers to the survey make it clear that ITAL readers who are LITA members perceive a link 3126 ---- eDitOriAl | MArMiON 167 Dan MarmionEditorial: Why Is ITAL Important? Editor’s Note: What follows is a reprint of Dan Marmion’s editorial from ITAL 20, no. 2 (2001), http://www.ala.org/ ala/mgrps/divs/lita/ital/2002editorial.cfm. After reading, we ask you to consider: Why does ITAL matter to you? Post your thoughts on ITALica (http://ital-ica.blogspot .com/). S ome time ago I received an e-mail from a library school student, who asked me “Why is [ITAL] important in the library profession?” I answered the question in this way: ITAL is important to the library profession for at least four reasons. 
First, while it is no longer the only publication that addresses the use of technology in the library profession, it is the oldest (dating back to 1968, when it was founded as the Journal of Library Automation) and, we like to think, most distinguished. Second, not only do we publish on a myriad of topics that are pertinent to technology in libraries, we publish at least three kinds of articles on those subjects: pure scholarly articles that give the results of empirical research done on topics of importance to the profes- sion, communications from practitioners in the field that present real-world experiences from which other librarians can profit, and tutorials on specific subjects that teach our readers how to do useful things that will help them in their everyday jobs. The book and software reviews that are in most issues are added bonuses. Third, it is the “official” publication of LITA, the only professional organization devoted to the use of information technology in the library profession. Fourth, it is a scholarly, peer-reviewed journal, and as such is an important avenue for many academic librar- ians whose career advancement depends in part on their ability to publish in this type of journal. In a sen- tence, then, ITAL is important to the library profession because it contributes to the growth of the profession and its professionals. After sending my response, I thought it would be interesting to see what some other people with close asso- ciations to the journal would add. Thus I posed the same question to the editorial board and to the person who preceded me as editor. Here are some of their comments: One of the many things that was not traditionally taught in library school was a systematic approach to problem solving—for somebody who needs to acquire this skill and doesn’t have a mentor handy, ITAL is a wonderful resource. Over and over again, ITAL describes how a problem was identified and defined, explains the techniques used to investigate it, and details the conclusions that might fairly be drawn from the results of the investigation. Few other journals so effectively model this approach. Regardless of the specific subject of the article, the opportunity to see practical problem solving techniques demonstrated is always valuable. (Joan Frye Williams) The one thing I would add to your points, and it ties into a couple of them, is that by some definitions a “profession” is one that does have a major publica- tion. As such, it is not only the “official” publication of LITA but an identity focus for those professionals in this particular area of librarianship. In fact, ideally, I would like to think that’s more of a reason why ITAL is important than just the fact that it’s a perk of LITA membership. (Jim Kopp) Real world experiences from which other librarians would profit—to use your own words. That is my primary reason for reading it, although I take note of tutorials as well. And the occasional book review here may catch my eye as it is likely more detailed that what might appear in LJ or Booklist, and [I would] be more likely to purchase it for either my office or for the gen- eral collection. (Donna Cranmer) ITAL begins as the oldest and best-established journal for refereed scholarly work in library automation and information technology, a role that by itself is impor- tant to libraries and the library profession. ITAL goes beyond that role to add high-quality work that does not fit in the refereed-paper mold, helping librarians to work more effectively. 
As the official publication of America's largest professional association for library and information technology, ITAL assures a broad audience for important work—and, thanks to its cost-recovery subscription pricing, ITAL makes that work available to nonmembers at prices far below the norm for scholarly publishing. (Walt Crawford)

The journal serves as an historical record/documentation and joins its place with many other items that together record the history of mankind. A professional/scholarly journal has a presumed life that lasts indefinitely. (Ken Bierman)

ITAL is a formal, traditional, and standardized way of sharing ideas within a specific segment of the library community. Librarianship is an institutional profession. As an institution it is an organic organization requiring communication between its members. An advantage of written communication, especially paper-based written communication, is its ability to transcend space and time. A written document can communicate an idea long after the author has died and half way around the world. Yes, electronic communication can do the same thing, but electronic communication is much more fragile than ideas committed to paper. ITAL provides one means of fostering this communication in a format that is easily usable and recognizable. It is not the only communications format, but it fills a particular niche. (Eric Lease Morgan)

In a sentence, ITAL is important to the profession because "Communication is the key to our success." So there you have the thoughts of the editor and a few other folks as to why this journal is important.

* * *

Why does ITAL matter to you? Post your thoughts on ITALica (http://ital-ica.blogspot.com/).

Dan Marmion was editor of ITAL, 1999–2004. This editorial was first published in the June 2002 issue of ITAL.

Sharon Farnel
Editorial Board Thoughts: System Requirements

Sharon Farnel (sharon.farnel@ualberta.ca) is Metadata & Cataloguing Librarian at the University of Alberta in Edmonton, Alberta, Canada.

This past Spring, my alma mater, the School of Library and Information Studies (SLIS) at the University of Alberta, restructured the IT component of its MLIS program. As a result, as of September 2010, incoming students are expected to possess certain basic IT skills before beginning their program.1 These skills include the following:

■■ Comprehension of the components and operations of a personal computer
■■ Microsoft Windows file management
■■ Proficiency with Microsoft Office (or similar) products, including word processing and presentation software
■■ Use of e-mail
■■ Basic Web browsing and searching

This new requirement got me thinking: Is this common practice among ALA-accredited Library Schools? If other schools are also requiring basic IT skills prior to entry, how do those required by SLIS compare? So I thought I'd do a little investigating to see what others in "Library School land" are doing. Before I continue, a word of warning: this was by no means a rigorous scientific investigation, but rather an informal survey of the landscape.

I started my investigation with ALA's directory of institutions offering accredited master's programs.2 There are fifty-seven institutions listed in the directory. I visited each institution's website and looked for pages describing technology requirements, computer-competency requirements, and the like. If I wasn't able to find the desired information after fifteen or twenty minutes, I would note "nothing found" and move on to the next.
In the end I found some sort of list of technology or computer-competency requirements on thirty-three (approximately 58 percent) of the websites. It may be the case that such a list exists on other sites and I didn't find it. I should also note that five of the lists I found focus more on software and hardware than on skills in using said software and hardware. Even considering these conditions, however, I was somewhat surprised at the low numbers. Is it simply assumed that today's students already have these skills? Or is it expected that they will be picked up along the way? I don't claim to know the answers, and discovering them would require a much more detailed and thorough investigation, but they are interesting questions nonetheless.

Once I had found the requirements, I examined them in some detail to get a sense of the kinds of skills listed. While I won't enumerate them all, I did find the most common ones to be similar to those required by SLIS—basic comfort with a personal computer and proficiency with word processing and presentation software, e-mail, file management, and the Internet. A few (5) schools also list comfort with local systems (e-mail accounts, online courseware, etc.). Several (7) schools mention familiarity with basic database design and functionality, while a few (5) list basic Web design. Very few (3) mention competency with security tools (firewalls, virus checkers, etc.), and just slightly more (4) mention familiarity with Web 2.0 tools like blogs, wikis, etc. While many (14) specifically mention searching under basic Internet skills, few (7) mention proficiency with OPACs or other common information tools such as full-text databases. Interestingly, one school has a computer programming requirement, with mentions of specific acceptable languages, including C++, PASCAL, Java, and Perl. But this is certainly the exception rather than the rule.

I was encouraged that there seems to be a certain agreement on the basics. But I was a little surprised at the relative rarity of competency with wikis and blogs and all those Web 2.0 tools that are so often used and talked about in today's libraries. Is this because there is still some uncertainty as to the utility of such tools in libraries? Or is it because of a belief that the members of the Millennial or "digital" generation are already expert in using them? I don't know the reasons, but it is interesting to ponder nonetheless. I was also surprised that a level of information literacy isn't listed more often, particularly given that we're talking about SLIS programs. I do know, of course, that many of these skills will be developed or enhanced as students work their way through their programs, but it also seems to me that there is so much other material to learn that the more that can be taken care of beforehand, the better.

Librarians work in a highly technical and technological environment, and this is only going to become even more the case for future generations of librarians. Certainly, basic familiarity with a variety of applications and tools and comfort with rapidly changing technologies are major assets for librarians. In fact, ALA recognizes the importance of "technological knowledge and skills" as core competencies of librarianship. Specifically mentioned are the following:

■■ Information, communication, assistive, and related technologies as they affect the resources, service delivery, and uses of libraries and other information agencies.
■■ The application of information, communication, assistive, and related technology and tools consistent with professional ethics and prevailing service norms and applications.
■■ The methods of assessing and evaluating the specifications, efficacy, and cost efficiency of technology-based products and services.
■■ The principles and techniques necessary to identify and analyze emerging technologies and innovations in order to recognize and implement relevant technological improvements.3

Given what we know about the importance of technology to librarians and librarianship, my investigation has left me with two questions: (1) why aren't more library schools requiring certain IT skills prior to entry into their programs? and (2) are those who do require them asking enough of their prospective students? I hope you, our readers, might ask yourselves these questions and join us on ITALica for what could turn out to be a lively discussion.

References

1. University of Alberta School of Library and Information Studies, "Degree Requirements: Master of Library & Information Studies," www.slis.ualberta.ca/mlis_degree_requirements.cfm (accessed Aug. 5, 2010).
2. American Library Association Office for Accreditation, "Library & Information Studies Directory of Institutions Offering Accredited Master's Programs 2008–2009," 2008, http://ala.org/ala/educationcareers/education/accreditedprograms/directory/pdf/lis_dir_20082009.pdf (accessed Aug. 5, 2010).
3. American Library Association, "ALA's Core Competences of Librarianship," January 2009, www.ala.org/ala/educationcareers/careers/corecomp/corecompetences/finalcorecompstat09.pdf (accessed Aug. 5, 2010).

GENERATING COLLABORATIVE SYSTEMS FOR DIGITAL LIBRARIES | MALIZIA, BOTTONI, AND LEVIALDI 171

from previous experience and from research in software engineering. Wasted effort and poor interoperability can therefore ensue, raising the costs of DLs and jeopardizing the fluidity of information assets in the future. In addition, there is a need for modeling services and data structures as highlighted in the "Digital Library Reference Model" proposed by the DELOS EU network of excellence (also called the "DELOS Manifesto");2 in fact, the distribution of DL services over digital networks, typically accessed through Web browsers or dedicated clients, makes the whole theme of interaction between users important, for both individual usage and remote collaboration. Designing and modeling such interactions call for considerations pertaining to the fields of human–computer interaction (HCI) and computer-supported cooperative work (CSCW). As an example, scenario-based or activity-based approaches developed in the HCI area can be exploited in DL design.

To meet these needs we developed CRADLE (Cooperative-Relational Approach to Digital Library Environments),3 a metamodel-based Digital Library Management System (DLMS) supporting collaboration in the design, development, and use of DLs, exploiting patterns emerging from previous projects. The entities of the CRADLE metamodel allow the specification of collections, structures, services, and communities of users (called "societies" in CRADLE) and partially reflect the DELOS Manifesto.
The metamodel entities are based on existing DL taxonomies, such as those proposed by Fox and Marchionini,4 Gonçalves et al.,5 or in the DELOS Manifesto, so as to leverage available tools and knowl- edge. Designers of DLs can exploit the domain-specific visual language (DVSL) available in the CRADLE envi- ronment—where familiar entities extracted from the referred taxonomies are represented graphically—to model data structures, interfaces and services offered to the final users. The visual model is then processed and transformed, exploiting suitable templates, toward a set of specific languages for describing interfaces and services. The results are finally transformed into platform- independent (Java) code for specific DL applications. CRADLE supports the basic functionalities of a DL through interfaces and service templates for managing, browsing, searching, and updating. These can be further specialized to deploy advanced functionalities as defined by designers through the entities of the proposed visual The design and development of a digital library involves different stakeholders, such as: information architects, librarians, and domain experts, who need to agree on a common language to describe, discuss, and negoti- ate the services the library has to offer. To this end, high-level, language-neutral models have to be devised. Metamodeling techniques favor the definition of domain- specific visual languages through which stakeholders can share their views and directly manipulate representations of the domain entities. This paper describes CRADLE (Cooperative-Relational Approach to Digital Library Environments), a metamodel-based framework and visual language for the definition of notions and services related to the development of digital libraries. A collection of tools allows the automatic generation of several services, defined with the CRADLE visual language, and of the graphical user interfaces providing access to them for the final user. The effectiveness of the approach is illustrated by presenting digital libraries generated with CRADLE, while the CRADLE environment has been evaluated by using the cognitive dimensions framework. D igital libraries (DLs) are rapidly becoming a pre- ferred source for information and documentation. Both at research and industry levels, DLs are the most referenced sources, as testified by the popularity of Google Books, Google Video, IEEE Explore, and the ACM Portal. Nevertheless, no general model is uni- formly accepted for such systems. Only few examples of modeling languages for developing DLs are available,1 and there is a general lack of systems for designing and developing DLs. This is even more unfortunate because different stakeholders are interested in the design and development of a DL, such as information architects, to librarians, to software engineers, to experts of the spe- cific domain served by the DL. These categories may have contrasting objectives and views when deploying a DL: librarians are able to deal with faceted categories of documents, taxonomies, and document classification; software engineers usually concentrate on services and code development; information architects favor effective- ness of retrieval; and domain experts are interested in directly referring to the content of interest without going through technical jargon. 
Designers of DLs are most often library technical staff with little to no formal training in software engineering, or computer scientists with little background in the research findings of hypertext infor- mation retrieval. Thus DL systems are usually built from scratch using specialized architectures that do not benefit Alessio Malizia (alessio.malizia@uc3m.es) is associate Profes- sor, universidad carlos iii, Department of informatics, Madrid, Spain; Paolo Bottoni (bottoni@di.uniroma1.it) is associate Pro- fessor and s. levialdi (levialdi@di.uniroma1.it) is Professor, “Sa- pienza” university of rome, Department of computer Science, rome, italy. Alessio Malizia, Paolo Bottoni, and S. Levialdi Generating Collaborative Systems for Digital Libraries: a Model-Driven Approach 172 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 a formal foundation for digital libraries, called 5S, based on the concepts of streams, (data) structures, (resource) spaces, scenarios, and societies. While being evidence of a good modeling endeavor, the approach does not specify formally how to derive a system implementation from the model. The new generation of DL systems will be highly dis- tributed, providing adaptive and interoperable behaviour by adjusting their structure dynamically, in order to act in dynamic environments (e.g., interfacing with the physical world).13 To manage such large and complex systems, a systematic engineering approach is required, typically one that includes modeling as an essential design activity where the availability of such domain-specific concepts as first-class elements in DL models will make application specification easier.14 While most of the disciplines related to DLs—e.g., databases,15 information retrieval,16 and hypertext and multimedia17—have underlying formal models that have properly steered them, little is available to formalize DLs per se. Wang described the structure of a DL system as a domain-specific database together with a user interface for querying the records stored in the database.18 Castelli et al. present an approach involving multidimensional query languages for searching information in DL systems that is based on first-order logic.19 These works model metadata specifications and thus are the main examples of system formalization in DL environments. Cognitive models for information retrieval, as used for example by Oddy et al.,20 focus on users’ information-seeking behav- ior (i.e., formation, nature, and properties of a users’ information need) and on how information retrieval sys- tems are used in operational environments. Other approaches based on models and languages for describing the entities involved in a DL are the Digital Library Definition Language,21 the DSpace data model22 (with the definitions of communities and workflow mod- els), the Metis Workflow framework,23 and the Fedora structoid approach.24 E/R approaches are frequently used for modeling database management system (DBMS) applications,25 but as E/R diagrams only model the static structure of a DBMS, they generally do not deal deeply with dynamic aspects. Temporal extensions add dynamic aspects to the E/R approach, but most of them are not object-oriented.26 The advent of object-oriented technol- ogy calls for approaches and tools to information system design resulting in object-oriented systems. 
These consid- erations drove research toward modeling approaches as supported by UML.27 However, since the UML metamodel is not yet wide- spread in the DL community, we adopted the E/R formalism and complemented it with the specification of the dynamics made available through the user interface, as described by Malizia et al.28 Using the metamodel, we have defined a DSVL, including basic entities and language. CRADLE is based on the entity-relationship (E/R) formalism, which is powerful and general enough to describe DL models and is supported by many tools as a metamodeling language. Moreover, we observed that users and designers involved in the DL environment, but not coming from a software engineering background, may not be familiar with advanced formalism like unified modeling language (UML), but they usually have basic notions on database management systems, where E/R is largely employed. ■■ Literature Review DLs are complex information systems involving technolo- gies and features from different areas, such as library and information systems, information retrieval, and HCI. This interdisciplinary nature is well reflected in the various definitions of DLs present in the literature. As far back as 1965, Licklider envisaged collections of digital versions of scanned documents accessible via interconnected com- puters.6 More recently, Levy and Marshall described DLs as sets of collections of documents, together with digital resources, accessible by users in a distributed context.7 To manage the amount of information stored in such systems, they proposed some sort of user-assisting software agent. Other definitions include not only printed documents, but multimedia resources in general.8 However differ- ent the definitions may be, they all include the presence of collections of resources, their organization in struc- tured repositories, and their availability to remote users through networks (as discussed by Morgan).9 Recent efforts toward standardization have been taken by public and private organizations. For example, a Delphi study identified four main ingredients: an organized collection of resources, mechanisms for browsing and searching, a distributed networked environment, and a set of objec- tified services.10 The President’s Information Technology Advisory Committee (PITAC) Panel on Digital Libraries sees DLs as the networked collections of digital text, doc- uments, images, sounds, scientific data, and software that make up the core of today’s Internet and of tomorrow’s universally accessible digital repositories of all human knowledge.11 When considering DLs in the context of distributed DL environments, only few papers have been produced, contrasting with the huge bibliography on DLs in gen- eral. The DL Group at the Universidad de las Américas Puebla in Mexico introduced the concept of personal and group spaces, relevant to the CSCW domain, in the DL system context.12 Users can share information stored in their personal spaces or share agents, thus allowing other users to perform the same search on the document collec- tions in the DL. The cited text by Gonçalves et al. gives GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 173 education as discussed by Wattenberg or Zia.33 In the NSDL program, a new generation of services has been developed that includes support for teaching and learn- ing; this means also considering users’ activities or scenarios and not only information access. 
Services for implementing personal content delivery and sharing, or managing digital resources and modeling collaboration, are examples of tools introduced during this program. The virtual reference desk (VRD) is emerging as an interactive service based on DLs. With VRD, users can take advantage of domain experts’ knowledge and librar- ians’ experience to locate information. For example, the U.S. Library of Congress Ask a Librarian service acts as a VRD for users who want help in searching information categories or to interact with expert librarians to search for a specific topic.34 The interactive and collaborative aspects of activities taking place within DLs facilitate the development of user communities. Social networking, work practices, and content sharing are all features that influence the technol- ogy and its use. Following Borgmann,35 Lynch sees the future of DLs not in broad services but in supporting and facilitating “customization by community,” i.e., services tailored for domain-specific work practices.36 We also examined the research agenda on system- oriented issues in DLs and the DELOS manifesto.37 The agenda abstracts the DL life cycle, identifying five main areas, and proposes key research problems. In particular we tackle activities such as formal modeling of DLs and their communities and developing frameworks coherent with such models. At the architectural level, one point of interest is to support heterogeneous and distributed systems, in par- ticular networked DLs and services.38 For interoperability, one of the issues is how to support and interoperate with different metadata models and standards to allow distrib- uted cataloguing and indexing, as in the Open Archive Initiative (OAI).39 Finally, we are interested in the service level of the research agenda and more precisely in Web services and workflow management as crucial features when including communities and designing DLs for use over networks and for sharing content. As a result of this analysis, the CRADLE framework features the following: ■■ a visual language to help users and designers when visual modeling their specific DL (without knowing any technical detail apart from learning how to use a visual environment providing diagrams representa- tions of domain specific elements) ■■ an environment integrating visual modeling and code generation instead of simply providing an integrated architecture that does not hide technical details ■■ interface generation for dealing with different users relationships for modeling DL-related scenarios and activities. The need for the integration of multiple lan- guages has also been indicated as a key aspect of the DSVL approach.29 In fact, complex domains like DLs typi- cally consist of multiple subdomains, each of which may require its own particular language. In the current implementation, the definition of DSVLs exploits the metamodeling facilities of AToM3, based on graph-grammars.30 AToM3 has been typically used for simulation and model transformation, but we adopt it here as a tool for system generation. ■■ Requirements for Modeling Digital Libraries We follow the DELOS Manifesto by considering a DL as an organization (possibly virtual and distributed) for managing collections of digital documents (digital con- tents in general) and preserving their images on storage. A DL offers contextual services to communities of users, a certain quality of service, and the ability to apply specific policies. 
In CRADLE we leave the definition of quality of service to the service-oriented architecture standards we employ and partially model the applicable policy, but we focus here on crucial interactivity aspects needed to make DLs usable by different communities of users. In particular, we model interactive activities and services based on librarians’ experiences in face-to-face communication with users, or designing exchange and integration procedures for communicating between insti- tutions and managing shared resources. While librarians are usually interested in modeling metadata across DLs, software engineers aim at provid- ing multiple tools for implementing services,31 such as indexing, querying, semantics,32 etc. Therefore we pro- vide a visual model useful for librarians and information architects to mimic the design phases they usually per- form. Moreover, by supporting component services, we help software engineers to specify and add services on demand to DL environments. To this end, we use a service component model. By sharing a common language, users from different categories can communicate to design a DL system while concentrating on their own tasks (services development and design for software engineers and DL design for librarians and information architects). Users are modeled according to the Delos Manifesto as DL End-users (subdivided into content creators, content consumers, and librarians), DL Designers (librarians and information archi- tects), DL System Administrators (typically librarians), and DL Application Developers (software engineers). Several activities have been started on modeling domain specific DLs. As an example, the U.S. National Science Digital Library (NSDL) program promotes edu- cational DLs and services for basic and advanced science 174 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 ■■ how that information is structured and organized (Structural Model) ■■ the behavior of the DL (Service Model) and the differ- ent societies of actors ■■ groups of services acting together to carry out the DL behavior (Societal Model) Figure 1 depicts the design approach supported by CRADLE architecture, namely, modeling the society of actors and services interacting in the domain-specific scenarios and describing the documents and metadata structure included with the library by defining a visual model for all these entities. The DL is built using a col- lection of stock parts and configurable components that provide the infrastructure for the new DL. This infrastruc- ture includes the classes of objects and relationships that make up the DL, and processing tools to create and load the actual library collection from raw documents, as well as services for searching, browsing, and collection main- tenance. Finally, the code generation module generates tailored DL services code stubs by composing and special- izing components from the component pool. Initially, a DL designer is responsible for formalizing (starting from an analysis of the DL requirements and characteristics) a conceptual description of the DL using metamodel concepts. Model specifications are then fed into a DL generator (written in Python for AToM3), to produce a DL tailored suitable for specific platforms and requirements. After these design phases, CRADLE gener- ates the code for the user interface and the parts of code corresponding to services and actors interacting in the described society. 
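To make the flow just described more concrete (visual model in, interface and service code out), the following sketch walks through the same stages in miniature. It is only an illustration: the actual CRADLE generator is written in Python inside AToM3, and every name below (Entity, GenerationPipelineSketch, the output paths) is hypothetical rather than part of CRADLE.

import java.nio.file.*;
import java.util.*;

public class GenerationPipelineSketch {

    // A minimal stand-in for one entity of the designer's visual model.
    record Entity(String kind, String name, Map<String, String> attributes) {}

    public static void main(String[] args) throws Exception {
        // 1. The conceptual description produced by the designer (here hard-coded).
        List<Entity> model = List.of(
            new Entity("service", "Do_Search", Map.of("sync", "wait")),
            new Entity("actor", "Librarian", Map.of("role", "staff")),
            new Entity("collection", "Library", Map.of("documents", "thesis.pdf")));

        Path outDir = Files.createDirectories(Path.of("generated"));

        // 2. Each entity is matched to a template: XUL for the interface side,
        //    Java stubs for services and actors.
        for (Entity e : model) {
            switch (e.kind()) {
                case "collection", "struct" -> Files.writeString(
                    outDir.resolve(e.name() + ".xul"),
                    "<!-- XUL layout specialized for " + e.name() + " -->");
                case "service", "actor" -> Files.writeString(
                    outDir.resolve(e.name() + "Impl.java"),
                    "// stub generated for " + e.name() + " with attributes " + e.attributes());
                default -> throw new IllegalStateException("unknown entity kind: " + e.kind());
            }
        }
    }
}

In the real system this step is not a hand-written loop: it is driven by the graph-grammar actions attached to the metamodel, as described later in the paper.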
A set of templates for code generation and designers ■■ flexible metadata definitions ■■ a set of interactive integrated tools for user activities with the generated DL system To sum up, CRADLE is a DLMS aimed at supporting all the users involved in the development of a DL system and providing interfaces, data modeling, and services for user-driven generation of spe- cific DLs. Although CRADLE does not yet satisfy all requirements for a generic DL system, it addresses issues focused on developing interactive DL systems, stressing interfaces and communication between users. Nevertheless, we employed standards when possible to leave it open for further specification or enhancements from the DL user community. Extensive use of XML-based languages allows us to change document information depending on implemented recognition algorithms so that expert users can easily model their DL by selecting the best recognition and indexing algorithms. CRADLE evolves from the JDAN (Java-based environ- ment for Document Applications on Networks) platform, which managed both document images and forms on the basis of a component architecture.40 JDAN was based on XML technologies, and its modularity allowed its integra- tion in service-based and grid-based scenarios. It supported template code generation and modeling, but it required the designer to write XML specifications and edit XML schema files in order to model the DL document types and services, thus requiring technical knowledge that should be avoided to let users concentrate on their specific domains. ■■ Modeling Digital Library Systems The CRADLE framework shows a unique combination of features: it is based on a formal model, exploits a set of domain-specific languages, and provides automatic code generation. Moreover, fundamental roles are played by the concepts of society and collaboration.41 CRADLE generates code from tools built after modeling a DL (according to the rules defined by the proposed metamodel) and performs automatic transformation and mapping from model to code to generate software tools for a given DL model. The specification of a DL in CRADLE encompasses four complementary dimensions: ■■ multimedia information supported by the DL (Collection Model) Figure 1. CRADLE architecture GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 175 socioeconomic, and environment dimen- sions. We now show in detail the entities and relations in the derived metamodel, shown in figure 2. Actor entities Actors are the users of DLs. Actors interact with the DL through services (interfaces) that are (or can be) affected by the actors preferences and messages (raised events). In the CRADLE metamodel, an actor is an entity with a behavior that may concurrently generate events. Communications with other actors may occur synchronously or asynchronously. Actors can relate through services to shape a digital community, i.e., the basis of a DL society. In fact, communities of students, readers, or librarians interact with and through DLs, generally follow- ing predefined scenarios. As an example, societies can behave as query generator services (from the point of view of the library) and as teaching, learning, and working services (from the point of view of other humans and organiza- tions). Communication between actors within the same or different societies occur through message exchange. 
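A minimal sketch of this event/response exchange is given below. The class names are hypothetical (they are not part of CRADLE); the event names (borrow, reserve) and responses (found, not found) are taken from the examples used in this section, and the two call styles mirror the synchronous and asynchronous communication just mentioned (the wait/nowait values that the service entity's sync attribute takes later in this section).

import java.util.concurrent.CompletableFuture;

public class MessageExchangeSketch {

    // A service receives an event and answers with a response message.
    interface Service {
        String handle(String event);
    }

    // An actor raises events through a service, either waiting for the reply or not.
    static class Actor {
        private final String role;
        Actor(String role) { this.role = role; }

        String raiseSync(Service s, String event) {                      // "wait" style
            return s.handle(event);
        }

        CompletableFuture<String> raiseAsync(Service s, String event) {  // "nowait" style
            return CompletableFuture.supplyAsync(() -> s.handle(event));
        }
    }

    public static void main(String[] args) {
        Service frontDesk = event -> event.equals("borrow") ? "found" : "not found";
        Actor student = new Actor("student");

        System.out.println(student.raiseSync(frontDesk, "borrow"));      // prints "found"
        System.out.println(student.raiseAsync(frontDesk, "reserve").join());
    }
}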
To operate, societies need shared data structures and message protocols, enacted by sending structured sequences of queries and retrieving collections of results. The actor entity includes three attributes: 1. Role identifies which role is played by the actor within the DL society. Examples of specific human roles include authors, publishers, editors, maintain- ers, developers, and the library staff. Examples of nonhuman actors include computers, printers, tele- communication devices, software agents, and digital resources in general. 2. Status is an enumeration of possible statuses for the actor: I. None (default value) II. Active (present in the model and actively generat- ing events) III. Inactive (present in the model but not generating events) IV. Sleeping (present in the model and awaiting for a response to a raised event) 3. Events describes a list of events that can be raised by the actor or received as a response message from a service. Examples of events are borrow, reserve, return, etc. Events triggered from digital resources include store, trash, and transfer. Examples of response events are found, not found, updated, etc. have been built for typical services of a DL environment. To improve acceptability and interoperability, CRADLE adopts standard specification sublanguages for representing DL concepts. Most of the CRADLE model primitives are defined as XML elements, possibly enclos- ing other sublanguages to help define DL concepts. In more detail, MIME types constitute the basis for encod- ing elements of a collection. The XML User Interface Language (XUL)42 is used to represent appearance and visual interfaces, and XDoclet is used in the LibGen code generation module, as shown in figure 1.43 ■■ The Cradle Metamodel In the CRADLE formalism, the specification of a DL includes a Collection Model describing the maintained multimedia documents, a Structural Model of informa- tion organization, a Service Model for the DL behavior, and a Societal Model describing the societies of actors and groups of services acting together to carry out the DL behavior. A society is an instance of the CRADLE model defined according to a specific collaboration framework in the DL domain. A society is the highest-level component of a DL and exists to serve the information needs of its actors and to describe its context of usage. Hence a DL collects, preserves, and shares information artefacts for society members. The basic entities in CRADLE are derived from the categorization along the actors, activities, components, Figure 2. The CRADLE metamodel with the E/R formalism 176 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 a text document, including scientific articles and books, becomes a sequence of strings. the struct entity A Struct is a structural element specifying a part of a whole. In DLs, structures represent hypertexts, taxono- mies, relationships between elements, or containment. For example, books can be structured logically into chap- ters, sections, subsections, and paragraphs, or physically into cover, pages, line groups (paragraphs), and lines. Structures are represented as graphs, and the struct entity (a vertex) contains four attributes: 1. Document is a pointer to the document entity the structure refers to. 2. Id is a unique identifier for a structure element. 3. Type takes three possible values: I. Metadata denotes a content descriptor, for instance title, author, etc. II. Layout denotes the associated layout, e.g., left frame, columns, etc. III. 
Item indicates a generic structure element used for extending the model. 4. Values is a list of values describing the element con- tent, e.g., title, author, etc. Actors interact with services in an event-driven way. Services are connected via messages (send and reply) and can be sequential, concurrent, or task-related (when a ser- vice acts as a subtask of a macroservice). Services perform operations (e.g., get, add, and del) on collections, producing collections of documents as results. Struct elements are connected to each other as nodes of a graph representing metadata structures associated with documents. The metamodel has been translated to a DSVL, asso- ciating symbols and icons with entities and relations (see “CRADLE Language and Tools” below). With respect to the six core concepts of the DELOS Manifesto (content, user, functionality, quality, policy, and architecture), con- tent can be modeled in CRADLE as collections and structs, user as actor, and functionality as service. The quality con- cept is not directly modeled in CRADLE, but for quality of service we support standard service architecture. Policies can be partially modeled by services managing interaction between actors and collections, making it possible to apply standard access policies. From the architectural point of view, we follow the reference architecture of figure 1. ■■ CRADLE Language and Tools In this section we describe the selection of languages and tools of the CRADLE platform. To improve interoperability service entities Services describe scenarios, activities, operations, and tasks that ultimately specify the functionalities of a DL, such as collecting, creating, disseminating, evaluating, organizing, personalizing, preserving, requesting, and selecting documents and providing services to humans concerned with fact-finding, learning, gathering, and exploring the content of a DL. All these activities can be described and implemented using scenarios and appear in the DL setting as a result of actors using services (thus societies). Furthermore, these activities realize and shape relationships within and between societies, services, and structures. In the CRADLE metamodel, the service entity models what the system is required to do, in terms of actions and processes, to achieve a task. A detailed task analysis helps understand the current system and the information flow within it in order to design and allocate tasks appropriately. The service entity has four attributes: 1. Name is a string representing a textual description of the service. 2. Sync states whether communication is synchronous or asynchronous, modeled by values wait and nowait, respectively. 3. Events is a list of messages that can trigger actions among services (tasks); for example, valid or notValid in case of a parsing service. 4. Responses contain a list of response messages that can reply to raised events; they are used as a communica- tion mechanism by actors and services. the collection entity Collections are sets of documents of arbitrary type (e.g., bits, characters, images, etc.) used to model static or dynamic content. In the static interpretation, a collection defines information content interpreted as a set of basic elements, often of the same type, such as plain text. Examples of dynamic content include video delivered to a viewer, ani- mated presentations, and so on. The attributes of collection are name and documents. 
Name is a string, while documents is a list of pairs (DocumentName, DocumentLabel), the latter being a pointer to the document entity. the Document entity Documents are the basic elements in a DL and are modeled with attributes label and structure. Label defines a textual string used by a collection entity to refer to the document. We can consider it as a document identifier, specifying a class or a type of document. Structure defines the semantics and area of appli- cation of the document. For example, any textual representation can be seen as a string of characters, so that GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 177 graphs. Model manipulation can then be expressed via graph grammars also specified in AToM3. The general process of automatic creation of coop- erative DL environments for an application is shown in figure 3. Initially, a designer formalizes a conceptual description of the DL using the CRADLE metamodel concepts. This phase is usually preceded by an analysis of requirements and interaction scenarios, as seen previ- ously. Model specifications are then provided to a DL code generator (written in Python within AToM3) to pro- duce DLs tailored to specific platforms and requirements. These are built on a collection of templates of services and configurable components providing infrastructure for the new DL. The sketched infrastructure includes classes for objects (tasks), relationships making up the DL, and processing tools to upload the actual library collection from raw documents, as well as services for searching and browsing and for document collections maintenance. The CRADLE generator automatically generates different kinds of output for the CRADLE model of the cooperative DL environment, such as service and collection managers. Collection managers define the logical schemata of the DL, which in CRADLE correspond to a set of MIME types, XUL and XDoclet specifications, representing digital objects, their component parts, and linking infor- mation. Collection managers also store instances of their and collaboration, CRADLE makes extensive use of existing standard spec- ification languages. Most CRADLE outputs are defined with XML-based formats, able to enclose other specific languages. The basic languages and corresponding tools used in CRADLE are the following: ■■ MIME type. Multipurpose Internet Mail Extensions (MIME) constitute the basis for encoding documents in CRADLE, supporting several file formats and types of charac- ter encoding. MIME was chosen because of wide availability of MIME types, and standardisation of the approach. This makes it a natural choice for DLs where dif- ferent types of documents need to be managed (PDF, HTML, Doc, etc.). Moreover, MIME standards for character encoding descrip- tions help keeping the CRADLE framework open and compliant with standards. ■■ XUL. The XML User Interface Language (XUL) is an XML-based markup language used to represent appearance and visual interfaces. XUL is not a public standard yet, but it uses many existing standards and technologies, including DTD and RDF,44 which makes it easily readable for peo- ple with a background in Web programming and design. The main benefit of XUL is that it provides a simple definition of common user interface elements (widgets). This drastically reduces the software devel- opment effort required for visual interfaces. ■■ XDoclet. XDoclet is used for generating services from tagged-code fragments. 
It is an open-source code generation library which enables attribute-ori- ented programming for Java via insertion of special tags.45 It includes a library of predefined tags, which simplify coding for various technologies, e.g., Web services. The motivation for using XDoclet in the CRADLE framework is related to its approach for template code generation. Designers can describe templates for each service (browse, query, and index) and the XDoclet generated code can be automatically transformed into the Java code for managing the specified service. ■■ AToM3. AToM3 is a metamodeling system to model graphical formalisms. Starting from a metaspecifi- cation (in E/R), AToM3 generates a tool to process models described in the chosen formalism. Models are internally represented using abstract syntax Figure 3. Cooperative DL generation process with CRADLE framework 178 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 and (3) the metadata operations box. The right column manages visualization and mul- timedia information obtained from documents. The basic features provided with the UI templates are docu- ment loading, visualization, metadata organization, and management. The layout template, in the collection box, manages the visualization of the documents contained in a collection, while the visualization template works according to the data (MIME) type specified by the document. Actually, by selecting a document included in the collection, the corresponding data file is automatically uploaded and visualized in the UI. The metadata visualization in the code template reflects the metadata structure (a tree) represented by a struct, specifying the relationship between parent and child nodes. Thus the XUL template includes an area (the meta- data box) for managing tree structures as described in the visual model of the DL. Although the tree-like visualiza- tion has potential drawbacks if there are many metadata items, there should be no real concern with medium loads. The UI template also includes a box to perform opera- tions on metadata, such as insert, delete, and edit. Users can select a value in the metadata box and manipulate the presented values. Figure 4 shows an example of a UI generated from a basic template. service templates To achieve automated code generation, we use XDoclet to specify parameters and service code generation according to such parameters. CRADLE can automatically annotate Java files with name–value pairs, and XDoclet provides a syntax for parameter specification. Code generation is classes and function as search engines for the system. Services classes also are generated and are represented as attribute-oriented classes involving parts and features of entities. ■■ CRADLE platform The CRADLE platform is based on a model-driven approach for the design and automatic generation of code for DLs. In particular, the DSVL for CRADLE has four diagram types (collection, structure, service, and actor) to describe the different aspects of a DL. In this section we describe the user interface (UI) and service templates used for generating the DL tools. In particular, the UI layout is mainly generated from the structured information provided by the document, struct, and collection entities. The UI events are managed by invoking the appropriate services according to the imported XUL templates. At the service and communica- tion levels, the XDoclet code is generated by the service and actor entities, exploiting their relationships. 
We also show how code generation works and the advanced platform features, such as automatic service discovery. At the end of the section a running example is shown, rep- resenting all the phases involved in using the CRADLE framework for generating the DL tools for a typical library scenario. user interface templates The generation of the UI is driven by the visual model designed by the CRADLE user. Specifically, the model entities involved in this process are document, struct and collection (see figure 2) for the basic components and lay- out of the interfaces, while linked services are described in the appropriate templates. The code generation process takes place through transformations implemented as actions in the AToM3 metamodel specification, where graph-grammar rules may have a condition that must be satisfied for the rule to be applied (preconditions), as well as actions to be performed when the rule is executed (postconditions). A transformation is described during the visual modeling phase in terms of conditions and corresponding actions (inserting XUL language statements for the interface in the appropriate code template placeholders). The gener- ated user interface is built on a set of XUL template files that are automatically specialized on the basis of the attributes and relationships designed in the visual mod- eling phase. The layout template for the user interface is divided into two columns (see figure 4). The left column is made of three boxes: (1) the collection box (2) the metadata box, Figure 4. An example of an automatically generated user inter- face. (A) document area; (B) collection box; (C) metadata box; (D) metadata operations box. GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 179 "msg arguments.argname"> { "" , "" "" } , }; The first two lines declare a class with a name class nameImpl that extends the class name. The XDoclet template tag XDtClass:className denotes the name of the class in the annotated Java file. All standard XDoclet template tags have a namespace starting with “XDt.” The rest of the template uses XDtField : forAllField to iterate through the fields. For each field with a tag named msg arguments.argname (checked using XDtField : ifHasFieldTag), it creates a subarray of strings using the values obtained from the field tag parameters. XDtField : fieldName gives the name of the field, while XDtField : fieldTagValue retrieves the value of a given field tag parameter. Characters that are not part of some XDoclet template tags are directly copied into the generated code. The following code segment was generated by XDoclet using the annotated fields and the above tem- plate segment: public class MSGArgumentsImpl extends MSGArguments { public static String[ ][ ] argumentNames = new String[ ][ ]{ { "eventMsg" , " event " , " eventstring " } , { " responseMsg " , " response " , " responsestring " } , }; } Similarly, we generate the getter and setter methods for each field: public get () { return ; } public void set ( String value ) { based on code templates. Hence service templates are XDoclet templates for transforming XDoclet code frag- ments obtained from the modeled service entities. The basic XDoclet template manages messages between services, according to the event and response attributes described in “CRADLE Language and Tools” above. In fact, CRADLE generates a Java application (a service) that needs to receive messages (event) and reply to them (response) as parameters for the service application. 
In XDoclet, these can be attached to the cor- responding field by means of annotation tags, as in the following code segments: public class MSGArguments { . . . . . . /* * @msg arguments.argname name="event " desc="event_string " */ protected String eventMsg = null; /* * @msg arguments.argname name="response" * desc="response_string " */ protected String responseMsg = null; } Each msg arguments.argname related to a field is called a field tag. Each field tag can have multiple parameters, listed after the field tag. In the tag name msg arguments .argname, the prefix serves as the namespace of all tags for this particular XDoclet application, thus avoiding naming conflicts with other standard or customized XDoclet tags. Not only fields can be annotated, but also other entities such as class and functions can have tags too. XDoclet enables powerful code generation requir- ing little or no customization (depending on how much is provided by the template). The type of code to be generated using the parameters is defined by the corre- sponding XDoclet template. We have created template files composed of Java codes and special XDoclet instructions in the form of XML tags. These XDoclet instructions allow conditionals (if) and loops (for), thus providing us with expressive power close to a programming language. In the following example, we first create an array containing labels and other information for each argument: public class Impl extends { public static String[ ][ ] argumentNames = new String[ ][ ] { " , value ) ; }< /XDtField : ifHasFieldTag> This translates into the following generated code: public java.lang.String get eventMsg ( ) { return eventMsg ; } public void set eventMsg ( String value ) { setValue ( "eventMsg" , value ) ; } public java.lang.String getresponseMsg ( ) { return getresponseMsg ; } public void setresponseMsg ( String value ) { setValue ( " responseMsg " , value ) ; } The same template is used for managing the name and sync attributes of service entities. code Generation, service Discovery, and Advanced Features A service or interface template only describes the solu- tion to a particular design problem—it is not code. Consequently, users will find it difficult to make the leap from the template description to a particular implemen- tation even though the template might include sample code. Others, like software engineers, might have no trouble translating the template into code, but they still may find it a chore, especially when they have to do it repeatedly. The CRADLE visual design environment (based on AToM3) helps alleviate these problems. From just a few pieces of information (the visual model), typi- cally application-specific names for actors and services in a DL society along with choices for the design trade- offs, the tool can create class declarations and definitions implementing the template. The ultimate goal of the modeling effort remains, however, the production of reliable and efficiently executable code. Hence a code generation transformation produces interface (XUL) and service (Java code from XDoclet templates) code from the DL model. We have manually coded XUL templates specifying the static setup of the GUI, the various widgets and their layout. This must be complemented with code gener- ated from a DL model of the systems dynamics coded into services. While other approaches are possible,46 we employed the solution implemented within the AToM3 environment according to its graph grammar modeling approach to code generation. 
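Because the template fragments above lost their markup in reproduction, it may help to see the pieces assembled in one place before moving on. The sketch below restates the annotated base class and the kind of implementation class the text describes XDoclet emitting; the setValue helper and the exact generated method names are assumptions reconstructed from the fragments, not code copied from the authors' system.

// Base class reconstructed from the annotated fragment shown earlier; the
// setValue(...) helper is assumed (the generated code calls it but it is not shown).
class MSGArguments {
    protected String eventMsg = null;
    protected String responseMsg = null;

    protected void setValue(String field, String value) {
        if ("eventMsg".equals(field)) eventMsg = value;
        else if ("responseMsg".equals(field)) responseMsg = value;
    }
}

// Plausible shape of the class the XDoclet template emits: one row of argumentNames
// per tagged field, plus a getter/setter pair for each.
public class MSGArgumentsImpl extends MSGArguments {

    public static String[][] argumentNames = {
        { "eventMsg",    "event",    "event_string"    },
        { "responseMsg", "response", "response_string" },
    };

    public String getEventMsg()              { return eventMsg; }
    public void setEventMsg(String value)    { setValue("eventMsg", value); }

    public String getResponseMsg()           { return responseMsg; }
    public void setResponseMsg(String value) { setValue("responseMsg", value); }
}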
CRADLE supports a flexible iterative process for visual design and code generation. In fact, a design change might require substantial reimplementation GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 181 selecting one, the UI activates the metadata operations box—figure 6(D). The selected metadata node will then be presented in the lower (metadata operations) box, labeled “set MetaData Values,” replacing the default “None” value as shown in figure 6. After the metadata item is presented, the user can edit its value and save it by clicking on the “set value” button. The associated action saves the metadata information and causes its display in the intermediate box (tree-like structure), changing the visualization according to the new values. The code generation process for the Do_Search and Front Desk services is based on XDoclet templates. In particular, a message listener template is used to generate the Java code for the Front Desk service. In fact, the Front Desk service is asynchronous and manages communica- tions between actors. The actors classes are generated also by using the services templates since they have attributes, events, and messages, just like the services. The Do_Search service code is based on the producer and consumer templates, since it is synchronous by defini- tion in the modeled scenario. A get method retrieving a collection of documents is implemented from the getter template. The routine invoked by the transformation action for struct entities performs a breadth-first exploration of the metadata tree in the visual model and attaches the cor- responding XUL code for displaying the struct node in the correct position within the graph structure of the UI. collections, while a single rectangle connected to a collection represents a document entity; the circles linked to the document entity are the struct (metadata) entities. Metadata entities are linked to the node rela- tionships (organized as a tree) and linked to the document entity by a metadata LinkType relationship. The search service is synchro- nous (sync attribute set to “wait”). It queries the document collec- tion (get operation) looking for the requested document (using meta- data information provided by the borrow request), and waits for the result of get (a collection of docu- ments). Based on this result, the service returns a Boolean message “Is_Available,” which is then propa- gated as a response to the librarian and eventually to the student, as shown in figure 5. When the library designer has built the model, the transformation process can be run, executing the code generation actions associated with the entities and services represented in the model. The code generation process is based on template code snippets generated from the AToM3 environment graph transformation engine, following the generative rules of the metamodel. We also use pre– and postconditions on application of transformation rules to have code genera- tion depend on verification of some property. The generated UI is presented in figure 6. On the right side, the document area is presented according to the XUL template. Documents are managed according to their MIME type: the PDF file of the example is loaded with the appropriate Adobe Acrobat Reader plug-in. On the left column of the UI are three boxes, according to the XUL template. 
The collection box—figure 6(B)— presents the list of documents contained in the collection specified by the documents attribute of the library collec- tion entity, and allows users to interact with documents. After selecting a document by clicking on the list, it is presented in the document area—figure 6(A)—where it can be managed (edit, print, save, etc.). In the metadata box—figure 6(C)—the tree structure of the metadata is depicted according to the categoriza- tion modeled by the designer. The XUL template contains all the basic layout and action features for managing a tree structure. The generated box contains the parent and child nodes according to the attributes specified in the corresponding struct elements. The user can click on the root for compacting or exploding the tree nodes; by Figure 5. The Library model, alias the model of the Library society 182 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 workflow system. The Release collection maintains the image files in a permanent storage, while data is written to the target database or content management software, together with XML metadata snippets (e.g., to be stored in XML native DBMS). A typical configuration would have the Recognition service running on a server cluster, with many Data- Entry services running on different clients (Web browsers directly support XUL interfaces). Whereas current docu- ment capture environments are proprietary and closed, the definition of an XML-based interchange format allows the suitable assembly of different component-based tech- nologies in order to define a complex framework. The realization of the JDAN DL system within the CRADLE framework can be considered as a preliminary step in the direction of a standard multimedia document managing platform with region segmentation and clas- sification, thus aiming at automatic recognition of image database and batch acquisition of multiple multimedia documents types and formats. Personal and collaborative spaces A personal space is a virtual area (within the DL society) that is modeled as being owned and maintained by a user including resources (document collections, services, etc.), or references to resources, which are relevant to a task, or set of tasks, the user needs to carry out in the DL. Personal spaces may thus contain digital documents in multiple media, personal schedules, visualization tools, and user agents (shaped as services) entitled with various tasks. Resources within personal spaces can be allocated ■■ Designing and Generating Advanced Collaborative DL Systems In this section we show the use of CRADLE as an analyti- cal tool helpful in comprehending specific DL phenomena, to present the complex interplays that occur between CRADLE components and DL concepts in a real DL appli- cation, and to illustrate the possibility of using CRADLE as a tool to design and generate advanced tools for DL development. Modeling Document images collections With CRADLE, the designer can provide the visual model of the DL Society involved in document management and the remaining phases are automatically carried out by CRADLE modules and templates. We have provided the user with basic code templates for the recognition and indexing services, the data-entry plug-in, and archive release. The designer can thus simply translate the par- ticular DL society into the corresponding visual model within the CRADLE visual modeling editor. 
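One way to read the division of labor just listed (recognition and indexing services, a data-entry plug-in, and archive release) is as a set of service contracts that the generated code has to satisfy. The sketch below is purely illustrative: none of these Java types exist in CRADLE or JDAN, and each interface simply names the single responsibility attributed to that service in the surrounding description.

import java.util.List;

public interface DocumentImageServicesSketch {

    // Hypothetical data carriers for a scanned page and its classified regions.
    record Region(String id, String type, String value, boolean interpreted) {}
    record DocumentImage(String label, byte[] scan, List<Region> regions) {}

    // Segments a scanned page, classifies its regions, and stores the result in the archive.
    interface RecognitionService {
        DocumentImage recognize(byte[] scannedPage);
    }

    // Lets an operator review and correct regions the recognizer could not interpret.
    interface DataEntryService {
        DocumentImage review(DocumentImage withUninterpretedRegions);
    }

    // Turns archived fragments into an indexed, queryable form.
    interface IndexingService {
        void index(DocumentImage document);
    }

    // Releases final images and XML metadata snippets to permanent storage.
    interface ReleaseService {
        void release(DocumentImage document);
    }
}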
As a proof of concept, figure 7 models the JDAN architecture, introduced in "Requirements for Modeling Digital Libraries," exploiting the CRADLE visual language. The Recognition Service performs the automatic document recognition and stores the corresponding document images, together with the extracted metadata, in the Archive collection. It interacts with the Scanner actor, representing a machine or a human operator that scans paper documents. Designers can choose their own segmentation method or algorithm; what is required to be compliant with the framework is to produce an XDoclet template. The Recognition service stores the document images in the Archive collection, together with the layout information of their different regions, according to the XML metadata schema provided by the designer. If there is at least one region marked as "not interpreted," the Data-Entry service is invoked on the "not interpreted" regions. The Data-Entry service allows Operators to evaluate the automatic classification performed by the system and edit the segmentation for indexing. Operators can also edit the recognized regions with the classification engine (included in the Recognition service) and adjust their values and sizes. The output of this phase is an XML description that will be imported into the Indexing service for indexing (and eventually querying).

The Archive collection stores all of the basic information kept in JDAN, such as text labels, while the Indexing service, based on a multitier architecture exploiting JBoss 3.0, has access to them. This service is responsible for turning the data fragments in the Archive collection into useful forms to be presented to the final users, e.g., a report or a query result. The final stage in the recognition process could be to release each document to a content management or workflow system. The Release collection maintains the image files in permanent storage, while data is written to the target database or content management software, together with XML metadata snippets (e.g., to be stored in a native XML DBMS). A typical configuration would have the Recognition service running on a server cluster, with many Data-Entry services running on different clients (Web browsers directly support XUL interfaces). Whereas current document capture environments are proprietary and closed, the definition of an XML-based interchange format allows the suitable assembly of different component-based technologies in order to define a complex framework. The realization of the JDAN DL system within the CRADLE framework can be considered a preliminary step toward a standard multimedia document management platform with region segmentation and classification, thus aiming at automatic recognition of image databases and batch acquisition of multiple multimedia document types and formats.

Figure 7. The CRADLE model for the JDAN framework
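The routing rule for partially recognized documents is stated only in prose above. The following is a small illustrative sketch of that check; the Region, RecognizedDocument, and service interfaces are hypothetical placeholders, not part of JDAN's published API.

```java
import java.util.List;

/** Hypothetical, simplified view of one recognized page region. */
record Region(String id, String label, boolean interpreted) { }

record RecognizedDocument(String archiveId, List<Region> regions) { }

interface DataEntryService {
    /** Lets an Operator review and correct the given regions. */
    void review(String archiveId, List<Region> uninterpretedRegions);
}

interface IndexingService {
    /** Imports the XML description produced after recognition or review. */
    void importXml(String archiveId, String xmlDescription);
}

class RecognitionDispatcher {
    private final DataEntryService dataEntry;
    private final IndexingService indexing;

    RecognitionDispatcher(DataEntryService dataEntry, IndexingService indexing) {
        this.dataEntry = dataEntry;
        this.indexing = indexing;
    }

    /** If at least one region is marked "not interpreted", invoke the
     *  Data-Entry service on those regions; otherwise hand the XML
     *  description straight to the Indexing service. */
    void dispatch(RecognizedDocument doc, String xmlDescription) {
        List<Region> pending = doc.regions().stream()
                .filter(r -> !r.interpreted())
                .toList();
        if (!pending.isEmpty()) {
            dataEntry.review(doc.archiveId(), pending);
        } else {
            indexing.importXml(doc.archiveId(), xmlDescription);
        }
    }
}
```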
Personal and Collaborative Spaces

A personal space is a virtual area (within the DL society) that is modeled as being owned and maintained by a user, including resources (document collections, services, etc.), or references to resources, which are relevant to a task, or set of tasks, the user needs to carry out in the DL. Personal spaces may thus contain digital documents in multiple media, personal schedules, visualization tools, and user agents (shaped as services) entrusted with various tasks. Resources within personal spaces can be allocated according to the user's role. For example, a conference chair would have access to conference-specific materials, visualization tools, and interfaces to upload papers for review by a committee. Similarly, we denote a group space as a virtual area in which library users (the entire DL society) can meet to conduct collaborative activities synchronously or asynchronously. Explicit group spaces are created dynamically by a designer or facilitator who becomes (or appoints) the owner of the space and defines who the participants will be. In addition to direct user-to-user communication, users should be able to access library materials and make annotations on them for every other group to see. Ideally, users should be able to act (and carry DL materials with them) between personal and group spaces or among group spaces to which they belong. It may also be the case, however, that a given resource is referenced in several personal or group spaces. Basic functionality required for personal spaces includes capabilities for viewing, launching, and monitoring library services, agents, and applications. Like group spaces, personal spaces should provide users with the means to easily become aware of other users and resources that are present in a given group space at any time, as well as mechanisms to communicate with other users and make annotations on library resources. We employed this personal and group space paradigm in modeling a collaborative environment in the Academic Conferences domain, where a Conference Chair can have a personal view of the document collections (resources) and metadata, but also can share information with the various committees collaborating for certain tasks.

■■ Evaluation

In this section we evaluate the presented approach from three different perspectives: usability of the CRADLE notation, its expressiveness, and usability of the generated DLs.

Usability of the CRADLE Notation

We tested the notation using the well-known Cognitive Dimensions framework for notations and visual language design.48 The dimensions are usually employed to evaluate the usability of a visual language or notation, or as heuristics to drive the design of innovative visual languages. The significant results are as follows.

Abstraction Gradient

An abstraction is a grouping of elements to be treated as one entity. In this sense, CRADLE is abstraction-tolerant. It provides entities for high-level abstractions of communication processes and services. These abstractions are intuitive, as they are visualized as the process they represent (services with events and responses), and easy to learn, as their configuration implies few simple attributes. Although CRADLE does not allow users to build new abstractions, the E/R formalism is powerful enough to provide basic abstraction levels.

Closeness of Mapping

CRADLE elements have been assigned icons to resemble their real-world counterparts (e.g., a collection is represented as a set of paper sheets). The elements that do not have a correspondence with a physical object in the real world have icons borrowed from well-known notations (e.g., structs represented as graph nodes).

Consistency

A notation is consistent if a user knowing some of its structure can infer most of the rest. In CRADLE, when two elements represent the same entity but can be used either as input or as output, their shapes are equal but incorporate an incoming or an outgoing message in order to differentiate them. See, for example, the icons for services or those for graph nodes representing either a struct or an actor, with different colors.
Diffuseness/Terseness

A notation is diffuse when many elements are needed to express one concept. CRADLE is terse rather than diffuse because each entity expresses a meaning on its own.

Error-Proneness

Data flow visualization reduces the chance of errors at a first level of the specification. On the other hand, some mistakes can be introduced when specifying visual entities, since it is possible to express relations between source and target models that cannot generate semantically correct code. However, these mistakes should be considered "programming errors more than slips," and may be detected through progressive evaluation.

Hidden Dependencies

A hidden dependency is a relation between two elements that is not visible. In CRADLE, relevant dependencies are represented as data flows via directed links.

Progressive Evaluation

Each DL model can be tested as soon as it is defined, without having to wait until the whole model is finished. The visual interface for the DL can be generated with just one click, and services can be subsequently added to test their functionalities.

Viscosity

CRADLE has a low viscosity because making small changes in a part of a specification does not imply lots of readjustments in the rest of it. One can change properties, events, or responses, and these changes will have only local effect. The only local changes that could imply performing further changes by hand are deleting entities or changing names; however, this would imply minimal changes (just removing or updating references to them) and would only affect a small set of subsequent elements in the same data flow.

Visibility

A DL specification consists of a single set of diagrams fitting in one window. Empirically, we have observed that this model usually involves no more than fifteen entities. Different, independent CRADLE models can be simultaneously shown in different windows.

of "Sapienza" University of Rome (undergraduate students), shown in figure 5, and (2) an application employed with a project of Records Management in a collaboration between the Computer Science and Computer Engineering Departments of "Sapienza" University, as shown in figure 7.

Usability of the Generated Tools

Environments for single-view languages generated with AToM3 have been extensively used, mostly in an academic setting, in areas such as software and Web engineering, modeling and simulation, and urban planning. However, depending on the kind of domain, generating the results may take some time. For instance, the state reachability analysis in the DL example takes a few minutes; we are currently employing a version of AToM3 that includes the Petri net formalism, with which we can test the reachability of service states.49 In general, from application experience, we note the general agreement that automated syntactical consistency support greatly simplifies the design of complex systems. Finally, some users pointed out technical limitations of the current implementation, such as the fact that it is not possible to open several views at a time.

Altogether, we believe this work contributes to making the definition and maintenance of environments for DL systems more efficient and less tedious. Our model-based approach must be contrasted with the programming-centric approach of most CASE tools, where the language and the code generation tools are hard-coded, so that whenever a modification has to be made (whether on the language or on the semantic domain), developers have to dive into the code.

■■ Conclusions and Future Work

DLs are complex information systems that integrate findings from disciplines such as hypertext, information retrieval, multimedia, databases, and HCI. DL design is often a multidisciplinary effort, including library staff and computer scientists. Wasted effort and poor interoperability can therefore ensue. Examining the related bibliography, we noted that there is a lack of tools or automatic systems for designing and developing cooperative DL systems. Moreover, there is a need for modeling interactions between DLs and users, such as scenario or activity-based approaches. The CRADLE framework fills this gap by providing a model-driven approach for generating visual interaction tools for DLs, supporting design and automatic generation of code for DLs. In particular, we use a metamodel made of different diagram types (collection, structures, service, and

Expressiveness of CRADLE

The paper has illustrated the expressiveness of CRADLE by defining different entities and relationships for different DL requisites.
To this end, two different applications have been considered: (1) a basic example elaborated with the collaboration of the Information Science School GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 185 Retrieval (Reading, Mass.: Addison-Wesley, 1999). 17. D. Lucarella and A. Zanzi, “A Visual Retrieval Environ- ment for Hypermedia Information Systems,” ACM Transactions on Information Systems 14 (1996): 3–29. 18. B. Wang, “A Hybrid System Approach for Supporting Digital Libraries,” International Journal on Digital Libraries 2 (1999): 91–110,. 19. D. Castelli, C. Meghini, and P. Pagano, “Foundations of a Multidimensional Query Language for Digital Libraries,” in Proc. ECDL ’02, LNCS 2458 (Berlin: Springer, 2002): 251–65. 20. R. N. Oddy et al., eds., Proc. Joint ACM/BCS Symposium in Information Storage & Retrieval (Oxford: Butterworths, 1981). 21. K. Maly, M. Zubair et al., “Scalable Digital Libraries Based on NCSTRL/DIENST,” in Proc. ECDL ’00 (London: Springer, 2000): 168–79. 22. R. Tansley, M. Bass and M. Smith, “DSpace as an Open Archival Information System: Current Status and Future Direc- tions,” Proc. ECDL ’03, LNCS 2769 (Berlin: Springer, 2003): 446–60. 23. K. M. Anderson et al., “Metis: Lightweight, Flexible, and Web-Based Workflow Services for Digital Libraries,” Proc. 3rd ACM/IEEE-CS JCDL ’03 (Los Alamitos, Calif.: IEEE Computer Society, 2003): 98–109. 24. N. Dushay, “Localizing Experience of Digital Content via Structural Metadata,” In Proc. 2nd ACM/IEEE-CS JCDL ’02 (New York: ACM, 2002): 244–52. 25. M. Gogolla et al., “Integrating the ER Approach in an OO Environment,” Proc. ER, ’93 (Berlin: Springer, 1993): 376–89. 26. Heidi Gregersen and Christian S. Jensen, “Temporal Entity-Relationship Models—A Survey,” IEEE Transactions on Knowledge & Data Engineering 11 (1999): 464–97. 27. B. Berkem, “Aligning IT with the Changes using the Goal-Driven Development for UML and MDA,” Journal of Object Technology 4 (2005): 49–65. 28. A. Malizia, E. Guerra, and J. de Lara, “Model-Driven Development of Digital Libraries: Generating the User Interface,” Proc. MDDAUI ’06, http://sunsite.informatik.rwth-aachen.de/ Publications/CEUR-WS/Vol-214/ (accessed Oct 18, 2010). 29. D. L. Atkins et al., “MAWL: A Domain-Specific Language for Form-Based Services,” IEEE Transactions on Software Engineer- ing 25 (1999): 334–46. 30. J. de Lara and H. Vangheluwe, “AToM3: A Tool for Multi-Formalism and Meta-Modelling,” Proc. FASE ’02 (Berlin: Springer, 2002): 174–88. 31. J. M. Morales-Del-Castillo et al., “A Semantic Model of Selective Dissemination of Information for Digital Libraries,” Journal of Information Technology & Libraries 28 (2009): 21–30. 32. N. Santos, F. C. A. Campos, and R. M. M. Braga, “Dig- ital Libraries and Ontology,” in Handbook of Research on Digital Libraries: Design, Development, and Impact, ed. Y.-L. Theng et al. (Hershey, Pa.: Idea Group, 2008): 1:19. 33. F. Wattenberg, “A National Digital Library for Science, Mathematics, Engineering, and Technology Education,” D-Lib Magazine 3 no. 10 (1998), http://www.dlib.org/dlib/october98/ wattenberg/10wattenberg.html (accessed Oct 18, 2010); L. L. Zia, “The NSF National Science, Technology, Engineering, and Mathematics Education Digital Library (NSDL) Program: New Projects and a Progress Report,” D-lib Magazine, 7, no. 11 (2002), http://www.dlib.org/dlib/november01/zia/11zia.html (accessed Oct 18, 2010). 34. U.S. 
Library of Congress, Ask a Librarian, http://www.loc society), which describe the different aspects of a DL. We have built a code generator able to produce XUL code from the design models for the DL user interface. Moreover, we use template code generation integrating predefined components for the different services (XDoclet language) according to the model specification. Extensions of CRADLE with behavioral diagrams and the addition of analysis and simulation capabilities are under study. These will exploit the new AToM3 capabili- ties for describing multiview DSVLs, to which this work directly contributed. References 1. A. M. Gonçalves, E. A Fox, “5SL: a language for declara- tive specification and generation of digital libraries,” Proc. JCDL ’02 (New York: ACM, 2002): 263–72. 2. L. Candela et al., “Setting the Foundations of Digital Libraries: The DELOS Manifesto,” D-Lib Magazine 13 (2007), http://www.dlib.org/dlib/march07/castelli/03castelli.html (accessed Oct 18, 2010). 3. A. Malizia et al., “A Cooperative-Relational Approach to Digital Libraries,” Proc. ECDL 2007, LNCS 4675 (Berlin: Springer, 2007): 75–86. 4. E. A. Fox and G. Marchionini, “Toward a Worldwide Dig- ital Library,” Communications of the ACM 41 (1998): 29–32. 5. M. A. Gonçalves et al., “Streams, Structures, Spaces, Scenarios, Societies (5s): A Formal Model for Digital Libraries,” ACM Transactions on Information Systems 22 (2004): 270–312. 6. J. C. R. Licklider, Libraries of the Future (Cambridge, Mass.: MIT Pr., 1965). 7. D. M. Levy and C. C. Marshall, “Going Digital: A Look at Assumptions Underlying Digital Libraries,” Communications of the ACM 38 (1995): 77–84. 8. R. Reddy and I. Wladawsky-Berger, “Digital Librar- ies: Universal Access to Human Knowledge—A Report to the President,” 2001, www.itrd.gov/pubs/pitac/pitac-dl-9feb01.pdf (accessed Mar. 16, 2010). 9. E. L. Morgan, “MyLibrary: A Digital Library Framework and Toolkit,” Journal of Information Technology & Libraries 27 (2008): 12–24. 10. T. R. Kochtanek and K. K. Hein, “Delphi Study of Digital Libraries,” Information Processing Management 35 (1999): 245–54. 11. S. E. Howe et al., “The President’s Information Technology Advisory Committee’s February 2001 Digital Library Report and Its Impact,” In Proc. JCDL ’01 (New York: ACM, 2001): 223–25. 12. N. Reyes-Farfan and J. A. Sanchez, “Personal Spaces in the Context of OA,” Proc. JCDL ’03 (IEEE Computer Society, 2003): 182–83. 13. M. Wirsing, Report on the EU/NSF Strategic Workshop on Engineering Software-Intensive Systems, 2004, http://www.ercim. eu/EU-NSF/sis.pdf (accessed Oct 18, 2010) 14. S. Kelly and J.-P. Tolvanen, Domain-Specific Modeling: Enabling Full Code Generation (Hoboken, N.J.: Wiley, 2008). 15. H. R. Turtle and W. Bruce Croft, “Evaluation of an Infer- ence Network-Based Retrieval Model,” ACM Transactions on Information Systems 9 (1991): 187–222. 16. R. A. Baeza-Yates, B. A. Ribeiro-Neto, Modern Information 186 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 .mozilla.org/En/XUL (accessed Mar. 16, 2010). 43. XDoclet, Welcome! What is XDoclet? http://xdoclet .sourceforge.net/xdoclet/index.html (accessed Mar. 16, 2010). 44. W3C, Extensible Markup Language (XML) 1.0 (Fifth Edition), http://www.w3.org/TR/2008/REC-xml-20081126/ (accessed Mar. 16, 2010); W3C, Resource Description Framework (RDF), http://www.w3.org/RDF/ (accessed Mar. 16, 2010). 45. H. Wada and J. 
Suzuki, “Modeling Turnpike Frontend System: A Model-Driven Development Framework Leveraging UML Metamodeling and Attribute-Oriented Programming,” Proc. MoDELS ’05, LNCS 3713 (Berlin: Springer, 2005): 584–600. 46. I. Horrocks, Constructing the User Interface with Statecharts (Boston: Addison-Wesley, 1999). 47. Universal Discover, Description, and Integration OASIS Standard, Welcome to UDDI XML.org, http://uddi.xml.org/ (accessed Mar. 16, 2010). 48. T. R. G. Green and M. Petre, “Usability Analysis of Visual Programming Environments: A ‘Cognitive Dimensions Frame- work,’” Journal of Visual Languages & Computing 7 (1996): 131–74. 49. J. de Lara, E. Guerra, and A. Malizia, “Model Driven Development of Digital Libraries—Validation, Analysis and For- mal Code Generation,” Proc. 3rd WEBIST ’07 (Berlin: Springer, 2008). .gov/rr/askalib/ (accessed on Mar. 16, 2010). 35. C. L. Borgmann, “What are Digital Libraries? Competing Visions,” Information Processing & Management 25 (1999):227–43. 36. C. Lynch, “Coding with the Real World: Heresies and Unexplored Questions about Audience, Economics, and Con- trol of Digital Libraries,” In Digital Library Use: Social Practice in Design and Evaluation, ed. A. P. Bishop, N. A. Van House, and B. Buttenfield (Cambridge, Mass.: MIT Pr., 2003): 191–216. 37. Y. Ioannidis et al., “Digital Library Information-Technol- ogy Infrastructure,” International Journal of Digital Libraries 5 (2005): 266–74. 38. E. A. Fox et al., “The Networked Digital Library of Theses and Dissertations: Changes in the University Community,” Jour- nal of Computing Higher Education 13 (2002): 3–24. 39. H. Van de Sompel and C. Lagoze, “Notes from the Inter- operability Front: A Progress Report on the Open Archives Ini- tiative,” Proc. 6th ECDL, 2002, LNCS 2458 (Berlin: Springer 2002): 144–57. 40. F. De Rosa et al., “JDAN: A Component Architecture for Digital Libraries,” DELOS Workshop: Digital Library Architectures, (Padua, Italy: Edizioni Libreria Peogetto, 2004): 151–62. 41. Defined as a set of actors (users) playing roles and inter- acting with services. 42. Mozilla Developer Center, XUL, https://developer 3129 ---- GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | visser AND BAll 187 Marijke Visser and Mary Alice Ball The Middle Mile: The Role of the Public Library in Ensuring Access to Broadband of fundamentally altering culture and society. In some circles the changes happen in real time as new Web-based applications are developed, adopted, and integrated into the user’s daily life. These users are the early adopters; the Internet cognoscenti. Second tier users appreciate the availability of online resources and use a mix of devices to access Internet content but vary in the extent to which they try the latest application or device. The third tier users also vary in the amount they access the Internet but have generally not embraced its full potential, from not seeking out readily available resources to not connecting at all.1 Regardless of the degree to which they access the Internet, all of these users require basic technology skills and a robust underlying infrastructure. Since the introduction of Web 2.0, the number and type of participatory Web-based applications has continued to grow. Many people are eagerly taking part in creating an increasing variety of Web-based content because the basic tools to do so are widely available. The amateur, creating and sharing for primarily personal reasons, has the ability to reach an audience of unprecedented size. 
In turn, the Internet audience, or virtual audience, can select from a vast menu of formats, including multimedia and print. With print resources disappearing, it is increasingly likely for an individual to only be able to access necessary material online. Web-based resources are unique in that they enable an undetermined number of people, person- ally connected or complete strangers, to interact with and manipulate the content thereby creating something new with each interaction and subsequent iteration. Many of these new resources and applications require much more bandwidth than traditional print resources. With the necessary technology no longer out of reach, a cross- section of society is affecting the course the twenty-first century is taking vis à vis how information is created, who can create it, and how we share it.2 In turn, who can access Web-based content and who decides how it can be accessed become critical questions to answer. As people become more adept at using Web-based tools and eager to try new applications, the need for greater broadband will intensify. The economic downturn is having a marked effect on people’s Internet use. If there was a preexisting problem with inadequate access to broadband, current circumstances exacerbate it to where it needs immediate attention. Access to broadband Internet today increases This paper discusses the role of the public library in ensuring access to the broadband communication that is so critical in today’s knowledge-based society. It examines the culture of information in 2010, and then asks what it means if individuals are online or not. The paper also explores current issues surrounding telecommunications and policy, and finally seeks to understand the role of the library in this highly technological, perpetually connected world. I n the last twenty years library collections have evolved from being predominantly print-based to ones that have a significant digital component. This trend, which has a direct impact on library services, has only accelerated with the advent of Web 2.0 technologies and participa- tory content creation. Cutting-edge libraries with next generation catalogs encourage patrons to post reviews, contribute videos, and write on library blogs and wikis. Even less adventuresome institutions offer a variety of electronic databases licensed from multiple publishers and vendors. The piece of these library portfolios that is at best ignored and at worst vilified is the infrastructure that enables Internet connectivity. In 2010, broadband telecommunication is recognized as essential to access the full range of information resources. Telecommunications experts articulate their concerns about the digital divide by focusing on first- and last-mile issues of bringing fiber and cable to end users. The library, particularly the public library, represents the metaphorical middle mile provid- ing the public with access to rich information content. Equally important, it provides technical knowledge, sub- ject matter expertise, and general training and support to library users. This paper discusses the role of the public library in ensuring access to the broadband communication that is so critical in today’s knowledge-based society. It examines the culture of information in 2010, and then asks what it means if individuals are online or not. 
The paper also explores current issues surrounding telecommunications and policy, and finally seeks to understand the role of the library in this highly technological, perpetually connected world. ■■ The Culture of Information Information today is dynamic. As the Internet contin- ues on its fast paced, evolutionary track, what we call ‘information’ fluctuates with each emerging Web-based technology. Theoretically a democratic platform, the Internet and its user-generated content is in the process Marijke visser (mvisser@alawash.org) is information Technol- ogy Policy analyst and Mary Alice Ball (maryaliceball@yahoo .com) former chair, Telecommunications Subcommittee, office for information Technology Policy, american library association, washington, Dc. 188 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 The geographical location of a community will also influ- ence what kind of Internet service is available because of deployment costs. These costs are typically reflected in varying prices to consumers. In addition to the physical layout of an area, current federal telecommunications policies limit the degree to which incentives can be used on the local level.7 Encouraging competition between ISPs, including municipal electric utilities, incumbent local exchange carriers, and national cable companies, for example, requires coordination between local needs and state and federal policies. Such coordinated efforts are inherently difficult when taking into consideration the numerous differences between locales. Ultimately, though, all of these factors influence the price end users must pay for Internet access. With necessary infrastructure and telecommunica- tions policies in place, there are individual behaviors that also affect broadband adoption. According to the Pew study, “Home Broadband Adoption 2008,” 62 percent of dial-up users are not interested in switching to broad- band.8 Clearly there is a segment of the population that has not yet found personal relevance to high-speed access to online resources. In part this may be because they only have experience with dial-up connections. Depending on dial-up gives the user an inherently inferior experi- ence because bandwidth requirements to download a document or view a website with multimedia features automatically prevent these users from accessing the same resources as a user with a high-speed connection. A dial-up user would not necessarily be aware of this differ- ence. If this is the only experience a user has it might be enough to deter broadband adoption, especially if there are other contributing factors like lack of technical com- fort or availability of relevant content. Motivation to use the Internet is influenced by the extent to which individuals find content personally rel- evant. Whether it is searching for a job and filling out an application, looking at pictures of grandchildren, using Skype to talk to a family member deployed in Iraq, researching healthcare providers, updating a personal webpage, or streaming video, people who do these things have discovered personally relevant Internet content and applications. Understanding the potential relevance of going online makes it more likely that someone would experiment with other applications, thus increasing both the familiarity with what is available and the comfort level with accessing it. Without relevant content, there is little motivation for someone not inclined to experiment with Internet technology to cross what amounts to a sig- nificant hurdle to adoption. 
Anthony Wilhelm argues in a 2003 article discussing the growing digital divide that culturally relevant content is critical in increasing the likelihood that non-users will want to access Web-based resources.9 The scope of the issue of providing culturally relevant content is underscored in the 2008 Pew study, the amount of information and variety of formats avail- able to the user. In turn more content is being distributed as users create and share original content.3 Businesses, nonprofits, municipal agencies, and educational institu- tions appreciate that by putting their resources online they reach a broader segment of their constituency. This approach to reaching an audience works provided the constituents have their own access to the materials, both physically and intellectually. It is one thing to have an Internet connection and another to have the skill set nec- essary to make productive use of it. As reported in Job-Seeking in U.S. Public Libraries in 2009, “less than 44% of the top 100 U.S. retailers accept in- store paper applications.”4 Municipal, state, and federal agencies are increasingly putting their resources online, including unemployment benefit applications, tax forms, and court documents.5 In addition to online documents, the report finds social service agencies may encourage clients to make appointments and apply for state jobs online.6 Many of the processes that are now online require an ability to navigate the complexities of the Internet at the same time as navigating difficult forms and websites. The combination of the two can deter someone from retrieving necessary resources or successfully completing a critical procedure. While early adopters and policy-makers debate the issues surrounding Internet access, the other strata of society, knowingly or not and to varying degrees, are enmeshed in the outcomes of these ongoing discussions because their right to information is at stake. ■■ Barriers to Broadband Access By condensing Internet access issues to focus on the availability of adequate and sustainable broadband, it is possible to pinpoint four significant barriers to access: price, availability, perceived relevance, and technical skill level. The first two barriers are determined by existing telecommunications infrastructure as well as local, state, and federal telecommunications policies. The latter barri- ers are influenced by individual behaviors. Both divisions deserve attention. If local infrastructure and the Internet service provider (ISP) options do not support broadband access to all areas within its boundaries, the result will be that some commu- nity members can have broadband services at home while others must rely on work or public access computers. It is important to determine what kind of broadband services are available (e.g., cable, DSL, fiber, satellite) and if they are robust enough to support the activities of the commu- nity. Infrastructure must already be in place or there must be economic incentive for ISPs to invest in improving current infrastructure or in installing new infrastructure. GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | visser AND BAll 189 at all. Success hinges on understanding that each com- munity is unique, on leveraging its strengths, and on ameliorating its weaknesses. Local government can play a significant role in the availability of broadband access. 
From a municipal per- spective, emphasizing the role of broadband as a factor in economic development can help define how the munici- pality should most effectively advocate for broadband deployment and adoption. Gillett offers four initiatives appropriate for stimulating broadband from a local view- point. Municipal governments can ■■ become leaders in developing locally relevant Internet content and using broadband in their own services; ■■ adopt policies that make it easier for ISPs to offer broadband; ■■ subsidize broadband users and/or ISPs; or ■■ become involved in providing the infrastructure or services themselves.12 Individually or in combination these four initiatives underscore the fact that government awareness of the possibilities for community growth made possible by broadband access can lead to local government sup- port for the initiatives of other local agencies, including nonprofit, municipal, or small businesses. Agencies part- nering to support community needs can provide evidence to local policy makers that broadband is essential for com- munity success. Once the municipality sees the potential for social and economic development, it is more likely to support policies that stimulate broadband buildout. Building strong local partnerships will set the stage for the development of a sustainable broadband initiative as the different stakeholders share perspectives that take into account a variety of necessary components. When the time comes to implement a strategy, not only will different perspectives have been included, the plan will have champions to speak for it: the government, ISPs, public and private agencies, and community members. It is important to know which constituents are already engaged in supporting community broadband initiatives and which should be tapped. The ultimate purpose in establishing broadband Internet access in a community is to benefit the individual community members, thereby stimulating local economic development. Key players need to represent agencies that recognize the individual voice. A 2004 study led by Strover provides an example of the importance of engaging local community leaders and agencies in developing a successful broadband access project.13 The study looked at thirty-six communities that received state funding to establish community technology centers (CTC). It addressed the effective use and manage- ment of CTCs and called attention to the inadequacy of supplying the hardware without community support which found that of the 27 percent of adult Americans who are not Internet users, 33 percent report they are not interested in going online.10 That Pew can report similar information five years after the Wilhelm article identifies a barrier to equitable access that has not been adequately resolved. ■■ Models for Sustainable Broadband Availability In discussing broadband, the question of what constitutes broadband inevitably arises. Gillett, Lehr, and Osoria, in “Local Government Broadband Initiatives,” offers a functional definition: “access is ‘broadband’ if it repre- sents a noticeable improvement over standard dial-up and, once in place, is no longer perceived as the limit- ing constraint on what can be done over the Internet.”11 While this definition works in relationship to dial-up, it is flexible enough to apply to all situations by focusing on “a noticeable improvement” and “no longer perceived as the limiting constraint” (added emphasis). Ensuring sustainable broadband access necessitates anticipating future demand. 
Short sighted definitions, applicable at a set moment in time, limit long-term viability of alterna- tive solutions. Devising a sustainable solution calls for careful scru- tiny of alternative models, because the stakes are so high in the broadband debate. There are many different play- ers involved in constructing information policies. This does not mean, however, that their perspectives are mutu- ally exclusive. In debates with multiple perspectives, it is important to involve stakeholders who are aligned with the ultimate goal: assuring access to quality broadband to anyone going online. What is successful for one community may be entirely inappropriate in another; designing a successful system requires examining and comparing a range of scenarios. Existing circumstances may predetermine a particular starting point, but one first step is to evaluate best prac- tices currently in place in a variety of communities to come up with a plan that meets the unique criteria of the community in question. Sustainable broadband solutions need to be developed with local constituents in mind and successful solutions will incorporate the realities of cur- rent and future local technologies and infrastructure as well as local, state, and federal information policies. Presupposing that the goal is to provide the commu- nity with the best possible option(s) for quality broadband access, these are key considerations to take into account when devising the plan. In addition to the technologi- cal and infrastructure issues, within a community there will be a combination of ways people access the Internet. There will be those who have home access, those who need public access, and those who do not seek access 190 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 the current emphasis on universal broadband depends on selecting the best of the alternative plans according to carefully vetted criteria in order to develop a flexible and forward-thinking course of action. Can we let people remain without access to robust broadband and the necessary skill set to use it effectively? No. As more and more resources critical to basic life tasks are accessible only online, those individuals that face challenges to going online will likely be socially and economically disadvantaged when compared to their online counterparts. Recognition of this poten- tial for intensifying digital divide is recognized in the Federal Communication Commission’s (FCC) National Broadband Plan (NBP) released in March 2010.18 The NBP states six national broadband goals, the third of which is “Every American should have affordable access to robust broadband service, and the means and skills to subscribe if they so choose.”19 Research conducted for the recom- mendations in the NBP was comprehensive in scope including voices from industry, public interest, academia, and municipal and state government. Responses to more than thirty public notices issued by the FCC provide evidence of wide concern from a variety of perspectives that broadband access should become ubiquitous if the United States is to be a competitive force in the twenty- first century. Access to essential information such as govern- ment, public safety, educational, and economic resources requires a broadband connection to the Internet. It is incumbent on government officials, ISPs, and community organizations to share ideas and resources to achieve a solution for providing their communities with robust and sustainable broadband. 
It is not necessary to have all users up to par with the early adopters. There is not a one-size-fits-all approach to wanting to be connected, nor is there a one-size-fits-all solution to providing access. What is important is that an individual can go online via a robust, high-speed connection that meets that indi- vidual’s needs at that moment. What this means for finding solutions is ■■ there needs to be a range of solutions to meet the needs of individual communities; ■■ they need to be flexible enough to meet the evolv- ing needs of these communities as applications and online content continue to change; and ■■ they must be sustainable for the long term so that the community is prepared to meet future needs that are as yet unknown. Solutions to providing broadband Internet access will be most successful when they are designed starting at the local level. Community needs vary according to local demographics, geography, existing infrastructure, types of service providers, and how state and federal systems in place. Users need a support system that high- lights opportunities available via the Internet and that provides help when they run into problems. Access is more than providing the infrastructure and hardware. The potential users must also find content that is cultur- ally relevant in an environment that supports local needs and expectations. Strover found the most successful CTCs were located in places that “actively attracted people for other social and entertaining reasons.”14 In other words, the CTCs did not operate in a vacuum devoid of social context. Successful adoption of the CTCs as a resource for information was dependent on the targeted population finding culturally relevant content in a supportive envi- ronment. An additional point made in the study showed that without strong community leadership, there was not significant use of the CTC even when placed in an already established community center.15 This has signifi- cant implications for what constitutes access as libraries plan broadband initiatives. Investments in technology and a national commit- ment to ensure universal access to these new technologies in the 1990s provide the current policy framework. As suggested by Wilhelm in 2003, to continue to move for- ward the national agenda needs to focus on updating policies to fit new information circumstances as they arise. Today’s information policy debates should empha- size a similar focus. Beyond accelerating broadband deployment into underserved areas, Wilhelm suggests there needs to be support for training and content devel- opment that guarantees communities will actually use and benefit from having broadband deployed in their area.16 Technology training and support for local agencies that provide the public with Internet access, as well as opportunities for the individuals themselves, is essential if policies are going to actually lead to useful broadband adoption. Individual and agency Internet access and adoption require investment beyond infrastructure; they depend on having both culturally relevant content and the information literacy skills necessary to benefit from it. ■■ Finding the Right Solution Though it may have taken an economic crisis to bring broadband discussions into the living room, the result is causing renewed interest in a long-standing issue. 
Many states have formed broadband task forces or councils to address the lack of adequate broadband access at the state level and, on the national front, broadband was a key component of the American Recovery and Reinvestment Act of 2009.17 The issue changes as technologies evolve but the underlying tenet of providing people access to the information and resources they need to be produc- tive members of society is the same. What becomes of GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | visser AND BAll 191 difficult to measure, these kinds of social and cultural capital are important elements in ongoing debates about uses and consequences of broadband access. An ongoing challenge for those interested in the social, economic, and policy consequences of modern information networks will be to keep up with changing notions of what it means to be connected in cyberspace.”20 The social contexts in which a broadband plan will be enacted influence the appropriateness of different scenarios and should help guide which ones are imple- mented. Engaging a variety of stakeholders will increase the likelihood of positive outcomes as community mem- bers embrace the opportunities provided by broadband Internet access. It is difficult, however, to anticipate the outcomes that may occur as users become more familiar with the resources and achieve a higher level of comfort with technology. Ramirez states, The “unexpected outcomes” section of many evalua- tion reports tends to be rich with anecdotes . . . . The unexpected, the emergent, the socially constructed innovations seem to be, to a large extent, off the radar screen, and yet they often contain relevant evidence of how people embrace technology and how they inno- vate once they discover its potential.21 Community members have the most to gain from having broadband Internet access. Including them will increase the community’s return on its investment as they take advantage of the available resources. Ramirez sug- gests that “participatory, learning, and adaptive policy approaches” will guide the community toward develop- ing communication technology policies that lead to a vibrant future for individuals and community alike.22 As success stories increase, the aggregation of local commu- nities’ social and economic growth will lead to a net sum gain for the nation as a whole. ■■ The Role of the Library Public libraries play an important role in providing Internet access to their community members. According to a 2008 study, the public library is the only outlet for no-fee Internet access in 72.5 percent of communities nationwide; in rural communities the number goes up to 82.0 percent.23 Beyond having desktop or, in some cases, wireless access, public libraries offer invaluable user support in the form of technical training and locally relevant content. Libraries provide a secondary commu- nity resource for other local agencies who can point their clients to the library for no-fee Internet access. In today’s economy where anecdotal reports show an increase in library use, particularly Internet use, the role of the public policies mesh with local ordinances. Local stakeholders best understand the complex interworking of their com- munity and are aware of who should be included in the decision-making process. Including a local perspective will also increase the likelihood that as community needs change, new issues will be brought to the attention of policy makers and agencies who advocate for the indi- vidual community members. 
Community agencies that already are familiar with local needs, abilities, and expectations are logical groups to be part of developing a successful local broadband access strategy. The library exemplifies a community resource whose expertise in local issues can inform infor- mation policy discussions on local, state, and federal levels. As a natural extension of library service, libraries offer the added value support necessary for many users to successfully navigate the Internet. The library is an estab- lished community hub for informational resources and provides dedicated staff, technology training opportuni- ties, and no-fee public access computers with an Internet connection. Libraries in many communities are creating locally relevant Web-based content as well as linking to other community resources on their own websites. Seeking a partnership with the local library will augment a community broadband initiative. It is difficult to appreciate the impacts of current information technologies because they change so rap- idly there is not enough time to realistically measure the effects of one before it is mixed in with a new innovation. With Web-based technologies there is a lag time between what those in the front of the pack are doing online and what those in the rear are experiencing. While there is general consensus that broadband Internet access is critical in promoting social and economic development in the twenty-first century as is evidenced by the national purposes outlined in the NBP, there is not necessarily agreement on benchmarks for measuring the impacts. Three anticipated outcomes of providing community access to broadband are ■■ civic participation will increase; ■■ communities will realize economic growth; and ■■ individual quality of life will improve. When a strategy involves significant financial and energy investments there is a tendency to want palpable results. The success of providing broadband access in a community is challenging to capture. To achieve a level of acceptable success it is necessary to focus on local communities and aggregate anecdotal evidence of incre- mental changes in public welfare and economic gain. Acceptable success is subjective at best but can be usefully defined in context of local constituencies. Referring to participation in the development of a vibrant culture, Horrigan notes that “while inherently 192 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 isolation. An individual must possess skills to navigate the online resources. As users gain an understanding of the potential personal growth and opportunities broad- band yields, they will be more likely to seek additional online resources. By stimulating broadband use, the library will contribute to the social and economic health of the community. If the library is to extend its role as the information hub in the community by providing no-fee access to broadband to anyone who walks through the door, the local community must be prepared to support that role. It requires a commitment to encourage build out of appro- priate technology necessary for the library to maintain a sustainable Internet connection. It necessitates that local communities advocate for national information and com- munication policies that are pro-library. When public policy supports the library’s efforts, the local community benefits and society at large can progress. What if the library’s own technology needs are not met? 
The role of the library in its community is becoming increasingly important as more people turn to it for their Internet access. Without sufficient revenue, the library will have a difficult time meeting this additional demand for services. In turn, in many libraries increased demand for broadband access stretches the limit of IT support for both the library staff and the patrons needing help at the computers. What will be the fallout from the library not being able to provide Internet services the patrons desire and require? Will there be a growing skills difference between people who adopt emerging technologies and incorporate them into their daily lives and those who maintain the technological status quo? What will the social impact be of remaining off line either completely or only marginally? Can the library be the bridge between those on the edge, those in the middle, and those at the end? With a strong and well articulated vision for the future, the library can be the link that provides the com- munity with sustainable broadband. ■■ Conclusion The recent national focus on universal broadband access has provided an opportunity to rectify a lapse in effective information policy. Whether the goal includes facilitating meaningful access continues to be more elusive. As gov- ernment, organizations, businesses, and individuals rely more heavily on the Internet for sharing and receiving information, broadband Internet access will continue to increase in importance. Following the status quo will not necessarily lead to more people having broadband access in the long run. The early adopters will continue to stimu- late technological innovation which, in turn, will trickle down the ranks of the different user types. Currently, library as a stable Internet provider cannot be overesti- mated. To maintain its vital function, however, the library must also resolve infrastructure challenges of its own. Because of the increased demand for access to Internet resources, public libraries are finding their current broad- band services are not able to support the demand of their patrons. The issues are two-fold: increased patron use means there are often neither sufficient workstations nor broadband speeds to meet patron demand. In 2008, about 82.5 percent of libraries reported an insufficient number of public workstations, and about 57.5 percent reported insufficient broadband speeds.24 To add to these already significant issues, the report indicates libraries are having trouble supporting the necessary information technology (IT) because of either staff time constraints or the lack of a dedicated IT staff.25 Public libraries are facing consider- able infrastructure management issues at a time when library use is increasing. Overcoming the challenges successfully will require support on the local, state, and federal level. Here is where the librarian, as someone trained to become inherently familiar with the needs of her local constituency and ethically bound to provide access to a variety of information resources, needs to insert herself into the debate. Librarians need to be ahead of the crowd as the voice that assures content will be readily accessible to those who seek it. Today, the elemental policy issue regarding access to information via the Internet hinges on connectivity to a sustainable broadband network. To promote equitable broadband access, the librarian needs be aware of the pertinent information policies in place or under consideration, and be able to anticipate those in the future. 
Additionally, she will need to educate local policy makers about the need for broadband in their com- munity. In some circumstances, the librarian will need to move beyond her local community and raise awareness of community access issues on the state and federal level. The librarian is already able to articulate numerous issues to a variety of stakeholders and can transfer this skill to advocate for sustainable broadband strategies that will succeed in her local community. There are many strata of Internet users, from those in the forefront of early adoption to those not interested in being online at all. The early adopters drive the market which responds by making resources more and more likely to be primarily available only online. As we con- tinue this trend, the social repercussions increase from merely not being able to access entertainment and news to being unable to participate in the knowledge-based society of the twenty-first century. By folding in added value online access for the community, the library helps increase the likelihood that the community will benefit from broadband being available to the library patrons and by extension to the community as a whole. To realize the Internet’s full potential, access to it cannot be provided in GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | visser AND BAll 193 community, the entire community benefits regardless of where and how the individuals go online. The effects of the Internet are now becoming broadly social enough that there is a general awareness that the Internet is not decoration on contemporary society but a challenge to it.28 Being connected is no longer an optional luxury; to engage in the twenty-first century it is essential. Access to the Internet, however, is more than simple connectivity. Successful access requires: an understanding of the ben- efits to going on line, technological comfort, information literacy, ongoing support and training, and the availabil- ity of culturally relevant content. People are at various levels of Internet use, from those eagerly anticipating the next iteration of Web-based applications to those hesitant to open an e-mail account. This user spectrum is likely to continue. Though the starting point may vary depending on the applications that become important to the user in the middle of the spectrum, there will be those out in front and those barely keeping up. The implications of the pervasiveness of the Internet are only beginning to be appreciated and understood. Because of their involvement at the cutting edge of Internet evolution, librarians can help lead the conver- sations. Libraries have always been situated in neutral territory within their communities and closely aligned with the public good. Librarians understand the per- spective of their patrons and are grounded in their local communities. Librarians can therefore advocate effectively for their communities on issues that may not completely be understood or even recognized as matter- ing. Connectivity is an issue supremely important to the library as today access to the full range of information necessitates a broadband connection. Libraries have carved out a role for themselves as a premier Internet access provider in the continually evolving online culture. 
As noted by Bertot, McClure, and Jaeger, the “role of Internet access provider for the community is ingrained in the social perceptions of public libraries, and public Internet access has become a central part of community perceptions about libraries and the value of the library profession.”29 In times of both economic crisis and technological innovation, there are many unknowns. In part because of these two juxtaposed events, the role of the public library is in flux. Additionally, the network of community orga- nizations that libraries link to is becoming more and more complex. It is a time of great opportunity if the library can articulate its role and frame it in relationship to broader society. Evolving Internet applications require increasing amounts of bandwidth and the trend is to make these bandwidth-heavy applications more and more vital to daily life. One clear path the library community can take however, the supply of Internet resources is unevenly stimulating user demand and the unequal distribution of broadband access has greater potential for significant negative social consequences. Staying the course and fol- lowing a haphazard evolution of broadband adoption, may, in fact, renew valid concerns about a digital divide. Without an intentional and coordinated approach to developing a broadband strategy, its success is likely to fall short of expectations. The question of how to ensure that Internet content is meaningful requires instituting a plan on a very local level, including stakeholders who are familiar with the unique strengths and weaknesses of their community. Strover, in her 2000 article The First Mile, suggests connectivity issues should be viewed from a first mile perspective where the focus is on the person accessing the Internet and her qualitative experience rather than from a last mile perspective which emphasizes ISP, infra- structure, and market concerns.26 Both perspectives are talking about the same physical section of the connection network: the piece that connects the user to the network. According to Strover, distinguishing between the first mile and last mile perspectives is more than an arbitrary argument over semantics. Instead, a first mile perspective represents a shift “in the values and priorities that shape telecommunications policy.”27 By switching to a first mile perspective, connectivity issues immediately take into account the social aspects of what it means to be online. Who will bring this perspective to the table? And how will we ascertain what the best approach to supporting the individual voice should be? The first mile perspective is one the library is inti- mately familiar with as an organization that traditionally advocates for the first mile of all information policies. The library is in a key position in the connectivity debate because of its inclination to speak for the user and to be aware of the unique attributes and needs of its local community. As part of its mission, the library takes into account the distinctive needs of its user community when it designs and implements its services. A natural outgrowth of this practice is to be keenly aware of the demographics of the community at large. The library can leverage its knowledge and understanding to create an even greater positive impact on the social, educational, and economic community development made possible by broadband adoption. 
To extend the first mile perspective analogy, in the connectivity debate, the library will play the role of the middle mile: the support system that suc- cessfully connects the Internet to the consumer. While the target populations for stimulating demand for broadband are really those in the second tier of users, by advocating for the first mile perspective, the library will be advocating for equitable information policies whose implementation has bearing on the early adopters as well. By stimulating demand for broadband within a 194 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 Initiatives,” 538. 12. Ibid., 537–58. 13. Sharon Strover, Gary Chapman, and Jody Waters, “Beyond Community Networking and CTCs: Access, Development, and Public Policy,” Telecommunications Policy 28, no. 7/8 (2004): 465–85. 14. Ibid., 483. 15. Ibid. 16. Wilhelm, “Leveraging Sunken Investments in Communi- cations Infrastructure,” 282. 17. See, for example, the Virginia Broadband Round Table (http://www.otpba.vi.virginia.gov/broadband_roundtable .shtml), the Ohio Broadband Council (http://www.ohiobroad bandcouncil.org/), and the California Broadband Task Force (http://gov.ca.gov/speech/4596. See www.fcc.gov/recovery/ broadband/) for information on broadband initiatives in the American Recovery and Reinvestment Act. 18. Federal Communication Commission, National Broad- band Plan: Connecting America, http://www.broadband.gov/ (accessed Apr. 11, 2010). 19. Ibid. 20. Horrigan, “Broadband: What’s All the Fuss About?” 2. 21. Ricardo Ramirez, “Appreciating the Contribution of Broadband ICT with Rural and Remote Communities: Stepping Stones toward and Alternative Paradigm,” The Information Soci- ety 23 (2007): 86. 22. Ibid., 92. 23. Denise M. Davis, John Carlo Bertot, and Charles, R. McClure, “Libraries Connect Communities: Public Library Funding & Technology Access Study 2007–2008,” 35, http:// www.ala.org/ala/aboutala/offices/ors/plftas/0708/Libraries ConnectCommunities.pdf (accessed Jan. 24, 2009). 24. John Carlo Bertot et al., “Public Libraries and the Internet 2008: Study Results and Findings,” 11, http://www.ii.fsu.edu/ projectFiles/plinternet/2008/Everything.pdf (accessed Jan. 24, 2009). These numbers represent an increase from the previous year’s study which suggests that libraries while trying to meet demand are not able to keep up. 25. Ibid. 26. Sharon Strover, “The First Mile,” The Information Society 16, no. 2 (2000): 151–54. 27. Ibid., 151. 28. Clay Shirky, “Here Comes Everybody: The Power of Organizing without Organizations.” Berkman Center for Inter- net & Society (2008). Video presentation. Available at http:// cyber.law.harvard.edu/interactive/events/2008/02/shirky (Retrieved March 1, 2009). 29. John Carlo Bertot, Charles R. McClure, and Paul T. Jaeger, “The Impacts of Free Public Internet Access on Public Library Patrons and Communities,” Library Quarterly 78, no. 3 (2008): 286, http://www.journals.uchicago.edu.proxy.ulib.iupui.edu/ doi/pdf/10.1086/588445 (accessed Jan. 30, 2009). is to develop its role as the middle mile connecting the increasing breadth of Internet resources to the general public. The broadband debate has moved out of the background of telecommunication policy and into the center of public attention. Now is the moment that calls for an information policy advocate who can represent the end user while understanding the complexity of the other stakeholder perspectives. 
The library undoubtedly has its own share of stakeholders, but over time it is an institution that has maintained a neutral stance within its community, thereby achieving a unique ability to speak for all parties. Those who speak for the library are able to represent the needs of the public, work with a diverse group of stakeholders, and help negotiate a sustainable strategy for providing broadband Internet access. References and notes 1. Lee Rainie, “2.0 and the Internet World,” Internet Librar- ian 2007, http://www.pewinternet.org/Presentations/2007/20 -and-the-Internet-World.aspx (accessed Mar. 4, 2009). See also John Horrigan, “A Typology of Information and Communication Technology Users,” 2007, www.pewinternet.org/~/media// Files/Reports/2007/PIP_ICT_Typology.pdf.pdf (accessed Feb. 12, 2009). 2. Lawrence Lessig, “Early Creative Commons His- tory, My Version,” video blog post, 2008, http://lessig.org/ blog/2008/08/early_creative_commons_history.html (accessed Jan. 20, 2009). See the relevant passage from 20:53 through 21:50. 3. John Horrigan, “Broadband: What’s All the Fuss About?” 2007, p. 1, http://www.pewinternet.org/~/media/ Files/Reports/2007/BroadBand%20Fuss.pdf.pdf (accessed Feb. 12, 2009). 4. “Job-Seeking in US Public Libraries,” Public Library Fund- ing & Technology Access Study, 2009, http://www.ala.org/ ala/research/initiatives/plftas/issuesbriefs/brief_jobs_july.pdf (accessed Mar. 27, 2009). 5. Ibid. 6. Ibid. 7. Sharon E. Gillett, William H. Lehr, and Carlos Osorio, “Local Government Broadband Initiatives,” Telecommunications Policy 28 (2004): 539. 8. John Horrigan, “Home Broadband Adoption 2008,” 10, http://www.pewinternet.org/~/media//Files/Reports/2008/ PIP_Broadband_2008.pdf (accessed Feb. 12, 2009). 9. Anthony G. Wilhelm, “Leveraging Sunken Investments in Communications Infrastructure: A Policy Perspective from the United States,” The Information Society 19 (2003): 279–86. 10. Horrigan, “Home Broadband Adoption,” 12. 11. Gillett, Lehr, and Osorio, “Local Government Broadband 3130 ---- GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | HilerA et Al. 195 José R. Hilera, Carmen Pagés, J. Javier Martínez, J. Antonio Gutiérrez, and Luis de-Marcos An Evolutive Process to Convert Glossaries into Ontologies dictionary, the outcome will be limited by the richness of the definition of terms included in that dictionary. It would be what is normally called a “lightweight” ontol- ogy,6 which could later be converted into a “heavyweight” ontology by implementing, in the form of axioms, know- ledge not contained in the dictionary. This paper describes the process of creating a lightweight ontology of the domain of software engineering, starting from the IEEE Standard Glossary of Software Engineering Terminology.7 ■■ Ontologies, the Semantic Web, and Libraries Within the field of librarianship, ontologies are already being used as alternative tools to traditional controlled vocabularies. This may be observed particularly within the realm of digital libraries, although, as Krause asserts, objections to their use have often been raised by the digital library community.8 One of the core objections is the difficulty of creating ontologies as compared to other vocabularies such as taxonomies or thesauri. Nonetheless, the semantic richness of an ontology offers a wide range of possibilities concerning indexing and searching of library documents. 
The term ontology (used in philosophy to refer to the “theory about existence”) has been adopted by the artificial intelligence research community to define a cate- gorization of a knowledge domain in a shared and agreed form, based on concepts and relationships, which may be formally represented in a computer readable and usable format. The term has been widely employed since 2001, when Berners-Lee et al. envisaged the Semantic Web, which aims to turn the information stored on the Web into knowledge by transforming data stored in every webpage into a common scheme accepted in a specific domain.9 To accomplish that task, knowledge must be represented in an agreed-upon and reusable computer-readable format. To do this, machines will require access to structured collections of information and to formalisms which are based on mathematical logic that permits higher levels of automatic processing. Technologies for the Semantic Web have been devel- oped by the World Wide Web Consortium (W3C). The most relevant technologies are RDF (Resource Description This paper describes a method to generate ontologies from glossaries of terms. The proposed method presupposes an evolutionary life cycle based on successive transforma- tions of the original glossary that lead to products of intermediate knowledge representation (dictionary, tax- onomy, and thesaurus). These products are characterized by an increase in semantic expressiveness in comparison to the product obtained in the previous transformation, with the ontology as the end product. Although this method has been applied to produce an ontology from the “IEEE Standard Glossary of Software Engineering Terminology,” it could be applied to any glossary of any knowledge domain to generate an ontology that may be used to index or search for information resources and documents stored in libraries or on the Semantic Web. F rom the point of view of their expressiveness or semantic richness, knowledge representation tools can be classified at four levels: at the basic level (level 0), to which dictionaries belong, tools include defini- tions of concepts without formal semantic primitives; at the taxonomies level (level 1), tools include a vocabulary, implicit or explicit, as well as descriptions of specialized relationships between concepts; at the thesauri level (level 2), tools further include lexical (synonymy, hyperonymy, etc.) and equivalence relationships; and at the reference models level (level 3), tools combine the previous relation- ships with other more complex relationships between concepts to completely represent a certain knowledge domain.1 Ontologies belong at this last level. According to the hierarchic classification above, knowledge representation tools of a particular level add semantic expressiveness to those in the lowest levels in such a way that a dictionary or glossary of terms might develop into a taxonomy or a thesaurus, and later into an ontology. There are a variety of comparative studies of these tools,2 as well as varying proposals for systematically generating ontologies from lower-level knowledge repre- sentation systems, especially from descriptor thesauri.3 This paper proposes a process for generating a termino- logical ontology from a dictionary of a specific knowledge domain.4 Given the definition offered by Neches et al. 
(“an ontology is an instrument that defines the basic terms and relations comprising the vocabulary of a topic area as well as the rules for combining terms and relations to define extensions to the vocabulary”)5 it is evident that the ontology creation process will be easier if there is a vocabulary to be extended than if it is developed from scratch. If the developed ontology is based exclusively on the José r. Hilera (jose.hilera@uah.es) is Professor, carmen Pagés (carmina.pages@uah.es) is assistant Professor, J. Javier Mar- tínez (josej.martinez@uah.es) is Professor, J. Antonio Gutiér- rez (jantonio.gutierrez@uah.es) is assistant Professor, and luis de-Marcos (luis.demarcos@uah.es) is Professor, Department of computer Science, Faculty of librarianship and Documentation, university of alcalá, Madrid, Spain. 196 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 configuration management; data types; errors, faults, and failures; evaluation techniques; instruction types; language types; libraries; microprogramming; operating systems; quality attributes; software documentation; soft- ware and system testing; software architecture; software development process; software development techniques; and software tools.15 In the glossary, entries are arranged alphabetically. An entry may consist of a single word, such as “software,” a phrase, such as “test case,” or an acronym, such as “CM.” If a term has more than one definition, the definitions are numbered. In most cases, noun definitions are given first, followed by verb and adjective definitions as applicable. Examples, notes, and illustrations have been added to clarify selected definitions. Cross-references are used to show a term’s relations with other terms in the dictionary: “contrast with” refers to a term with an opposite or substantially different mean- ing; “syn” refers to a synonymous term; “see also” refers to a related term; and “see” refers to a preferred term or to a term where the desired definition can be found. Figure 2 shows an example of one of the definitions of the glossary terms. Note that definitions can also include Framework),10 which defines a common data model to specify metadata, and OWL (Ontology Web Language),11 which is a new markup language for publishing and sharing data using Web ontologies. More recently, the W3C has presented a proposal for a new RDF-based markup system that will be especially useful in the con- text of libraries. It is called SKOS (Simple Knowledge Organization System), and it provides a model for expressing the basic structure and content of concept schemes, such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other simi- lar types of controlled vocabularies.12 The emergence of the Semantic Web has created great interest within librarianship because of the new possibili- ties it offers in the areas of publication of bibliographical data and development of better indexes and better displays than those that we have now in ILS OPACs.13 For that rea- son, it is important to strive for semantic interoperability between the different vocabularies that may be used in libraries’ indexing and search systems, and to have com- patible vocabularies (dictionaries, taxonomies, thesauri, ontologies, etc.) based on a shared standard like RDF. There are, at the present time, several proposals for using knowledge organization systems as alternatives to controlled vocabularies. 
For example, folksonomies, though originating within the Web context, have been proposed by different authors for use within libraries “as a powerful, flexible tool for increasing the user-friendliness and inter- activity of public library catalogs.”14 Authors argue that the best approach would be to create interoperable controlled vocabularies using shared and agreed-upon glossaries and dictionaries from different domains as a departure point, and then to complete evolutive processes aimed at semantic extension to create ontologies, which could then be com- bined with other ontologies used in information systems running in both conventional and digital libraries for index- ing as well as for supporting document searches. There are examples of glossaries that have been transformed into ontologies, such as the Cambridge Healthtech Institute’s “Pharmaceutical Ontologies Glossary and Taxonomy” (http://www.genomicglossaries.com/content/ontolo gies.asp), which is an “evolving terminology for emerging technologies.” ■■ IEEE Standard Glossary of Software Engineering Terminology To demonstrate our proposed method, we will use a real glossary belonging to the computer science field, although it is possible to use any other. The glossary, available in electronic format (PDF), defines approxi- mately 1,300 terms in the domain of software engineering (figure 1). Topics include addressing assembling, compil- ing, linking, loading; computer performance evaluation; Figure 1. Cover of the Glossary document GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | HilerA et Al. 197 4. Define the classes and the class hierarchy 5. Define the properties of classes (slots) 6. Define the facets of the slots 7. Create instances As outlined in the Introduction, the ontology devel- oped using our method is a terminological one. Therefore we can ignore the first two steps in Noy’s and McGuinness’ process as the concepts of the ontology coincide with the terms of the glossary used. Any ontology development process must take into account the basic stages of the life cycle, but the way of organizing the stages can be different in different meth- ods. In our case, since the ontology has a terminological character, we have established an incremental develop- ment process that supposes the natural evolution of the glossary from its original format (dictionary or vocabu- lary format) into an ontology. The proposed life cycle establishes a series of steps or phases that will result in intermediate knowledge representation tools, with the final product, the ontology, being the most semantically rich (figure 4). Therefore this is a product-driven process, in which the aim of every step is to obtain an intermediate product useful on its own. The intermediate products and the final examples associated with the described concept. In the resulting ontology, the examples were included as instances of the corresponding class. In figure 2, it can be seen that the definition refers to another glossary on programming languages (Std 610.13), which is a part of the series of dic- tionaries related to computer science (“IEEE Std 610,” figure 3). Other glossaries which are men- tioned in relation to some references about term definitions are 610.1, 610.5, 610.7, 610.8, and 610.9. To avoid redundant definitions and pos- sible inconsistencies, links must be implemented between ontologies developed from those glossa- ries that include common concepts. 
The ontology generation process presented in this paper is meant to allow for integration with other ontolo- gies that will be developed in the future from the other glossaries. In addition to the explicit references to other terms within the glossary and to terms from other glos- saries, the textual definition of a concept also has implicit references to other terms. For example, from the phrase “provides features designed to facilitate expression of data structures” included in the definition of the term high order language (figure 2), it is possible to determine that there is an implicit relationship between this term and the term data structure, also included in the glossary. These relationships have been considered in establishing the properties of the concepts in the developed ontology. ■■ Ontology Development Process Many ontology development methods presuppose a life cycle and suggest technologies to apply during the pro- cess of developing an ontology.16 The method described by Noy and McGuinness is helpful when beginning this process for the first time.17 They establish a seven-step process: 1. Determine the domain and scope of the ontology 2. Consider reusing existing ontologies 3. Enumerate important terms in the ontology Figure 2. Example of term definition in the IEEE Glossary Figure 3. IEEE Computer Science Glossaries 610—Standard Dictionary of Computer Terminology 610.1—Standard Glossary of Mathematics of Computing Terminology 610.2—Standard Glossary of Computer Applications Terminology 610.3—Standard Glossary of Modeling and Simulation Terminology 610.4—Standard Glossary of Image Processing Terminology 610.5—Standard Glossary of Data Management Terminology 610.6—Standard Glossary of Computer Graphics Terminology 610.7—Standard Glossary of Computer Networking Terminology 610.8—Standard Glossary of Artificial Intelligence Terminology 610.9—Standard Glossary of Computer Security and Privacy Terminology 610.10—Standard Glossary of Computer Hardware Terminology 610.11—Standard Glossary of Theory of Computation Terminology 610.12—Standard Glossary of Software Engineering Terminology 610.13—Standard Glossary of Computer Languages Terminology high order language (HOL). A programming language that requires little knowledge of the computer on which a program will run, can be translated into several difference machine languages, allows symbolic naming of operations and addresses, provides features designed to facilitate expression of data structures and program logic, and usually results in several machine instructions for each program state- ment. Examples include Ada, COBOL, FORTRAN, ALGOL, PASCAL. Syn: high level language; higher order language; third gen- eration language. Contrast with: assembly language; fifth generation language; fourth generation language; machine language. Note: Specific languages are defined in P610.13 198 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 Since there are terms with different meanings (up to five in some cases) in the IEEE Glossary of Software Engineering Terminology, during dictionary development we decided to create different concepts (classes) for the same term, associating a number to these concepts to differentiate them. 
For example, there are five different definitions for the term test, which is why there are five concepts (Test1–Test5), corresponding to the five meanings of the term: (1) An activity in which a system or compo- nent is executed under specified conditions, the results are observed or recorded, and an evaluation is made of some aspect of the system or component; (2) To conduct an activity as in (1); (3) A set of one or more test cases; (4) A set of one or more test procedures; (5) A set of one or more test cases and procedures. taxonomy The proposed lifecycle establishes a stage for the con- version of a dictionary into a taxonomy, understanding taxonomy as an instrument of concepts categorization, product are a dictionary, which has a formal and computer processed structure, with the terms and their definitions in XML format; a taxonomy, which reflects the hierarchic rela- tionships between the terms; a thesaurus, which includes other relationships between the terms (for example, the synonymy relationship); and, finally, the ontology, which will include the hierarchy, the basic relationships of the the- saurus, new and more complex semantic relationships, and restrictions in form of axioms expressed using description logics.18 The following paragraphs describe the way each of these products is obtained. Dictionary The first step of the proposed development process con- sists of the creation of a dictionary in XML format with all the terms included in the IEEE Standard Glossary of Software Engineering Terminology and their related defini- tions. This activity is particularly mechanical and does not need human intervention as it is basically a transfor- mation of the glossary from its original format (PDF) into a format better suited to the development process. All formats considered for the dictionary are based on XML, and specifically on RDF and RDF schema. In the end, we decided to work with the standards DAML+OIL and OWL,19 though we are not opposed to working with other languages, such as SKOS or XMI,20 in the future. (In the latter case, it would be possible to model the intermediate products and the ontology in UML graphic models stored in xml files.)21 In our project, the design and implementation of all products has been made using an ontology editor. We have used OilEd (with OilViz Plugin) as editor, both because of its simplicity and because it allows the exportation to OWL and DAML formats. However, with future maintenance and testing in mind, we decided to use Protégé (with OWL plugin) in the last step of the process, because this is a more flexible environment with extensible mod- ules that integrate more functionality such as ontology annotation, evaluation, middleware service, query and inference, etc. Figure 5 shows the dictionary entry for “high order language,” which appears in figure 2. Note that the dic- tionary includes only owl:class (or daml:class) to mark the term; rdf:label to indicate the term name; and rdf:comment to provide the definition included in the original glossary. Figure 4. Ontology development process HighOrderLanguage Figure 5. Example of dictionary entry GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | HilerA et Al. 199 example, when analyzing the definition of the term com- piler: “(Is) A computer program that translates programs expressed in a high order language into their machine language equivalent,” it is possible to deduce that com- piler is a subconcept of computer program, which is also included in the glossary.) 
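To make the dictionary format concrete, a minimal entry of the kind figure 5 illustrates for the term high order language might look roughly as follows. This is a sketch based only on the description above (one OWL class per glossary term, a label carrying the term name, and a comment carrying the unmodified IEEE definition); the rdf:RDF wrapper, the placeholder xml:base URI, and the rdfs: prefixes on the label and comment properties are assumptions added here to make the fragment self-contained, not the published figure.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/ontoglose">
  <!-- one glossary term becomes one OWL class; the label holds the term name
       and the comment holds the definition text copied from the glossary -->
  <owl:Class rdf:ID="HighOrderLanguage">
    <rdfs:label>high order language</rdfs:label>
    <rdfs:comment>A programming language that requires little knowledge of the
      computer on which a program will run, can be translated into several
      different machine languages, allows symbolic naming of operations and
      addresses, provides features designed to facilitate expression of data
      structures and program logic, and usually results in several machine
      instructions for each program statement.</rdfs:comment>
  </owl:Class>
</rdf:RDF>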
In addition to the lexical or syn- tactic analysis, it is necessary for an expert in the domain to perform a semantic analysis to complete the develop- ment of the taxonomy. The implementation of the hierarchical relation- ships among the concepts is made using rdfs:subClassOf, regardless of whether the taxonomy is implemented in OWL or DAML format, since both languages specify this type of relationship in the same way. Figure 6 shows an example of a hierarchical relationship included in the definition of the concept pictured in figure 5. thesaurus According to the International Organization for Standardization (ISO), a thesaurus is “the vocabulary of a controlled indexing language, formally organized in order to make explicit the a priori relations between concepts (for example ‘broader’ and ‘narrower’).”25 This definition establishes the lexical units and the semantic relationships between these units as the elements that constitute a the- saurus. The following is a sample of the lexical units: ■■ Descriptors (also called “preferred terms”): the terms used consistently when indexing to represent a con- cept that can be in documents or in queries to these documents. The ISO standard introduces the option of adding a definition or an application note to every term to establish explicitly the chosen meaning. This note is identified by the abbreviation SN (Scope Note), as shown in figure 7. ■■ Non-descriptors (“non-preferred terms”): the syn- onyms or quasi-synonyms of a preferred term. A nonpreferred term is not assigned to documents submitted to an indexing process, but is provided as an entry point in a thesaurus to point to the appropri- ate descriptor. Usually the descriptors are written in capital letters and the nondescriptors in small letters. ■■ Compound descriptors: the terms used to represent complex concepts and groups of descriptors, which allow for the structuring of large numbers of thesau- rus descriptors into subsets called micro-thesauri. In addition to lexical units, other fundamental elements of a thesaurus are semantic relationships between these units. The more common relationships between lexical units are the following: ■■ Equivalence: the relationship between the descrip- tors and the nondescriptors (synonymous and that is, as a systematical classification in a traditional way. As Gilchrist states, there is no consensus on the meaning of terms like taxonomy, thesaurus, or ontology.22 In addi- tion, much work in the field of ontologies has been done without taking advantage of similar work performed in the fields of linguistics and library science.23 This situa- tion is changing because of the increasing publication of works that relate the development of ontologies to the development of “classic” terminological tools (vocabular- ies, taxonomies, and thesauri). This paper emphasizes the importance and useful- ness of the intermediate products created at each stage of the evolutive process from glossary to ontology. The end product of the initial stage is a dictionary expressed as XML. The next stage in the evolutive process (figure 4) is the transformation of that dictionary into a tax- onomy through the addition of hierarchical relationships between concepts. To do this, it is necessary to undertake a lexical- semantic analysis of the original glossary. This can be done in a semiautomatic way by applying natural language processing (NLP) techniques, such as those recommended by Morales-del-Castillo et al.,24 for creat- ing thesauri. 
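As a sketch of the taxonomy step just described (figure 6 in the original), the is-a relationship deduced for high order language could be recorded by adding an rdfs:subClassOf statement to the class created in the dictionary phase. The class name ProgrammingLanguage and the placeholder base URI are illustrative assumptions; the mechanism (rdfs:subClassOf, identical in OWL and DAML) is the one named in the text.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/ontoglose">
  <owl:Class rdf:ID="ProgrammingLanguage"/>
  <!-- the taxonomy phase adds the hierarchical (is-a) link deduced from the
       lexical and semantic analysis of the term names and definitions -->
  <owl:Class rdf:ID="HighOrderLanguage">
    <rdfs:subClassOf rdf:resource="#ProgrammingLanguage"/>
  </owl:Class>
</rdf:RDF>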
The basic processing sequence in linguistic engineering comprises the following steps: (1) incorpo- rate the original documents (in our case the dictionary obtained in the previous stage) into the information sys- tem; (2) identify the language in which they are written, distinguishing independent words; (3) “understand” the processed material at the appropriate level; (4) use this understanding to transform, search, or traduce data; (5) produce the new media required to present the produced outcomes; and finally, (6) present the final outcome to human users by means of the most appropriate periph- eral device—screen, speakers, printer, etc. An important aspect of this process is natural lan- guage comprehension. For that reason, several different kinds of programs are employed, including lemmatizers (which implement stemming algorithms to extract the lexeme or root of a word), morphologic analyzers (which glean sentence information from their constituent ele- ments: morphemes, words, and parts of speech), syntactic analyzers (which group sentence constituents to extract elements larger than words), and semantic models (which represent language semantics in terms of concepts and their relations, using abstraction, logical reasoning, orga- nization and data structuring capabilities). From the information in the software engineering dictionary and from a lexical analysis of it, it is possible to determine a hierarchical relationship when the name of a term contains the name of another one (for example, the term language and the terms programming language and hardware design language), or when expressions such as “is a” linked to the name of another term included in the glossary appear in the text of the term definition. (For 200 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 indicating that high order language relates to both assembly and machine languages. The life cycle proposed in this paper (figure 4) includes a third step or phase that transforms the taxonomy obtained in the previous phase into a thesaurus through the incorporation of relationships between the concepts that complement the hierarchical relations included in the taxonomy. Basically, we have to add two types of relation- ships—equivalence and associative, represented in the standard thesauri with UF (and USE) and RT respectively. We will continue using XML to implement this new product. There are different ways of implementing a thesaurus using a language based on XML. For example, Matthews et al. proposed a standard RDF format,26 where as Hall created an ontology in DAML.27 In both cases, the authors modeled the general structure of quasi-synonymous). ISO establishes that the abbrevia- tion UF (Used For) precedes the nondescriptors linked to a descriptor; and the abbreviation USE is used in the opposite case. For example, a thesaurus developed from the IEEE glossary might include a descriptor “high order language” and an equivalence relationship with a nondescriptor “high level language” (figure 7). ■■ Hierarchical: a relationship between two descrip- tors. In the thesaurus one of these descriptors has been defined as superior to the other one. There are no hierarchical relationships between nondescrip- tors, nor between nondescriptors and descriptors. A descriptor can have no lower descriptors or several of them, and no higher descriptors or several of them. 
According to the ISO standard, hierarchy is expressed by means of the abbreviations BT (Broader Term), to indicate the generic or higher descriptors, and NT (Narrower Term), to indicate the specific or lower descriptors. The term at the head of the hierarchy to which a term belongs can be included, using the abbreviation TT (Top Term). Figure 7 presents these hierarchical relationships. ■■ Associative: a reciprocal relationship that is estab- lished between terms that are neither equivalent nor hierarchical, but are semantically or conceptually associated to such an extent that the link between them should be made explicit in the controlled vocabulary on the grounds that it may suggest additional terms for use in indexing or retrieval. It is generally indicated by the abbreviation RT (Related Term). There are no associative relationships between nondescriptors and descriptors, or between descriptors already linked by a hierarchical relation. It is possible to establish associative relationships between descriptors belonging to the same or differ- ent category. The associative relationships can be of very different types. For example, they can represent causality, instrumentation, location, similarity, origin, action, etc. Figure 7 shows two associative relations, .. HIGH ORDER LANGUAGE (descriptor) SN A programming language that... UF High level language (no-descriptor) UF Third generation language (no-descriptor) TT LANGUAGE BT PROGRAMMING LANGUAGE NT OBJECT ORIENTED LANGUAGE NT DECLARATIVE LANGUAGE RT ASSEMBLY LANGUAGE (contrast with) RT MACHINE LANGUAGE (contrast with) .. High level language USE HIGH ORDER LANGUAGE .. Third generation language USE HIGH ORDER LANGUAGE .. Figure 7. Fragment of a thesaurus entry Figure 6. Example of taxonomy entry ... GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | HilerA et Al. 201 terms. For example: . Or using the glossary notation: . ■■ The rest of the associative relationships (RT) that were included in the thesaurus correspond to the cross-references of the type “Contrast with” and “See also” that appear explicitly in the IEEE glossary. ■■ Neither compound descriptors nor groups of descrip- tors have been implemented because there is no such structure in the glossary. Ontology Ding and Foo state that “ontology promotes standard- ization and reusability of information representation through identifying common and shared knowledge. Ontology adds values to traditional thesauri through deeper semantics in digital objects, both conceptually, relationally and machine understandably.”29 This seman- tic richness may imply deeper hierarchical levels, richer relationships between concepts, the definition of axioms or inference rules, etc. The final stage of the evolutive process is the transfor- mation of the thesaurus created in the previous stage into an ontology. This is achieved through the addition of one or more of the basic elements of semantic complexity that differentiates ontologies from other knowledge represen- tation standards (such as dictionaries, taxonomies, and thesauri). For example: ■■ Semantic relationships between the concepts (classes) of the thesaurus have been added as properties or ontology slots. ■■ Axioms of classes and axioms of properties. These are restriction rules that are declared to be sat- isfied by elements of ontology. 
For example, to establish disjunctive classes ( ), have been defined, and quantification restrictions (existential or universal) and cardinality restrictions in the relation- ships have been implemented as properties. Software based on techniques of linguistic analysis has been developed to facilitate the establishment of the properties and restrictions. This software analyzes the definition text for each of the more than 1,500 glossary terms (in thesaurus format), isolating those words that a thesaurus from classes (rdf:Class or daml:class) and properties (rdf:Property or daml:ObjectProperty). In the first case they proposed five classes: ThesaurusObject, Concept, TopConcept, Term, ScopeNote; and several properties to implement the relations, like hasScope- Note (SN), IsIndicatedBy, PreferredTerm, UsedFor (UF), ConceptRelation, BroaderConcept (BT), NarrowerConcept (NT), TopOfHierarchy (TT) and isRelatedTo (RT). Recently the W3C has developed the SKOS specifica- tion, created to define knowledge organization schemes. In the case of thesauri, SKOS includes specific tags, such as skos:Concept, skos:scopeNote (SN), skos:broader (BT), skos:narrower (NT), skos:related (RT), etc., that are equivalent to those listed in the previous paragraph. Our specification does not make any statement about the formal relationship between the class of SKOS concept schemes and the class of OWL ontologies, which will allow different design patterns to be explored for using SKOS in combination with OWL. Although any of the above-mentioned formats could be used to implement the thesaurus, given that the end- product of our process is to be an ontology, our proposal is that the product to be generated during this phase should have a format compatible with the final ontology and with the previous taxonomy. Therefore a minimal number of changes will be carried out on the product created in the previous step, resulting in a knowledge representation tool similar to a thesaurus. That tool does not need to be modified during the following (final) phase of transformation into an ontology. Nevertheless, if for some reason it is necessary to have the thesaurus in one of the other formats (such as SKOS), it is possible to apply a simple XSLT transformation to the product. Another option would be to integrate a thesaurus ontology, such as the one proposed by Hall,28 with the ontology represent- ing the IEEE glossary. In the thesaurus implementation carried out in our project, the following limitations have been considered: ■■ Only the hierarchical relationships implemented in the taxonomy have been considered. These include relationsips of type “is-a,” that is, generalization rela- tionships or type–subset relationships. Relationships that can be included in the thesaurus marked with TT, BT, and NT, like relations of type “part of” (that is, partative relationships) have not been considered. Instead of considering them as hierarchical relation- ships, the final ontology includes the possibility of describing classes as a union of classes. ■■ The relationships of synonymy (UF and USE) used to model the cross-references in the IEEE glossary (“Syn” and “See,” respectively) were implemented as equiv- alent terms, that is, as equivalent axioms between classes (owl:equivalentClass or daml:sameClassAs), with inverse properties to reflect the preference of the 202 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 match the name of other glossary terms (or a word in the definition text of other glossary terms). 
The isolated words will then be candidates for a relationship between both of them. (Figure 8 shows the candidate properties obtained from the Software Engineering glossary.) The user then has the option of creating relationships with the identified candidate words. The user must indicate, for every relationship to be created, the restriction type that it represents as well as existential or universal quan- tification or cardinality (minimum or maximum). After confirming this information, the program updates the file containing the ontology (OWL or DAML), adding the property to the class that represents the processed term. Figure 9 shows an example of the definition of two prop- erties and its application to the class HighOrderLanguage: a property Express with existential quantification over the class DataStructure to indicate that a language must repre- sent at least one data structure; and a property TranslateTo of universal type to indicate that any high-level language is translated into machine language (MachineLanguage). ■■ Results, Conclusions, and Future Work The existence of ontologies of specific knowledge domains (software engineering in this case) facilitates the process of finding resources about this discipline on the Semantic Web and in digital libraries, as well as the reuse of learn- ing objects of the same domain stored in repositories available on the Web.30 When a new resource is indexed in a library catalog, a new record that conforms to the ontology conceptual data model may be included. It will be necessary to assign its properties according to the concept definition included in the ontology. The user may later execute semantic queries that will be run by the search system that will traverse the ontology to identify the concept in which the user was interested to launch a wider query including the resources indexed under the concept. Ontologies, like the one that has been “evolved,” may also be used in an open way to index and search for resources on the Web. In that case, however, semantic search engines such as Swoogle (http://swoogle.umbc .edu/), are required in place of traditional syntactic search engines, such as Google. The creation of a complete ontology of a knowledge domain is a complex task. In the case of the domain presented in this paper, that of software engineering, although there have been initiatives toward ontology cre- ation that have yielded publications by renowned authors in the field,31 a complete ontology has yet to be created and published. 
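To make this concrete, an ontology entry of the kind figure 9 illustrates might look roughly like the following, reconstructed only from the prose description above. The property and class names Express, TranslateTo, DataStructure, MachineLanguage, and HighOrderLanguage come from the text; the exact markup, the placeholder base URI, and the class name HighLevelLanguage (standing in for the synonym axiom discussed earlier) are assumptions for illustration.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/ontoglose">
  <!-- classes referenced by the restrictions and axioms below -->
  <owl:Class rdf:ID="DataStructure"/>
  <owl:Class rdf:ID="MachineLanguage"/>
  <owl:Class rdf:ID="HighLevelLanguage"/>
  <!-- properties confirmed by the user from the candidate list -->
  <owl:ObjectProperty rdf:ID="Express"/>
  <owl:ObjectProperty rdf:ID="TranslateTo"/>

  <owl:Class rdf:ID="HighOrderLanguage">
    <!-- existential restriction: a high order language expresses at least one data structure -->
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#Express"/>
        <owl:someValuesFrom rdf:resource="#DataStructure"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <!-- universal restriction: whatever a high order language translates to is a machine language -->
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#TranslateTo"/>
        <owl:allValuesFrom rdf:resource="#MachineLanguage"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <!-- synonym carried over from the thesaurus phase as a class equivalence axiom -->
    <owl:equivalentClass rdf:resource="#HighLevelLanguage"/>
  </owl:Class>
</rdf:RDF>

Loaded into an editor such as Protégé, a fragment of this shape would behave as the text describes: the existential restriction requires at least one expressed data structure, while the universal restriction constrains every TranslateTo target to be a machine language.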
This paper has described a process for developing a modest but complete ontology from a glossary of terminology, both in OWL format and DAML+OIL format,
Figure 8. Candidate properties obtained from the linguistic analysis of the Software Engineering glossary (a list of several hundred candidate verbs and verb phrases, such as accept, allocate, convert, execute, generate, translate, and verify).
to each term.) We defined 324 properties or relationships between these classes.
These are based on a semiauto- mated linguistic analysis of the glossary content (for example, Allow, Convert, Execute, OperateWith, Produces, Translate, Transform, Utilize, WorkIn, etc.), which will be refined in future versions. The authors’ aim is to use this ontology, which we have called OntoGLOSE (Ontology GLossary Software Engineering), to unify the vocabulary. OntoGLOSE will be used in a more ambitious project, whose purpose is the development of a complete ontology in software engi- neering from the SWEBOK Guide.32 Although this paper has focused on this ontology, the method that has been described may be used to generate an ontology from any dictionary. The flexibility that OWL permits for ontology description, along with its compat- ibility with other RDF-based metadata languages, makes possible interoperability between ontologies and between ontologies and other controlled vocabularies and allows for the building of merged representations of multiple knowledge domains. These representations may eventu- ally be used in libraries and repositories to index and search for any kind of resource, not only those related to the original field. ■■ Acknowledgments This research is co-funded by the Spanish Ministry of Industry, Tourism and Commerce PROFIT program (grant TSI-020100-2008-23). The authors also want to acknowledge support from the TIFyC research group at the University of Alcala. References and Notes 1. M. Dörr et al., State of the Art in Content Standards (Amster- dam: OntoWeb Consortium, 2001). 2. D. Soergel, “The Rise of Ontologies or the Reinvention of Classification,” Journal of the American Society for Information Science 50, no. 12 (1999): 1119–20; A. Gilchrist, “Thesauri, Tax- onomies and Ontologies—An Etymological Note,” Journal of Documentation 59, no. 1 (2003): 7–18. 3. B. J. Wielinga et al., “From Thesaurus to Ontology,” Pro- ceedings of the 1st International Conference on Knowledge Capture (New York: ACM, 2001): 194–201: J. Qin and S. Paling, “Con- verting a Controlled Vocabulary into an Ontology: The Case of GEM,” Information Research 6 (2001): 2. 4. According to Van Heijst, Schereiber, and Wielinga, ontolo- gies can be classified as terminological ontologies, information ontologies, and knowledge modeling ontologies; terminological ontologies specify the terms that are used to represent knowl- edge in the domain of discourse, and they are in use principally to unify vocabulary in a certain domain. G. Van Heijst, A. T. which is ready to use in the Semantic Web. As described at the opening of this article, our aim has been to create a lightweight ontology as a first version, which will later be improved by including more axioms and relationships that increase its semantic expressiveness. We have tried to make this first version as tailored as possible to the initial glossary, knowing that later versions will be improved by others who might take on the work. Such improvements will increase the ontology’s utility, but will make it a less- faithful representation of the IEEE glossary from which it was derived. The ontology we have developed includes 1,521 classes that correspond to the same number of concepts represented in the IEEE glossary. (Included in this num- ber are the different meanings that the glossary assigns ... Figure 9. Example of ontology entry 204 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 20. W3C, SKOS; Object Management Group, XML Metadata Interchange (XMI), 2003, http://www.omg.org/technology/doc- uments/formal/xmi.htm (accessed Oct. 
5, 2009). 21. UML (Unified Modeling Language) is a standardized general-purpose modeling language (http://www.uml.org). Nowadays, different UML plugins for ontologies’ editors exist. These plugins allow working with UML graphic models. Also, it is possible to realize the UML models with a CASE tool, to export them to XML format, and to transform them to the ontol- ogy format (for example, OWL) using a XSLT sheet, as the one published in D. Gasevic, “UMLtoOWL: Converter from UML to OWL,” http://www.sfu.ca/~dgasevic/projects/UMLtoOWL/ (accessed Oct. 5, 2009). 22. Gilchrist, “Thesauri, Taxonomies and Ontologies.” 23. Soergel, “The Rise of Ontologies or the Reinvention of Classification.” 24. J. M. Morales-del-Castillo et al., “A Semantic Model of Selective Dissemination of Information for Digital Libraries,” Information Technology & Libraries 28, no. 1 (2009): 22–31. 25. International Standards Organization, ISO 2788:1986 Doc- umentation—Guidelines for the Establishment and Develop- ment of Monolingual Thesauri (Geneve: International Standards Organization, 1986). 26. B. M. Matthews, K. Miller, and M. D. Wilson, “A Thesau- rus Interchange Format in RDF,” 2002, http://www.w3c.rl.ac .uk/SWAD/thes_links.htm (accessed Feb. 10, 2009). 27. M. Hall, “CALL Thesaurus Ontology in DAML,” Dynam- ics Research Corporation, 2001, http://orlando.drc.com/daml/ ontology/CALL-Thesaurus (accessed Oct. 5, 2009). 28. Ibid. 29. Y. Ding and S. Foo, “Ontology Research and Develop- ment. Part 1—A Review of Ontology Generation,” Journal of Information Science 28, no. 2 (2002): 123–36. See also B. H. Kwas- nik, “The Role of Classification in Knowledge Representation and Discover,” Library Trends 48 (1999): 22–47. 30. S. Otón et al., “Service Oriented Architecture for the Imple- mentation of Distributed Repositories of Learning Objects,” International Journal of Innovative Computing, Information & Con- trol (2010), forthcoming. 31. O. Mendes and A. Abran, “Software Engineering Ontol- ogy: A Development Methodology,” Metrics News 9 (2004): 68–76; C. Calero, F. Ruiz, and M. Piattini, Ontologies for Software Engineering and Software Technology (Berlin: Springer, 2006). 32. IEEE, Guide to the Software Engineering Body of Knowledge (SWEBOK) (Los Alamitos, Calif.: IEEE Computer Society, 2004), http:// www.swebok.org (accessed Oct. 5, 2009). Schereiber, and B. J. Wielinga, “Using Explicit Ontologies in KBS Development,” International Journal of Human & Computer Studies 46, no. 2/3 (1996): 183–292. 5. R. Neches et al., “Enabling Technology for Knowledge Sharing,” AI Magazine 12, no. 3 (1991): 36–56. 6. O. Corcho, F. Fernández-López, and A. Gómez-Pérez, “Methodologies, Tools and Languages for Buildings Ontologies. Where Is Their Meeting Point?” Data & Knowledge Engineering 46, no. 1 (2003): 41–64. 7. Intitute of Electrical and Electronics Engineers (IEEE), IEEE Std 610.12-1990(R2002): IEEE Standard Glossary of Software Engineering Terminology (Reaffirmed 2002) (New York: IEEE, 2002). 8. J. Krause, “Semantic Heterogeneity: Comparing New Semantic Web Approaches with those of Digital Libraries,” Library Review 57, no. 3 (2008): 235–48. 9. T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American 284, no. 5 (2001): 34–43. 10. World Wide Web Consortium (W3C), Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C Recommen- dation 10 February 2004, http://www.w3.org/TR/rdf-concepts/ (accessed Oct. 5, 2009). 11. 
World Wide Web Consortium (W3C), Web Ontology Lan- guage (OWL), 2004, http://www.w3.org/2004/OWL (accessed Oct. 5, 2009). 12. World Wide Web Consortium (W3C), SKOS Simple Knowledge Organization System, 2009, http://www.w3.org/ TR/2009/REC-skos-reference-20090818/ (accessed Oct. 5, 2009). 13. M. M. Yee, “Can Bibliographic Data be Put Directly onto the Semantic Web?” Information Technology & Libraries 28, no. 2 (2009): 55-80. 14. L. F. Spiteri, “The Structure and Form of Folksonomy Tags: The Road to the Public Library Catalog,” Information Technology & Libraries 26, no. 3 (2007): 13–25. 15. Corcho, Fernández-López, and Gómez-Pérez, “Method- ologies, Tools and Languages for Buildings Ontologies.” 16. IEEE, IEEE Std 610.12-1990(R2002). 17. N. F. Noy and D. L. McGuinness, “Ontology Develop- ment 101: A Guide to Creating Your First Ontology,” 2001, Stan- ford University, http://www-ksl.stanford.edu/people/dlm/ papers/ontology-tutorial-noy-mcguinness.pdf (accessed Sept 10, 2010). 18. D. Baader et al., The Description Logic Handbook (Cam- bridge: Cambridge Univ. Pr., 2003). 19. World Wide Web Consortium, DAML+OIL Reference Description, 2001, http://www.w3.org/TR/daml+oil-reference (accessed Oct. 5, 2009); W3C, OWL. 3131 ---- BriDGiNG tHe GAP: selF-DirecteD stAFF tecHNOlOGY trAiNiNG | QuiNNeY, sMitH, AND GAlBrAitH 205 Kayla L. Quinney, Sara D. Smith, and Quinn Galbraith Bridging the Gap: Self-Directed Staff Technology Training of HBLL patrons. As anticipated, results indicated that students frequently use text messages, social networks, blogs, etc., while fewer staff members use these technolo- gies. For example, 42 percent of the students reported that they write a blog, while only 26 percent of staff and fac- ulty do so. Also, 74 percent of the students and only 30 percent of staff and faculty indicated that they belonged to a social network. After concluding that staff and faculty were not as connected as their student patrons are to tech- nology, library administration developed the Technology Challenge to help close this gap. The Technology Challenge was a self-directed training program requiring participants to explore new technol- ogy on their own by spending at least fifteen minutes each day learning new technology skills. This program was successful in promoting lifelong learning by teach- ing technology applicable to the work and home lives of HBLL employees. We will first discuss literature that shows how technology training can help academic librar- ians connect with student patrons, and then we will describe the Technology Challenge and demonstrate how it aligns with the principles of self-directed learning. The training will be evaluated by an analysis of the results of two surveys given to participants before and after the Technology Challenge was implemented. 
■■ Library 2.0 and “Librarian 2.0” HBLL wasn’t the first to notice the gap between librar- ians and students, McDonald and Thomas noted that “Gaps have materialized,” and library technology does not always “provide certain services, resources, or possibilities expected by emerging user populations like the millennial generation.”1 College students, who grew up with technol- ogy, are “digital natives,” while librarians, many having learned technology later in life, are “digital immigrants.”2 The “digital natives” belong to the Millennial Generation, described by Shish and Allen as a generation of “learners raised on and confirmed experts in the latest, fastest, cool- est, greatest, newest electronic technologies.”3 According to Sweeny, when students use libraries, they expect the same “flexibility, geographic independence, speed of response, time shifting, interactivity, multitasking, and time savings” provided by the technology they use daily.4 Students are Undergraduates, as members of the Millennial Generation, are proficient in Web 2.0 technology and expect to apply these technologies to their coursework—including schol- arly research. To remain relevant, academic libraries need to provide the technology that student patrons expect, and academic librarians need to learn and use these tech- nologies themselves. Because leaders at the Harold B. Lee Library of Brigham Young University (HBLL) perceived a gap in technology use between students and their staff and faculty, they developed and implemented the Technology Challenge, a self-directed technology training program that rewarded employees for exploring technology daily. The purpose of this paper is to examine the Technology Challenge through an analysis of results of surveys given to participants before and after the Technology Challenge was implemented. The program will also be evaluated in terms of the adult learning theories of andragogy and self- directed learning. HBLL found that a self-directed approach fosters technology skills that librarians need to best serve students. In addition, it promotes lifelong learning hab- its to keep abreast of emerging technologies. This paper offers some insights and methods that could be applied in other libraries, the most valuable of which is the use of self-directed and andragogical training methods to help academic libraries better integrate modern technologies. L eaders at the Harold B. Lee Library of Brigham Young University (HBLL) began to suspect a need for technology training when employees were asked during a meeting if they owned an iPod or MP3 player. Out of the twenty attendees, only two raised their hands—one of whom worked for IT. Perceiving a technol- ogy gap between HBLL employees and student patrons, library leaders began investigating how they could help faculty and staff become more proficient with the tech- nologies that student patrons use daily. To best serve student patrons, academic librarians need to be proficient with the technologies that student patrons expect. HBLL found that a self-directed learning approach to staff tech- nology training not only fosters technology skills, but also promotes lifelong learning habits. To further examine the technology gap between librar- ians and students, the HBLL staff, faculty, and student employees were given a survey designed to explore generational differences in media and technology use. Student employees were surveyed as representatives of the larger student body, which composes the majority Kayla l. 
Quinney (quinster27@gmail.com) is research Spe- cialist, sara D. smith (saradsmith@gmail.com) is research Specialist, and Quinn Galbraith (quinn_galbraith@byu.edu) is library human resource Training and Development Manager, Brigham young university library, Provo, utah. 206 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 2.0,” a program that “focuses on self-exploration and encourages staff to learn about new technologies on their own.”24 Learning 2.0 encouraged library staff to explore Web 2.0 tools by completing twenty-three exercises involving new technologies. PLCMC’s program has been replicated by more than 250 libraries and organizations worldwide,25 and several libraries have written about their experiences, including academic26 and public libraries.27 These programs—and the Technology Challenge implemented by HBLL—integrate the theories of adult learning. In the 1960s and 1970s, Malcolm Knowles intro- duced the theory of andragogy to describe the way adults learn.28 Knowles described adults as learners who (1) are self-directed, (2) use their experiences as a resource for learning, (3) learn more readily when they experience a need to know, (4) seek immediate application of knowl- edge, and (5) are best motivated by internal rather than external factors.29 The theory and practice of self-directed learning grew out of the first learning characteristic and assumes that adults prefer self-direction in determining and achieving learning goals, and therefore learners exer- cise independence in determining how and what they learn.30 These theories have had a considerable effect on adult education practice31 and employee development programs.32 When adults participate in trainings that align with the assumptions of andragogy, they are more likely to retain and apply what they have learned.33 ■■ The Technology Challenge HBLL’s Technology Challenge is similar to Learning 2.0 in that it encourages self-directed exploration of Web 2.0 technologies, but it differs in that participants were even more self-directed in exploration and that they were asked to participate daily. These features encouraged more self-directed learning in areas of participant interest as well as habit formation. It is not our purpose to critique Learning 2.0, but to provide some evidence and analysis to demonstrate the success of hands-on, self-directed training approaches and to suggest other ways for librar- ies to apply self-directed learning to technology training. The Technology Challenge was implemented from June 2007 to January 2008. HBLL staff included 175 full-time employees, 96 of whom participated in the challenge. (The student employees were not involved.) Participants were asked to spend fifteen minutes each day learning a new technology skill. HBLL leaders used rewards to make the program enjoyable and to motivate participation: For each minute spent learning technology, participants earned one point, and when one thousand points were earned, the participant would receive a gift certificate to the campus bookstore. Staff and faculty participated and tracked their progress through an online masters of “informal learning”; that is, they are accus- tomed to easily and quickly gathering information relevant to their lives from the internet and from friends. Shish and Allen claimed that Millennials prefer “interactive, hyper-linked multimedia over the traditional static, text- oriented printed items. 
They want a sense of control; they need experiential and collaborative approaches rather than formal, librarian-guided, library-centric services."5 These students arrive on campus expecting "to handle the challenges of scholarly research" using similar methods and technologies.6 Interactive technologies such as blogs, wikis, streaming media applications, and social networks are referred to as "Web 2.0." Abram argued that Web 2.0 technology "could be useful in an enterprise, institutional research, or community environment, and could be driven or introduced by the library."7 "Library 2.0" is a concept referring to a library's integration of these technologies; it is essentially the use of "Web 2.0 opportunities in a library environment."8 Maness described Library 2.0 as user-centered, social, innovative, and a provider of multimedia experiences.9 It is a community that "blurs the line between librarian and patron, creator and consumer, authority and novice."10 Libraries have been using Web 2.0 technology such as blogs,11 wikis,12 and social networks13 to better serve and connect with patrons. Blogs allow libraries to "provide news, information and links to internet resources,"14 and wikis create online study groups15 and "build a shared knowledge repository."16 Social networks can be particularly useful in connecting with undergraduate students: Millennials use technology to collaborate and make collective decisions,17 and libraries can capitalize on this tendency by using social networks, which for students would mean, as Bates argues, "an informational equivalent of the reliance on one's Facebook friends."18 Students expect Library 2.0—and as libraries integrate new technologies, the staff and faculty of academic libraries need to become "Librarian 2.0." According to Abram, Librarian 2.0 understands users and their needs "in terms of their goals and aspirations, workflows, social and content needs, and more. Librarian 2.0 is where the user is, when the user is there."19 The modern library user "needs the experience of the Web . . . to learn and succeed,"20 and the modern librarian can help patrons transfer technology skills to information seeking. Librarian 2.0 is prepared to help patrons familiar with Web 2.0 to "leverage these [technologies] to make a difference in reaching their goals."21 Therefore staff and faculty "must become adept at key learning technologies themselves."22 Stephen Abram asked, "Are the expectations of our users increasing faster than our ability to adapt?"23 and this same concern motivated HBLL and other institutions to initiate staff technology training programs. The Public Library of Charlotte and Mecklenburg County of North Carolina (PLCMC) developed "Learning their ability to learn and use technology. To be eligible to receive the gift card, participants were required to take this exit survey. Sixty-four participants, all of whom had met or exceeded the thousand-point goal, chose to complete this survey, so the results of this survey represent the experiences of 66 percent of the participants. Of course, if those who had not completed the Technology Challenge had taken the survey, the results may have been different, but the results do show how those who chose to actively participate reacted to this training program.
The survey included both quantifiable and open-ended questions (see appendix B for survey results and a list of the open-ended questions). The survey results, along with an analysis of the structure of the Challenge itself, demonstrates that the program aligns with Knowles’s five principles of andragogy to successfully help employees develop both technology skills and learning habits. self-direction The Technology Challenge was self-directed because it gave participants the flexibility to select which tasks and challenges they would complete. Garrison wrote that in a self-directed program, “learners should be provided with choices of how they wish to proactively carry out the learning process. Material resources should be available, approaches suggested, flexible pacing accommodated, and questioning and feedback provided when needed.”34 HBLL provided a variety of challenges and training sessions related to various technologies. Technology Challenge participants were given the independence to choose which learning methods to use, including which training sessions to attend and which challenges to complete. According to the exit survey, the most popular training methods were small, instructor-led groups, followed by self-learning through reading books and articles. Group training sessions were organized by HBLL leadership and addressed topics such as Microsoft Office, RSS feeds, computer organization skills, and multimedia software. Other learning methods included web tutorials, DVDs, large group discussions, and one-on-one tutoring. The group training classes preferred by HBLL employees may be considered more teacher-directed than self-directed, but the Technology Challenge was self-directed as a whole in that learners were given the opportunity to choose what they learned and how they learned it. The structure of the Technology Challenge allowed participants to set their own pace. Staff and faculty were given several months to complete the challenge and were responsible to pace themselves. On the exit survey, one participant commented: “If I didn’t get anything done one week, there wasn’t any pressure.” Another enjoyed flexibility in deciding when and where to complete the tasks: “I liked being able to do the challenge anywhere. When I had a few minutes between appointments, classes, board game called “Techopoly.” Participation was voluntary, and staff and faculty were free to choose which tasks and challenges they would complete. Tasks fell into one of four categories: software, hardware, library technology, and the internet. Participants were required to complete one hundred points in each category, but beyond that, were able to decide how to spend their time. Examples of tasks included attending workshops, exploring online tutori- als, and reading books or articles about a relevant topic. For each hundred points earned, participants could com- plete a mini-challenge, which included reading blogs or e-books, listening to podcasts, or creating a photo CD (see appendix A for a more complete list). Participants who completed fifteen out of twenty possible challenges were entered into a drawing for another gift certificate. Before beginning the Challenge, all participants were surveyed about their current use of technology. On this survey, they indicated that they were most uncomfortable with blogs, wikis, image editors, and music players. These results provided a focus for Technology Challenge trainings and mini-challenges. 
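The mechanics described above reduce to simple bookkeeping: one point per minute of learning, a thousand-point goal, a hundred-point minimum in each of the four categories, and a mini-challenge available for every hundred points earned. The following minimal sketch models that scoring logic for illustration only; it is a hypothetical reconstruction, not HBLL's actual Techopoly application, and all class and function names are ours.

```python
# Minimal sketch of the Techopoly-style scoring rules described above.
# Hypothetical reconstruction for illustration, not HBLL's actual system.
from collections import defaultdict

CATEGORIES = {"software", "hardware", "library technology", "internet"}
GOAL_POINTS = 1000          # gift-certificate threshold
CATEGORY_MINIMUM = 100      # points required in each category
POINTS_PER_MINUTE = 1       # one point per minute of learning

class ChallengeTracker:
    def __init__(self):
        self.points = defaultdict(int)

    def log_activity(self, category: str, minutes: int, multiplier: float = 1.0) -> None:
        """Record a learning activity; `multiplier` models the doubled
        points later awarded for sponsored training sessions."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.points[category] += int(minutes * POINTS_PER_MINUTE * multiplier)

    @property
    def total(self) -> int:
        return sum(self.points.values())

    def mini_challenges_unlocked(self) -> int:
        # One mini-challenge could be completed for each hundred points earned.
        return self.total // 100

    def goal_met(self) -> bool:
        return (self.total >= GOAL_POINTS and
                all(self.points[c] >= CATEGORY_MINIMUM for c in CATEGORIES))

tracker = ChallengeTracker()
tracker.log_activity("internet", minutes=15)                 # daily 15-minute exploration
tracker.log_activity("software", minutes=60, multiplier=2.0) # sponsored class, double points
print(tracker.total, tracker.mini_challenges_unlocked(), tracker.goal_met())
```

The optional multiplier argument anticipates the doubled points later awarded for attending sponsored classes, which the authors note contributed to "point inflation."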
While not all of these technologies may apply directly to their jobs, 60 percent indicated that they were interested in learning them. Forty-four percent reported that time was the greatest impediment to learning new technology; therefore the daily fifteen-minute requirement was introduced with the hope that it was small enough to be a good incentive to participate but substantial enough to promote habit formation and allow employees enough time to familiarize themselves with the technology. Although some productivity may have been lost due to the time requirement (especially in cases where participants may have spent more than the required time), library leaders felt that technology training was an investment in HBLL employees and that, at least for a few months, it was worth any potential loss in productivity. Because participants could choose how and when they learned technology, they could incorporate the Challenge into their work schedules according to their own needs, interests, and time constraints. Of ninety-six participants, sixty-six reached or exceeded the thousand-point goal, and eight participants earned more than two thousand points. Ten participants earned between five hundred and one thousand points, and another six earned between one hundred and five hundred. Although not all participants completed the Challenge, most were involved to some extent in learning technology during this time. ■■ The Technology Challenge and Adult Learning After finishing the Challenge, participants took an exit survey to evaluate the experience and report changes in were willing, even excited, to learn technology skills: 37 percent "agreed" and 60 percent "strongly agreed" that they were interested in learning new technology. Their desire to learn was cultivated by the survey itself, which helped them recognize and focus on this interest, and the Challenge provided a way for employees to channel their desire to learn technology. Immediate Application Learners need to see an opportunity for immediate application of their knowledge: Ota et al. explained that "they want to learn what will help them perform tasks or deal with problems they confront in everyday situations and those presented in the context of application to real life."39 Because of the need for immediate application, the Technology Challenge encouraged staff and faculty to learn technology skills directly related to their jobs—as well as technology that is applicable to their personal or home lives. HBLL leaders hoped that as staff became more comfortable with technology in general, they would be motivated to incorporate more complex technologies into their work. Here is one example of how the Technology Challenge catered to adult learners' need to apply what they learn: Before designing the Challenge, HBLL held a training session to teach employees the basics of Photoshop. Even though attendees were on the clock, the turnout was discouraging. Library leaders knew they needed to try something new. In the revamped Photoshop workshop that was offered as part of the Technology Challenge, attendees brought family photos or film and learned how to edit and experiment with their photos and burn DVD copies. This time, the class was full: the same computer program that before drew only a few people was now exciting and useful.
Focusing on employees’ personal interests in learning new software, instead of just on teaching the software, better motivated staff and faculty to attend the training. Motivation As stated by Ota et al., adults are motivated by external factors but are usually more motivated by internal fac- tors: “Adults are responsive to some external motivators (e.g., better job, higher salaries), but the most potent motivators are internal (e.g., desire for increased job satisfaction, self-esteem).”40 On the entrance survey, par- ticipants were given the opportunity to comment on their reasons for participating in the Challenge. The gift card, an example of an external motivation, was frequently cited as an important motivation. But many also com- mented on more internal motivations: “It’s important to my job to stay proficient in new technologies and I’d like to stay current”; “I feel that I need to be up-to-date or meetings I could complete some of the challenges.” Employees could also determine how much or how little of the Challenge they wanted to complete: many reached well over the thousand-point goal, while others fell a little short. Participants began at different skill levels, and thus could use the time and resources allotted to explore basic or more advanced topics according to their needs and interests. Garrison had noted the importance of providing resources and feedback in self-directed learning.35 The Techopoly website provided resources (such as specific blogs or websites to visit) and instructions on how to use and access technology within the library. HBLL also hired a student to assist staff and faculty one-on-one by explain- ing answers to their questions about technology and teaching other skills he thought may be relevant to their initial problem. The entrance and exit surveys provided opportunities for self-reflection and self-evaluation by questioning the participants’ use of technology before the Challenge and asking them to evaluate their proficiency in technology after the Challenge. use of experience The use of experience as a source of learning is impor- tant to adult learners: “The richest resource for learning resides in adults themselves; therefore, tapping into their experiences through experiential techniques (discussions, simulations, problem-solving activities, or case methods) is beneficial.”36 The small-group discussions and one-on- one problem solving made available to HBLL employees certainly fall into these categories. Small-group classes are one of the best ways to encourage adults to share and validate their experiences, and doing so increases retention and application of new information.37 The trainings and challenges encouraged participants to make use of their work and personal experiences by connecting the topic to work or home application. For example, one session discussed how blogs relate to libraries, and another helped participants learn Adobe Photoshop skills by editing per- sonal photographs. Need to Know Adult learners are more successful when they desire and recognize a need for new knowledge or skills. The role of a trainer is to help learners recognize this “need to know” by “mak[ing] a case for the value of learning.”38 HBLL used the generational survey and presurvey to develop a need and desire to learn. 
The results of the generational survey, which demonstrated a gap in technology use between librarians and students, were presented and discussed at a meeting held before the initiation of the Technology Challenge to help staff and faculty under- stand why it was important to learn 2.0 technology. Results of the presurvey showed that staff and faculty BriDGiNG tHe GAP: selF-DirecteD stAFF tecHNOlOGY trAiNiNG | QuiNNeY, sMitH, AND GAlBrAitH 209 statistical reports or working with colleagues from other libraries.” ■■ “I learned how to set up a server that I now maintain on a semi-regular basis. I learned a lot about SFX and have learned some Perl programming language as well that I use in my job daily as I maintain SFX.” ■■ “The new OCLC client was probably the most sig- nificant. I spent a couple of days in an online class learning to customize the client, and I use what I learned there every single day.” ■■ “I use Google docs frequently for one of the projects I am now working on.” Participants also indicated weaknesses in the Technology Challenge. Almost 20 percent of those who completed the Challenge reported that it was too easy. This is a valid point—the Challenge was designed to be easy so as not to intimidate staff or faculty who are less familiar with technology. It is important to note that these comments came from those who completed the Challenge—other participants may have found the tasks and mini-challenges more difficult. The goal was to provide an introduction to Web 2.0, not to train experts. However, a greater range of tasks and challenges could be provided in the future to allow staff and faculty more self- direction in selecting goals relevant to their experience. To encourage staff and faculty to attend sponsored training sessions as part of the Challenge, HBLL leaders decided to double points for time spent at these classes. This certainly encouraged participation, but it lead to “point inflation”—perhaps being one reason why so many reported that the Challenge was too easy to com- plete. The doubling of points may also have encouraged staff to spend more time in workshops and less time practicing or applying the skills learned. A possible solu- tion would be offering 1.5 points, or offering a set number of points for attendance instead of counting per minute. It also may have been informative for purpose of analy- sis to have surveyed both those who did not complete the Challenge as well as those who chose not to participate. Because the presurvey indicated that time was the biggest deterrent to learning and incorporating new technology, we assume that many of those who did not participate or who did not complete the challenge felt that they did not have enough time to do so. There is definitely potential for further investigation into why library staff would not want to participate in a technology training program, what would motivate them to participate, and how we could redesign the Technology Challenge to make it more appeal- ing to all of our staff and faculty. Several library employees have requested that HBLL sponsor another Technology Challenge program. Because of the success of the first and because of continuing inter- est in technology training, we plan to do so in the future. 
We will make changes and adjustments according to the on technology in order to effectively help patrons”; “to identify and become comfortable with new technologies that will make my work more efficient, more presentable, and more accurate.” ■■ Lifelong Learning Staff and faculty responded favorably to the training. None of the participants who took the exit survey disliked the challenge; 34 percent even reported that they strongly liked it. Ninety-five percent reported that they enjoyed the pro- cess of learning new technology, and 100 percent reported that they were willing to participate in another technology challenge—thus suggesting success in the goal of encour- aging lifelong technology learning. The exit survey results indicate that after completing the challenge, staff and faculty are more motivated to continue learning—which is exactly what HBLL leaders hoped to accomplish. Eighty-nine percent of the partici- pants reported that their desire to learn new technology had increased, and 69 percent reported that they are now able to learn new technology faster after completing the Technology Challenge. Ninety-seven percent claimed that they were more likely to incorporate new technology into home or work use, and 98 percent said they recognized the importance of staying on top of emerging technolo- gies. Participants commented that the training increased their desire to learn. One observed, “I often need a chal- lenge to get motivated to do something new,” and another participant reported feeling “a little more comfortable trying new things out.” The exit survey asked participants to indicate how they now use technology. One employee keeps a blog for her daughter’s dance company, and another said, “I’m on my way to a full-blown GoogleReader addiction.” Another participant applied these new skills at home: “I’m not so afraid of exploring the computer and other software programs. I even recently bought a computer for my own personal use at home.” The Technology Challenge was also successful in helping employees better serve patrons: “I can now better direct patrons to services that I would otherwise not have known about, such as streaming audio and video and e-book read- ers.” Another participant felt better connected to student patrons: “I understand the students better and the things they use on a daily basis.” Staff and faculty also found their new skills applicable to work beyond patron interaction, and many listed spe- cific examples of how they now use technology at work: ■■ “I have attended a few Microsoft Office classes that have helped me tremendously in doing my work more efficiently, whether it is for preparing monthly 210 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 2. Richard T. Sweeny, “Reinventing Library Buildings and Services for the Millennial Generation,” Library Administration & Management 19, no. 4 (2005): 170. 3. Win Shish and Martha Allen, “Working with Generation- D: Adopting and Adapting to Cultural Learning and Change,” Library Management 28, no. 1/2 (2006): 89. 4. Sweeney, “Reinventing Library Buildings,” 170. 5. Shish and Allen, “Working with Generation-D,” 96. 6. Ibid., 98. 7. Stephen Abram, “Social Libraries: The Librarian 2.0 Pheonomenon,” Library Resources & Technical Services 52, no. 2 (2008): 21. 8. Ibid. 9. Jack M. Maness “Library 2.0 Theory: Web 2.0 and Its Implications for Libraries,” Webology 3, no. 2 (2006), http:// www.webology.ir/2006/v3n2/a25.html?q=link:webology.ir/ (accessed Jan. 8, 2010). 10. Ibid., under “Blogs and Wikis,” para. 4. 
11. Laurel Ann Clyde, “Library Weblogs,” Library Manage- ment 22, no. 4/5 (2004): 183–89; Maness, “Library 2.0. Theory.” 12. See Matthew M. Bejune, “Wikis in Libraries,” Information Technology & Libraries 26, no. 3 (2007): 26–38 ; Darlene Fichter, “The Many Forms of E-Collaboration: Blogs, Wikis, Portals, Groupware, Discussion Boards, and Instant Messaging,” Online: Exploring Technology & Resources for Information Professionals 29, no. 4 (2005): 48–50; Maness, “Library 2.0 Theory.” 13. Mary Ellen Bates, “Can I Facebook That?” Online: Explor- ing Technology and Resources for Information Professionals 31, no. 5 (2007): 64; Sarah Elizabeth Miller and Lauren A. Jensen, “Con- necting and Communicating with Students on Facebook,” Com- puters in Libraries 27, no. 8 (2007): 18–22. 14. Clyde, “Library Weblogs,” 183. 15. Maness, “Library 2.0 Theory.” 16. Fichter, “Many Forms of E-Collaboration,” 50. 17. Sweeney, “Reinventing Library Buildings”; Bates, “Can I Facebook That?” 18. Bates, “Can I Facebook That?” 64. 19. Abram, “Social Libraries,” 21. 20. Ibid., 20. 21. Ibid., 21. 22. Shish and Allen, “Working with Generation-D,” 90. 23. Abram, “Social Libraries,” 20. 24. Helene Blowers and Lori Reed, “The C’s of Our Sea Change: Plans for Training Staff, from Core Competencies to Learning 2.0,” Computers in Libraries 27, no. 2 (2007): 11. 25. Helene Blowers, Learning 2.0, 2007, http://plcmclearning .blogspot.com (accessed Jan. 8, 2010). 26. For examples, see Ilana Kingsley and Karen Jensen, “Learning 2.0: A Tool for Staff training at the University of Alaska Fairbanks Rasmuson,” The Electronic Journal of Academic & Special Librarianship 12, no. 1 (2009), http://southernlibrari- anship.icaap.org/content/v10n01/kingsley_i01.html (accessed Jan. 8, 2010); Beverly Simmons, “Learning (2.0) to be a Social Library,” Tennessee Libraries 58, no. 2 (2008): 1–8. 27. For examples, see Christine Mackenzie, “Creating our Future: Workforce Planning for Library 2.0 and Beyond,” Aus- tralasian Public Libraries & Information Services 20, no. 3 (2007): 118–24; Liisa Sjoblom, “Embracing Technology: The Deschutes Public Library’s Learning 2.0 Program,” OLA Quarterly 14, no. 2 (2007): 2–6; Hui-Lan Titango and Gail L. Mason, “Learning Library 2.0: 23 Things @ SCPL,” Library Management 30, no. 1/2 feedback we have received, and continue to evaluate it and improve it based on survey results. The purpose of a second Technology Challenge would be to reinforce what staff and faculty have already learned, to teach new skills, and to help participants remember the importance of life- long learning when it comes to technology. ■■ Conclusion HBLL’s self-directed Technology Challenge was success- ful in teaching technology skills and in promoting lifelong learning—as well as in fostering the development of Librarian 2.0. Abram listed key characteristics and duties of Librarian 2.0, including learning the tools of Web 2.0; connecting people, technology, and information; embrac- ing “nontextual information and the power of pictures, moving images, sight, and sound”; using the latest tools of communication; and understanding the “emerging roles and impacts of the blogosphere, Web syndicasphere, and wikisphere.”41 Survey results indicated that HBLL employees are on their way to developing these attri- butes, and that they are better equipped with the skills and tools to keep learning. Like PLCMC’s Learning 2.0, the Technology Challenge could be replicated in libraries of various sizes. 
Obviously an exact replication would not be feasible or appropriate for every library—but the basic ideas, such as the prin- ciples of andragogy and self-directed learning could be incorporated, as well as the daily time requirement or the use of surveys to determine weaknesses or interests in technology skills. Whatever the case, there is a great need for library staff and faculty to learn emerging technolo- gies and to keep learning them as technology continues to change and advance. But the most important benefit of a self-directed train- ing program focusing on lifelong learning is effective employee development. The goal of any training pro- gram is to increase work productivity—and as employees become more productive and efficient, they are happier and more excited about their jobs. On the exit survey, one participant expressed initially feeling hesitant about the Technology Challenge and feared that it would increase an already hefty workload. However, once the Challenge began, the participant enjoyed “taking the time to learn about new things. I feel I am a better person/librarian because of it.” And that, ultimately, is the goal—not only to create better librarians, but also to create better people. Notes 1. Robert H. McDonald and Chuck Thomas, “Disconnects between Library Culture and Millennial Generation Values,” Educause Quarterly 29, no. 4 (2006): 4. BriDGiNG tHe GAP: selF-DirecteD stAFF tecHNOlOGY trAiNiNG | QuiNNeY, sMitH, AND GAlBrAitH 211 ers,” Journal of Extension 33 (2005), http://www.joe.org/ joe/2006december/tt5.php (accessed Jan. 8, 2010); Wayne G. West, “Group Learning in the Workplace,” New Directions for Adult and Continuing Education 71 (1996): 51–60. 33. Ota et al., “Needs of Learners.” 34. D. R. Garrison, “Self-directed Learning: Toward a Com- prehensive Model,” Adult Education Quarterly 48 (1997): 22. 35. Ibid. 36. Ota et al., “Needs of Learners,” under “Needs of the Adult Learner,” para. 4. 37. Ota et al., “Needs of Learners”; West, “Group Learning.” 38. Ota et al., “Needs of Learners,” under “Needs of the Adult Learner,” para. 2. 39. Ibid., para. 6. 40. Ibid., para 7. 41. Abram, “Social Library,” 21–22. (2009): 44–56; Illinois Library Association, “Continuous Improve- ment: The Transformation of Staff Development,” The Illinois Library Association Reporter 26, no. 2 (2008): 4–7; and Thomas Simpson, “Keeping up with Technology: Orange County Library Embraces 2.0,” Florida Libraries 20, no. 2 (2007): 8–10. 28. Sharan B. Merriam, “Andragogy and Self-Directed Learn- ing: Pillars of Adult Learning Theory,” New Directions for Adult & Continuing Education 89 (2001): 3–13. 29. Malcolm Shepherd Knowles, The Modern Practice of Adult Education: From Pedagogy to Andragogy (New York: Cambridge Books, 1980). 30. Jovita Ross-Gordon, “Adult Learners in the Classroom,” New Directions for Student Services 102 (2003): 43–52. 31. Merriam, “Pillars of Adult Learning”; Ross-Gordon, “Adult Learners.” 32. Carrie Ota et al., “Training and the Needs of Learn- Appendix A. Technology Challenge “Mini Challenges” Technology Challenge participants had the opportunity to complete fifteen of twenty mini-challenges to become eligible to win a second gift certificate to the campus bookstore. Below are some examples of technology mini-challenges: 1. Read a library or a technology blog 2. Listen to a library podcast 3. Check out a book from Circulation’s new self-checkout machine 4. Complete an online copyright tutorial 5. Catalog some books on LibraryThing 6. 
Read an e-book with Sony eBook Reader or Amazon Kindle 7. Scan photos or copy them from a digital camera and then burn them onto a CD 8. Backup data 9. Change computer settings 10. Schedule meetings with Microsoft Outlook 11. Create a page or comment on a page on the library’s intranet wiki 12. Use one of the library’s music databases to listen to music 13. Use WordPress or Blogger to create a blog 14. Post a photo on a blog 15. Use Google Reader or Bloglines to subscribe to a blog or news page using RSS 16. Reserve and check out a digital camera, camcorder, DVR, or slide scanner from the multimedia lab and create some- thing with it 17. Convert media on the analog media racks 18. Edit a family photograph using photo-editing software 19. Attend a class in the multimedia lab 20. Make a phone call using Skype 212 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 How did you like the Technology Challenge overall? Answer Response Percent Strongly disliked 0 0 Disliked 0 0 Liked 42 66 Strongly liked 22 34 How did you like the reporting system used for the Technology Challenge (the Techopoly Game)? Answer Response Percent Strongly disliked 0 0 Disliked 4 6 Liked 41 64 Strongly liked 19 30 Would you participate in another Technology Challenge? Answer Response Percent Yes 64 100 No 0 0 What percentage of time did you spend using the following methods of learning? (participants were asked to allocate 100 points among the categories) Category Average Response Instructor-led large group 15.3 Instructor-led small group 27 One-on-one instruction 3.5 Web tutorial 12.8 Self-learning (books, articles) 27.4 DVDs .5 Small group discussion 2.7 Large group discussion 2.6 Other 6.7 I am more likely to incorporate new technology into my home or work life. Answer Response Percent Strongly disagree 0 0 Disagree 2 3 Agree 49 77 Strongly agree 13 20 I enjoy the process of making new technology a part of my work or home life. Answer Response Percent Strongly disagree 0 0 Disagree 2 3 Agree 37 58 Strongly agree 24 38 After completing the Technology Challenge, my desire to learn new technologies has increased. Answer Response Percent Strongly disagree 0 0 Disagree 7 11 Agree 44 69 Strongly agree 13 20 I feel I now learn new technologies more quickly. Answer Response Percent Strongly disagree 0 0 Disagree 20 31 Agree 39 61 Strongly agree 5 8 Appendix B. Exit Survey Results BriDGiNG tHe GAP: selF-DirecteD stAFF tecHNOlOGY trAiNiNG | QuiNNeY, sMitH, AND GAlBrAitH 213 Open-Ended Questions ■■ What would you change about the technology chal- lenge? ■■ What did you like about the Technology Challenge? ■■ What technologies were you introduced to during the Technology Challenge that you now use on a regular basis? ■■ In what was do you feel the Technology Challenge has benefited you the most? How much more proficient do you feel in . . . Category Not any Somewhat A lot Hardware 31% 64% 5% Software 8% 72% 20% Internet resources 17% 68% 15% Library technology 23% 64% 13% In order for you to succeed in your job, how important is keeping abreast of new technologies to you? 
Answer Response Percent Not important 1 2 Important 22 34 Very important 41 64 3132 ---- Margaret Brown-Sica, Jeffrey Beall, and Nina McHale Next-Generation Library Catalogs and the Problem of Slow Response Time and librarians will benefit from knowing what typical and acceptable response times are in online catalogs, and this information will assist in the design and evaluation of library discovery systems. This study also looks at benchmarks in response time and defines what is unacceptable and why. When advanced features and content in library catalogs increase response time to the extent that users become disaffected and use the catalog less, NextGen catalogs represent a step backward, not forward. In August 2009, the Auraria Library launched an instance of the WorldCat Local product from OCLC, dubbed WorldCat@Auraria. The Library's traditional catalog—named Skyline and running on the Innovative Interfaces platform—still runs concurrently with WorldCat@Auraria. Because WorldCat Local currently lacks a library circulation module that the Library was able to use, the legacy catalog is still required for its circulation functionality. In addition, Skyline contains MARC records from the Serials Solutions 360 MARC product. Since many of these records are not yet available in the OCLC WorldCat database, they are being maintained in the legacy catalog to enable access to the Library's extensive collection of online journals. Almost immediately upon implementation of WorldCat Local, many Library staff began to express concern about the product's slow response time. They bemoaned its slowness both at the reference desk and during library instruction sessions. Few of the discussions of the product's slow response time evaluated this weakness in the context of its advanced features. Several of the reference and instruction librarians even stated that they refused to use it any longer and that they were not recommending it to students and faculty. Indeed, many stated that they would only use the legacy Skyline catalog from then on. Therefore we decided to analyze the product's response time in relation to the legacy catalog. We also decided to further our study by examining response time in library catalogs in general, including several different online catalog products from different vendors. ■■ Response Time The term response time can mean different things in different contexts. Here we use it to mean the time it takes for all files that constitute a single webpage (in the case of the testing performed, a permalink to a bibliographic record) to travel across the Internet from a Web server to the computer on which the page is to be displayed. We do not include the time it takes for the browser to render the page, only the time it takes for the files to arrive at the requesting computer. Typically, a single webpage is made of multiple files; these are sent via the Internet from a Web Response time as defined for this study is the time that it takes for all files that constitute a single webpage to travel across the Internet from a Web server to the end user's browser. In this study, the authors tested response times on queries for identical items in five different library catalogs, one of them a next-generation (NextGen) catalog. The authors also discuss acceptable response time and how it may affect the discovery process.
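For illustration, the definition above (transfer time for the HTML of a permalinked record plus the files it references, excluding browser rendering) can be approximated with a short script. The sketch below is a rough approximation under those assumptions, not the WebSitePulse service used in the study; because it executes no JavaScript and fetches assets one at a time, its numbers will only loosely track what a browser or a commercial testing tool reports.

```python
# Rough approximation of the study's "response time": time to transfer the
# HTML of a catalog permalink plus the files it references (no rendering).
# Illustrative only; the authors used WebSitePulse, not this script.
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def timed_get(session: requests.Session, url: str) -> float:
    start = time.perf_counter()
    session.get(url, timeout=30)
    return time.perf_counter() - start

def response_time(permalink: str) -> float:
    session = requests.Session()
    start = time.perf_counter()
    page = session.get(permalink, timeout=30)
    total = time.perf_counter() - start

    # Collect the component files the page references (images, scripts, stylesheets).
    soup = BeautifulSoup(page.text, "html.parser")
    assets = [tag.get("src") or tag.get("href")
              for tag in soup.find_all(["img", "script", "link"])]
    for asset in filter(None, assets):
        total += timed_get(session, urljoin(permalink, asset))
    return total

# Example: a permalinked bibliographic record; any stable catalog URL would do.
print(round(response_time("http://lccn.loc.gov/2009366172"), 4), "seconds")
```

The Library of Congress permalink in the example is one of the permalinks used later in the study; any stable permalinked bibliographic record could be substituted.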
They suggest that librarians and vendors should develop standards for acceptable response time and use it in the product selec- tion and development processes. N ext-generation, or NextGen, library catalogs offer advanced features and functionality that facilitate library research and enable Web 2.0 features such as tagging and the ability for end users to create lists and add book reviews. In addition, individual catalog records now typically contain much more data than they did in earlier generations of online catalogs. This additional data can include the previously mentioned tags, lists, and reviews, but a bibliographic record may also con- tain cover images, multiple icons and graphics, tables of contents, holdings data, links to similar items, and much more. This additional data is designed to assist catalog users in the selection, evaluation, and access of library materials. However, all of the additional data and features have the disadvantage of increasing the time it takes for the information to flow across the Internet and reach the end user. Moreover, the code that handles all this data is much more complex than the coding used in earlier, traditional library catalogs. Slow response time has the potential to discourage both library patrons from using the catalog and library staff from using or recommending it. During a reference interview or library instruction ses- sion, a slow response time creates an awkward lull in the process, a delay that decreases confidence in the mind of library users, especially novices who are accustomed to the speed of an open Internet search. The two-fold purpose of this study is to define the concept of response time as it relates to both traditional and NextGen library catalogs and to measure some typical response times in a selection of library catalogs. Libraries Margaret Brown-sica (margaret.brown-sica@ucdenver.edu) is assistant Professor, associate Director of Technology Strat- egy and learning Spaces, Jeffrey Beall (jeffrey.beall@ucden- ver.edu) is assistant Professor, Metadata librarian, and Nina McHale (nina.mchale@ucdenver.edu) is assistant Professor, web librarian, university of colorado Denver. Next-GeNerAtiON liBrArY cAtAlOGs | BrOWN-sicA, BeAll, AND McHAle 215 Mathews posted an article called “5 Next Gen Library Catalogs and 5 Students: Their Initial Impressions.”7 Here he shares student impressions of several NextGen catalogs. Regarding slow response time Mathews notes, “Lots of comments on slowness. One student said it took more than ten seconds to provide results. Some other comments were: ‘that’s unacceptable’ and ‘slow-motion search, typical library.’” Nagy and Garrison, on Lauren’s Library Blog, emphasized that any “cross-silo federated search” is “as slow as the slower silos.”8 Any search inter- face is as slow as the slowest database from which it pulls information; however, that does not make users more likely to wait for search results. In fact, many users will not even know—or care—what is happening behind the scenes in a NextGen catalog. The assertion that slow response time makes well- intentioned improvements to an interface irrelevant is supported by an article that analyzes the development of Semantic Web browsers. Frachtenberg notes that users, however, have grown to expect Web search engines to provide near-instantaneous results, and a slow search engine could be deemed unusable even if it provides highly relevant results. 
It is therefore imperative for any search engine to meet its users’ interactivity expectations, or risk losing them.9 This is not just a library issue. Users expect a fast response to all Web queries, and we can learn from studies on general Web response time and how it affects the user experience. Huang and Fong-Ling help explain different user standards when using websites. Their research suggests that “hygiene factors” such as “navigation, information display, ease of learning and response time” are more important to people using “utilitarian” sites to accomplish tasks rather than “hedo- nistic” sites.10 In other words, response time importance increases when the user is trying to perform a task— such as research—and possibly even more for a task that may be time sensitive—such as trying to complete an assignment for class. ■■ Method For testing response time in an assortment of library cat- alogs, we used the WebSitePulse service (http://www .websitepulse.com). WebSitePulse provides in-depth website and server diagnostic services that are intended to save e-business customers time and money by reporting errors and Web server and website performance issues to clients. A thirty-day free trial is available for potential cus- tomers to review the full array of their services; however, the free Web Page Test, available at http://www.website server and arrive sequentially at the computer where the request was initiated. While the World Wide Web Consortium (W3C) does not set forth any particular guidelines regarding response time, go-to usability expert Jakob Nielsen states that “0.1 second is about the limit for having the user feel that the system is reacting instantaneously.”1 He further posits that 1.0 second is “about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay.”2 Finally, he asserts that: 10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially impor- tant if the response time is likely to be highly variable, since users will then not know what to expect.3 Even though this advice dates to 1994, Nielsen noted even then that it had “been about the same for many years.”4 ■■ Previous Studies The chief benefit of studying response time is to estab- lish it as a criterion for evaluating online products that libraries license and purchase, including NextGen online catalogs. Establishing response-time benchmarks will aid in the evaluation of these products and will help libraries convey the message to product vendors that fast response time is a valuable product feature. Long response times will indicate that a product is deficient and suffers from poor usability. It is important to note, however, that sometimes library technology environments can be at fault in lengthening response time as well; in “Playing Tag In the Dark: Diagnosing Slowness In Library Response Time,” Brown-Sica diag- nosed delays in response time by testing such variables as vendor and proxy issues, hardware, bandwidth, and network traffic.5 In that case, inadequate server specifi- cations and settings were at fault. While there are many articles on NextGen catalogs, few of them discuss the issue of response time in rela- tion to their success. 
Search slowness has been reported in library literature about NextGen catalogs’ metasearch cousins, federated search products. In a 2006 review of federated search tools MetaLib and WebFeat, Chen noted that “a federated search could be dozens of times slower than Google.”6 More comments about the negative effects of slow response time in NextGen catalogs can be found in popular library technology blogs. On his blog, 216 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 ■■ Findings: Skyline Versus WorldCat@Auraria In figure 2, the bar graph shows a sample load time for the permalink to the bibliographic record for the title Hard Lessons: The Iraq Reconstruction Experience in Skyline, Auraria’s traditional catalog load time for the page is pulse.com/corporate/alltools.php, met our needs. To use the webpage test, simply select “Web Page Test” from the dropdown menu, input a URL—in the case of the testing done for this study, the perma- link for one of three books (see, for example, figure 1)—enter the validation code, and click “Test It.” WebSitePulse returns a bar graph (figure 2) and a table (figure 3) of the file activity from the server sending the composite files to the end user ’s Web browser. Each line represents one of the files that make up the rendered webpage. They load sequentially, and the bar graph shows both the time it took for each file to load and the order in which the files were received. Longer seg- ments of the bar graph provide visual indication of where a slow-loading webpage might encounter sticking points—for example, wait- ing for a large image file or third-party content to load. Accompanying the bar graph is a table describing the file transmissions in more detail, including DNS, connection, file redirects (if applicable), first and last bytes, file trans- mission times, and file sizes. Figure 1. Permalink screen shot for the record for the title Hard Lessons in Auraria Library’s Skyline catalog Figure 2. WebSitePulse webpage test bar graph results for Skyline (traditional) catalog record Figure 3. WebSitePulse webpage test table results for Skyline (traditional) catalog record Next-GeNerAtiON liBrArY cAtAlOGs | BrOWN-sicA, BeAll, AND McHAle 217 requested at items 8, 14, 15, 17, 26, and 27. The third parties include Yahoo! API services, the Google API ser- vice, ReCAPTCHA, and AddThis. ReCAPTCHA is used to provide security within WorldCat Local with opti- cal character recognition images (“captchas”), and the AddThis API is used to provide bookmarking function- ality. At number 22, a connection is made to the Auraria Library Web server to retrieve a logo image hosted on the Web server. At number 28, the cover photo for Hard Lessons is retrieved from an OCLC server. The files listed in figure 6 details the complex process of Web brows- ers’ assembly of them. Each connection to third-party content, while all relatively short, allows for addi- tional features and functionality, but lengthens overall response. As figure 6 shows, the response time is slightly more than 10 seconds, which, according to Nielsen, “is about the limit for keeping the user ’s attention focused on the dialogue.”12 While widgets, third-party content, and other Web 2.0 tools add desirable content and functionality to the Library’s catalog, they also do slow response time considerably. The total file size for the bibliographic record in WorldCat@Auraria—compared to Skyline’s 84.64 KB—is 633.09 KB. 
As will be shown in the test results below for the catalog and NextGen catalog products, bells and whistles added to traditional 1.1429 seconds total. The record is composed of a total of fourteen items, including image files (GIFs), cascad- ing style sheet (CSS) files, and JavaScript (JS) files. As the graph is read downward, the longer segments of the bars reveal the sticking points. In the case of Skyline, the nine image files, two CSS files, and one JS file loaded quickly; the only cause for concern is the red line at item four. This revealed that we were not taking advantage of the option to add a favicon to our III catalog. The Web librarian provided the ILS server technician with the same favi- con image used for the Library’s website, correcting this issue. The Skyline catalog, judging by this data, falls into Nielsen’s second range of user expectations regarding response time, which is more than one second, or “about the limit for the user’s flow of thought to stay uninter- rupted, even though the user will notice the delay.”11 Further detail is provided in figure 3; this table lists each of the webpage’s component files, and various times asso- ciated with the delivery of each file. The column on the right lists the size in kilobytes of each file. The total size of the combined files is 84.64 KB. In contrast to Skyline’s meager 14 files, WorldCat Local requires 31 items to assemble the webpage (figure 4) for the same bibliographic record. Figures 5 and 6 show that this includes 10 CSS files, 10 JavaScript files, and 8 images files (GIFs and PNGs). No item in particular slows down the overall process very much; the longest- loading item is number 13, which is a wait for third-party content, a connection to Yahoo!’s User Interface (YUI) API service. Additional third-party content is being Figure4. Permalink screen shot for the record for the title Hard Lessons in WorldCat@Auraria Figure 5. WebSitePulse webpage test bar graph results for WorldCat@Auraria record 218 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 total time for each permalinked bibliographic record to load as reported by the WebSitePulse tests; this number appears near the lower right-hand corner of the tables in figures 3, 6, 9, 12, and 15. We selected three books that were each held by all five of our test sites, verifying that we were search- ing the same three bibliographic records in each of the online catalogs by looking at the OCLC number in the records. Each of the catalogs we tested has a permalink feature; this is a stable URL that always points to the same record in each catalog. Using a permalink approximates conducting a known-item search for that item from a catalog search screen. We saved these links and used them in our searches. The bib- liographic records we tested were for these books; the permalinks used for testing follow the books: Book 1: Hard Lessons: The Iraq Reconstruction Experience. Washington, D.C.: Special Inspector General, Iraq Reconstruction, 2009 (OCLC number 302189848). Permalinks used: ■■ WorldCat@Auraria: http://aurarialibrary.worldcat .org/oclc/302189848 ■■ Skyline: http://skyline.cudenver.edu/record=b243 3301~S0 ■■ LCOC: http://lccn.loc.gov/2009366172 ■■ UT Austin: http://catalog.lib.utexas.edu/record= b7195737~S29 ■■ USC: http://library.usc.edu/uhtbin/cgisirsi/ x/0/0/5?searchdata1=2770895{CKEY} Book 2: Ehrenreich, Barbara. Nickel and Dimed: On (Not) Getting by in America. 1st ed. New York: Metropolitan, 2001 (OCLC number 256770509). 
Permalinks used: ■■ WorldCat@Auraria: http://aurarialibrary.worldcat .org/oclc/45243324 ■■ Skyline: http://skyline.cudenver.edu/record=b187 0305~S0 ■■ LCOC: http://lccn.loc.gov/00052514 ■■ UT Austin: http://catalog.lib.utexas.edu/record= b5133603~S29 ■■ USC: http://library.usc.edu/uhtbin/cgisirsi/ x/0/0/5?searchdata1=1856407{CKEY} Book 3: Langley, Lester D. Simón Bolívar: Venezuelan Rebel, American Revolutionary. Lanham: Rowman & Littlefield catalogs slowed response time considerably, even dou- bling it in one case. Are they worth it? The response of Auraria’s reference and instruction staff seems to indi- cate that they are not. ■■ Gathering More Data: Selecting the Books and Catalogs to Study To broaden our comparison and to increase our data collection, we also tested three other non-Auraria cata- logs. We designed our study to incorporate a number of variables. We decided to link to bibliographic records for three different books in the five different online catalogs tested. These included Skyline and WorldCat@Auraria as well three additional online public access catalog products, for a total of two instances of Innovative Interfaces products, one of a Voyager catalog, and one of a SirsiDynix catalog. We also selected online catalogs in different parts of the country: WorldCatLocal in Ohio; Skyline in Denver; the Library of Congress’ Online Catalog (LCOC) in Washington, D.C.; the University of Texas at Austin’s (UT Austin) online catalog; and the University of Southern California’s (USC) online catalog, named Homer, in Los Angeles. We also did our testing at different times of the day. One book was tested in the morning, one at midday, and one in the afternoon. WebSitePulse performs its webpage tests from three different locations in Seattle, Munich, and Brisbane; we selected Seattle for all of our tests. We recorded the Figure 6. WebSitePulse webpage test table results for WorldCat@Auraria record Next-GeNerAtiON liBrArY cAtAlOGs | BrOWN-sicA, BeAll, AND McHAle 219 .org/oclc/256770509 ■■ Skyline: http://skyline.cudenver.edu/record=b242 6349~S0 ■■ LCOC: http://lccn.loc.gov/2008041868 ■■ UT Austin: http://catalog.lib.utexas.edu/record= b7192968~S29 ■■ USC: http://library.usc.edu/uhtbin/cgisirsi/ x/0/0/5?searchdata1=2755357{CKEY} We gathered the data for thirteen days in early November 2009, an active period in the middle of the semester. For each test, we recorded the response time total in seconds. The data is displayed in tables 1–3. We searched bibliographic records for three books in five library catalogs over thirteen days (3 x 5 x 13) for a total of 195 response time measurements. The WebSitePulse data is calculated to the ten thousandth of a second, and we recorded the data exactly as it was presented. Publishers, c2009 (OCLC number 256770509). Permalinks used: ■■ WorldCat@Auraria: http://aurarialibrary.worldcat Table 1. Response Times for Book 1 Response time in seconds Day Wor ld- Cat Skyline LC UT Austin USC 1 10.5230 1.3191 2.6366 3.6643 3.1816 2 10.5329 1.2058 1.2588 3.5089 4.0855 3 10.4948 1.2796 2.5506 3.4462 2.8584 4 13.2433 1.4668 1.4071 3.6368 3.2750 5 10.5834 1.3763 3.6363 3.3143 4.6205 6 11.2617 1.2461 2.3836 3.4764 2.9421 7 20.5529 1.2791 3.3990 3.4349 3.2563 8 12.6071 1.3172 3.6494 3.5085 2.7958 9 10.4936 1.1767 2.6883 3.7392 4.0548 10 10.1173 1.5679 1.3661 3.7634 3.1165 11 9.4755 1.1872 1.3535 3.4504 3.3764 12 12.1935 1.3467 4.7499 3.2683 3.4529 13 11.7236 1.2754 1.5569 3.1250 3.1230 Average 11.8310 1.3111 2.5105 3.4874 3.3953 Table 2. 
Response Times for Book 2 Response time in seconds Day World- Cat Skyline LC UT Austin USC 1 10.9524 1.4504 2.5669 3.4649 3.2345 2 10.5885 1.2890 2.7130 3.8244 3.7859 3 10.9267 1.3051 0.2168 4.0154 3.6989 4 13.8776 1.3052 1.3149 4.0293 3.3358 5 10.6495 1.3250 4.5732 3.5775 3.2979 6 11.8369 1.3645 1.3605 3.3152 2.9023 7 11.3482 1.2348 2.3685 3.4073 3.5559 8 10.7717 1.2317 1.3196 3.5326 3.3657 9 11.1694 1.0997 1.0433 2.8096 2.6839 10 19.0694 1.6479 2.5779 4.3595 2.6945 11 12.0109 1.1945 2.5344 3.0848 18.5552 12 12.6881 0.7384 1.3863 3.7873 3.9975 13 11.6370 1.1668 1.2573 3.3211 3.6393 Average 12.1174 1.2579 1.9410 3.5791 4.5190 Table 3. Response Times for Book 3 Response time in seconds Day World- Cat Skyline LC UT Austin USC 1 10.8560 1.3345 1.9055 3.7001 2.6903 2 10.1936 1.2671 1.8801 3.5036 2.7641 3 11.0900 1.5326 1.3983 3.5983 3.0025 4 10.9030 1.4557 2.0432 3.6248 2.9285 5 12.3503 1.5972 3.5474 3.6428 4.5431 6 9.1008 1.1661 1.4440 3.4577 3.1080 7 9.6263 1.1240 2.3688 3.1041 3.3388 8 10.9539 1.1944 1.4941 2.8968 3.4224 9 11.0001 1.2805 1.3255 3.3644 2.7236 10 10.2231 1.3778 1.3131 3.3863 3.4885 11 10.1358 1.2476 2.3199 3.4552 2.9302 12 12.0109 1.1945 2.5344 3.0848 18.5552 13 11.5881 1.2596 2.5245 3.8040 3.8506 Average 10.7717 1.3101 2.0076 3.4325 4.4112 Table 4. Averages Response time in seconds Book World- Cat Skyline LC UT Austin USC Book 1 11.8310 1.3111 2.5105 3.4874 3.3953 Book 2 12.1174 1.2579 1.9410 3.5791 4.5190 Book 3 10.7717 1.3101 2.0076 3.4325 4.4112 Average 11.5734 1.2930 2.1530 3.4997 4.1085 220 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010 university of colorado Denver: skyline (innovative interfaces) As previously mentioned, the traditional catalog at Auraria Library runs on an Innovative Interfaces integrated library system (ILS). Testing revealed a missing favicon image file that the Web server tries to send each time (item 4 in figure 3). However, this did not negatively affect the response time. The catalog’s response time was good, with an aver- age of 1.2930 seconds, giving it the fastest average time among all the test sites in the testing period. As figure 1 shows, however, Skyline is a typical legacy catalog that is designed for a traditional library environment. library of congress: Online catalog (voyager) The average response time for the LCOC was 2.0076 ■■ Results The data shows the response times for each of the three books in each of the five online catalogs over the thirteen- day testing period. The raw data was used to calculate averages for each book in each of the five online catalogs, and then we calculated averages for each of the five online catalogs (table 4). The averages show that during the testing period, the response time varied between 1.2930 seconds for the Skyline library catalog in Denver to 11.5734 seconds for WorldCat@Auraria, which has its servers in Ohio. university of colorado Denver: Worldcat@Auraria WorldCat@Auraria was routinely over Nielsen’s ten- second limit, sometimes taking as long as twenty sec- onds to load all the files to generate a single webpage. As previously discussed, this is due to the high number and variety of files that make up a single bibliographic record. The files sent also include cover images, but they are small and do not add much to the total time. After our tests on WorldCat@Auraria were conducted, the site removed one of the features on pages for individual resources, namely the “similar items” feature. 
University of Texas at Austin: Library Catalog (Innovative Interfaces)

UT Austin, like Auraria Library, runs an Innovative Interfaces ILS. The library catalog also includes book cover images, one of the most attractive NextGen features (figure 10), and as shown in figure 12, third-party content is used to add features and functionality (items 16 and 17). UT Austin's catalog uses a Google JavaScript API (item 16 in figure 12) and LibraryThing's Catalog Enhancement product, which can add book recommendations, tag browsing, and alternate editions and translations. Total content size for the bibliographic record is considerably larger than Skyline and the LCOC at 138.84 KB. It appears as though inclusion of cover art nearly doubles the response time; item 14 is a script that, while hosted on the ILS server, queries Amazon.com to return cover image art (figures 11–12). The average response time for UT Austin's catalog was 3.4997 seconds. This example demonstrates that response times for traditional (i.e., not NextGen) catalogs can be slowed down by additional content as well.

Figure 10. Permalink screen shot for the record for the title Hard Lessons in University of Texas at Austin's library catalog
Figure 11. WebSitePulse webpage test bar graph results for University of Texas at Austin's library catalog record
Figure 12. WebSitePulse webpage test table results for University of Texas at Austin's library catalog record

University of Southern California: Homer (SirsiDynix)

The average response time for USC's Homer catalog was 4.1085 seconds, making it the second slowest after WorldCat@Auraria, and the slowest among the traditional catalogs. This SirsiDynix catalog appears to take a longer time than the other brands of catalogs to make the initial connection to the ILS; this accounts for much of the slowness (see figures 14 and 15). Once the initial connection is made, however, the remaining content loads very quickly, with one exception: item 13 (see figure 15), which is a connection to the third-party provider Syndetic Solutions, which provides cover art, a summary, an author biography, and a table of contents. While the display of this content is attractive and well integrated into the catalog (figure 13), it adds 1.2 seconds to the total response time. Also, as shown in items 14 and 15, USC's Homer uses the AddThis service to add bookmarking enhancements to the catalog. Total combined file size is 148.47 KB, with the bulk of the file size (80 KB) coming from the initial connection (item 1 in figure 15).

Figure 13. Permalink screen shot for the record for the title Hard Lessons in Homer, the University of Southern California's catalog
Figure 14. WebSitePulse webpage test bar graph results for Homer, the University of Southern California's catalog
Figure 15. WebSitePulse webpage test table results for Homer, the University of Southern California's catalog
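WebSitePulse is a hosted service, and the results above come from its reports. As a rough, do-it-yourself illustration of what a single response-time reading involves (and not a substitute for the tool used in this study), the sketch below times only the initial HTML response for one of the permalinks listed earlier; because it ignores the images, CSS, JavaScript, and third-party content that make up the rest of a record page, it will understate the full-page times reported in tables 1–4.

```python
import time
import urllib.request

# Time the initial HTML response for a catalog permalink used in this
# study (the LCOC permalink for Book 3). One reading only; it does not
# fetch the embedded files that WebSitePulse also measures.
url = "http://lccn.loc.gov/2008041868"

start = time.perf_counter()
with urllib.request.urlopen(url, timeout=30) as response:
    body = response.read()
elapsed = time.perf_counter() - start

print(f"{len(body)} bytes in {elapsed:.4f} seconds")
```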
■■ Conclusion

An eye-catching interface and valuable content are lost on the end user if he or she moves on before a search is completed. Added functionality and features in library search tools are valuable, but there is a tipping point when these features slow down a product's response time to where users find the search tool too slow or unreliable. Based on the findings of this study, we recommend that libraries adopt Web response time standards, such as those set forth by Nielsen, for evaluating vendor search products and creating in-house search products. Commercial tools like WebSitePulse make this type of data collection simple and easy. Testing should be conducted for an extended period of time, preferably during a peak period—i.e., during a busy time of the semester for academic libraries. We further recommend that reviews of electronic resources add response time as an evaluation criterion. Additional research about response time as defined in this study might look at other search tools, to include article databases, and especially other metasearch products that collect and aggregate search results from several remote sources. Further studies with more of a technological focus could include discussions of optimizing data delivery methods—again, in the case of metasearch tools from multiple remote sources—to reduce response time.
Finally, product designers should pay close attention to response time when designing information retrieval products that libraries purchase.

■■ Acknowledgments

The authors wish to thank Shelley Wendt, library data analyst, for her assistance in preparing the test data.

References

1. Jakob Nielsen, Usability Engineering (San Francisco: Morgan Kaufmann, 1994): 135.
2. Ibid.
3. Ibid.
4. Ibid.
5. Margaret Brown-Sica, "Playing Tag in the Dark: Diagnosing Slowness in Library Response Time," Information Technology & Libraries 27, no. 4 (2008): 29–32.
6. Xiaotian Chen, "MetaLib, WebFeat, and Google: The Strengths and Weaknesses of Federated Search Engines Compared with Google," Online Information Review 30, no. 4 (2006): 422.
7. Brian Mathews, "5 Next Gen Library Catalogs and 5 Students: Their Initial Impressions," online posting, May 1, 2009, The Ubiquitous Librarian Blog, http://theubiquitouslibrarian.typepad.com/the_ubiquitous_librarian/2009/05/5-next-gen-library-catalogs-and-5-students-their-initial-impressions.html (accessed Feb. 5, 2010).
8. Andrew Nagy and Scott Garrison, "Next-Gen Catalogs Are Only Part of the Solution," online posting, Oct. 4, 2009, Lauren's Library Blog, http://laurenpressley.com/library/2009/10/next-gen-catalogs-are-only-part-of-the-solution/ (accessed Feb. 5, 2010).
9. Eitan Frachtenberg, "Reducing Query Latencies in Web Search Using Fine-Grained Parallelism," World Wide Web 12, no. 4 (2009): 441–60.
10. Travis K. Huang and Fu Fong-Ling, "Understanding User Interface Needs of E-commerce Web Sites," Behaviour & Information Technology 28, no. 5 (2009): 461–69, http://www.informaworld.com/10.1080/01449290903121378 (accessed Feb. 5, 2010).
11. Nielsen, Usability Engineering, 135.
12. Ibid.

3134 ----

President's Message: Moving Forward
Karen J. Starr
Information Technology and Libraries | September 2010

Cloud computing. Web 3.0 or the Semantic Web. Google Editions. Books in copyright and books out of copyright. Born digital. Digitized material. The reduction of Stanford University's Engineering Library book collection by 85 percent. The publishing paradigm most of us know, and have taken for granted, has shifted. Online databases came and we managed them.
Then CD-ROMs showed up and mostly went away. And along came the Internet, which we helped implement, use, and now depend on. How we deal with the current shifts happening in information and technology during the next five to ten years will say a great deal about how the library and information community reinvents itself for its role in the twenty-first century. This shift is different, and it will create both opportunities and challenges for everyone, including those who manage information and those who use it.

As a reflection of the shifts in the information arena, LITA is facing its own challenges as an association. It has had a long and productive role in the American Library Association (ALA) dating back to 1966. The talent among the association members is amazing, solid, and a tribute to the individuals who belong to and participate in LITA. LITA's members are leaders to the core and recognized as standouts within ALA as they push the edge of what information management means, and can mean.

For the past three years, LITA members, the board, and the executive committee have been working on a strategic plan for LITA. That process has been described in Michelle Frisque's "President's Message" (ITAL v. 29, no. 2) and elsewhere. The plan was approved at the 2010 ALA Annual Conference in Washington, D.C. A plan is not cast in concrete. It is a dynamic, living document that provides the fabric that drives the association.

Why is this process important now more than ever? We are all dealing with the current recession. Libraries are retrenching. People face challenges participating in the library field on various levels. The big information players on the national and international level are changing the playing field. As membership, each of us has an opportunity to affect the future of information and technology locally, nationally, and internationally. This plan is intended to ensure LITA's role as a "go to" place for people in the library, information, and technology fields well into the twenty-first century.

LITA committees and interest groups are being asked to step up to the table and develop action plans to implement the strategies the LITA membership have identified as crucial to the association's ongoing success. Members of the board are liaisons to each of the committees, and there is a board liaison to the interest groups. These individuals will work with committee chairs, interest group chairs, and the membership to implement LITA's plan for the future. The committee and interest group chairs are being asked to contribute those action plans by the 2011 ALA Midwinter Meeting. They will be compiled and made available to all LITA and ALA members for their use through the LITA website (http://lita.org) and ALA Connect (http://connect.ala.org).

What is in it for you? LITA is known for its leadership opportunities, continuing education, training, publications, expertise in standards and information policy, and knowledge and understanding of current and cutting-edge technologies. LITA provides you with opportunities to develop those leadership skills that you can use in your job and lifelong career. The skills gained working within a group of individuals to implement a program, influence standards and policy, collaborate with other ALA divisions, and publish can be taken home to your library. Your participation documents your value as an employee and your commitment to lifelong learning. In today's work environment, employers look for staff with proven skills who have contributed to the good of the organization and the profession.

LITA needs your participation in developing and implementing continuing education programs, publishing articles and books, and illustrating by your actions why others want to join the association. How can you do that? Volunteer for a committee, help develop a continuing education program, write an article, write a book, role model for others with your LITA participation, and recruit. What does your association gain? A solid structure to support its members in accomplishing the mission, vision, and strategic plan they identified as core for years to come.

Look for opportunities to participate and develop those skills. We will be working with committee and interest group chairs to develop meeting management tool kits over the next year, create opportunities to participate virtually, identify emerging leaders of all types, collaborate with other divisions, and provide input on national information policy and standards through ALA's Office for Information Technology Policy and other similar organizations. If you want to be involved, be sure to let LITA committee and interest group chairs, the board, and your elected officers know.

Karen J. Starr (kstarr@nevadaculture.org) is LITA President 2010–11 and Assistant Administrator for Library and Development Services, Nevada State Library and Archives, Carson City.

3135 ----

Editorial Board Thoughts: Adding Value in the Internet Age—Libraries, Openness, and Programmers
Mark Dehmlow
Information Technology and Libraries | September 2010

In the age of the Internet, Google, and the nearly crushing proliferation of metadata, libraries have been struggling with how to maintain their relevance and survive in the face of shrinking budgets and misinformed questions about whether libraries still provide value. In case there was ever any question, the answer is "of course we do." Still, an evolving environment and changing context has motivated us to rethink what we do and how we do it. Our response to the shifting environment has been to envision how libraries can provide the best value to our patrons despite an information ecosystem that duplicates (and to some extent replaces) services that have been a core part of our profession for ages. At the same time, we still have to deal with procedures for managing resources we acquire and license, and many of the systems and processes that have served us so well for so many years are not suitable for today's environment.

Many have talked about the need to invest in the distinctive services we provide and unique collections we have (e.g., preserving the world's knowledge and digitizing our unique holdings) as a means to add value to libraries.
There are many other ways libraries create value for our users, and one of the best is for us to respond to needs that are specific to our organizations and users—specialized services, focused collections, contextualized discovery, all integrated into environments in which our patrons work, such as course management systems, Google, etc. The library market has responded to many of our needs with ERMSs and next-generation resource management and discovery solutions. All of this is a good start, but like any solution that is designed to work for the greatest common denominator, they often leave a "desired functionality gap" because no one system can do everything for everyone, no development today can address all of the needs of tomorrow, and very rarely do all of the disparate systems integrate with each other.

So where does that leave libraries? Well, every problem is an opportunity, and there are two important areas that libraries can invest in to ensure that they progress at the same pace as technology, their users, and the market: open systems that have application programmer interfaces (APIs), and programmers. APIs are a means to access the data and functionality of our vended or open-source systems using a program as opposed to the default interface. APIs often take the shape of XML travelling in the same way that webpages do, accessed via a URL, but they can also be as complex as writing code in the same language as the base system, for example through software development kits (SDKs). The key here is that APIs provide a way to work with the data in our systems, be they back-end inventory or front-end discovery interfaces, in ways that weren't conceived by the software developers. This flexibility enables organizations to respond more rapidly to changing needs. No matter which side of the open-source/vended solution fence you sit on, openness needs to be a fundamental part of any decision process for any new system (or information service) to avoid being stifled by vendor or open-source developer priorities that don't necessarily reflect your own.

The second opportunity is perhaps the more difficult one given the state of library budgets and the fact that the resources needed to hire programmers are greater than for most other library staff. But having local programming skills easily accessible will be vital to our ability to address our users' specific needs and change our internal processes as we need to. I think it is good to have at least one technical person who comes from an industry outside of libraries. They bring knowledge that we don't necessarily have and fresh perspectives on how we do things. If it is not possible to hire a programmer, I would encourage technology managers to look closely at their existing staff, locate those in the organization who are able to think outside of the box, and provide some time and space for them to grow their skill set. I am not so obtuse as to suggest that anyone can be a programmer—like any skill, it requires a general aptitude and a fundamental interest—but I am a self-taught developer who had a technical aptitude and a strong desire to learn new things, and I suspect that there are many underutilized staff in libraries who, with a little encouragement, mentoring, and some new technical knowledge, could easily work with APIs and SDKs, thereby opening the door for organizations to be nimble and responsive to both internal and external needs.
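To make the "XML over a URL" idea above concrete, here is a small sketch that requests a record from a library-system endpoint and reads a few fields out of the response. The URL, element names, and attribute are invented for illustration; a real ILS, discovery layer, or repository exposes its own documented paths and schema, so treat this as a pattern rather than a recipe.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint and element names -- substitute the paths and
# fields documented by your own system's API.
url = "https://library.example.edu/api/records/12345?format=xml"

with urllib.request.urlopen(url, timeout=30) as response:
    root = ET.fromstring(response.read())

# Pull a couple of fields out of the XML payload.
title = root.findtext("title", default="(no title)")
for item in root.findall("holdings/item"):
    print(title, item.get("barcode"), item.findtext("status"))
```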
I recognize that with heavy demands it can be difficult to give up some of these highly valued people's time, but the payoff is overwhelmingly worth it.

These days I can only chuckle at the doomsday predictions about libraries and the death of our services—Google's dominance in the search arena has never really made me worried that libraries would become irrelevant. We have too much that Google does not, specifically licensed content that our users desire, and we have relationships with our users that Google will be incapable of having. I have confidence that what we have to offer will be valuable to our users for some time to come. However, it will take a willingness to evolve with our environment and to invest in skill sets that come at a premium even when it is difficult to do so.

Mark Dehmlow (mdehmlow@nd.edu) is Digital Initiatives Librarian, University of Notre Dame, Notre Dame, Indiana.

3136 ----

Metadata Creation Practices in Digital Repositories and Collections: Schemata, Selection Criteria, and Interoperability
Jung-ran Park and Yuji Tosaka
Information Technology and Libraries | September 2010

Jung-ran Park (jung-ran.park@ischool.drexel.edu) is Assistant Professor, College of Information Science and Technology, Drexel University, Philadelphia, and Yuji Tosaka (tosaka@tcnj.edu) is Cataloging/Metadata Librarian, TCNJ Library, The College of New Jersey, Ewing, New Jersey.

ABSTRACT

This study explores the current state of metadata-creation practices across digital repositories and collections by using data collected from a nationwide survey of mostly cataloging and metadata professionals. Results show that MARC, AACR2, and LCSH are the most widely used metadata schema, content standard, and subject-controlled vocabulary, respectively. Dublin Core (DC) is the second most widely used metadata schema, followed by EAD, MODS, VRA, and TEI. Qualified DC's wider use vis-à-vis Unqualified DC (40.6 percent versus 25.4 percent) is noteworthy. The leading criteria in selecting metadata and controlled-vocabulary schemata are collection-specific considerations, such as the types of resources, nature of the collection, and needs of primary users and communities. Existing technological infrastructure and staff expertise also are significant factors contributing to the current use of metadata schemata and controlled vocabularies for subject access across distributed digital repositories and collections. Metadata interoperability remains a major challenge. There is a lack of exposure of locally created metadata and metadata guidelines beyond the local environments. Homegrown locally added metadata elements may also hinder metadata interoperability across digital repositories and collections when there is a lack of sharable mechanisms for locally defined extensions and variants.

Metadata is an essential building block in facilitating effective resource discovery, access, and sharing across ever-growing distributed digital collections. Quality metadata is becoming critical in a networked world in which metadata interoperability is among the top challenges faced by digital libraries. However, there is no common data model that cataloging and metadata professionals can readily reference as a mediation mechanism during the processes of descriptive metadata creation and controlled vocabulary schemata application for subject description.1 The development of such a mediation mechanism calls for an empirical assessment of various issues surrounding metadata-creation practices. The critical issues concerning metadata practices across distributed digital collections have been relatively unexplored. While examining learning objects and e-prints communities of practice, Barton, Currier, and Hey point out the lack of formal investigation of the metadata-creation process.2 As will be discussed in the following section, some researchers have begun to assess the current state of descriptive practices, metadata schemata, and content standards. However, the literature has not yet developed to a point where it affords a comprehensive picture. Given the propagation of metadata projects, it is important to continue to track changes in metadata-creation practices while they are still in constant flux. Such efforts are essential for adding new perspectives to digital library research and practices in an environment where metadata best practices are being actively sought after to aid in the creation and management of high-quality digital collections.
This study examines the prevailing current state of metadata-creation practices in digital repositories, collections, and libraries, which may include both digitized and born-digital resources. Using nationwide survey data, mostly drawn from the community of cataloging and metadata professionals, we seek to investigate issues in creating descriptive metadata elements, using controlled vocabularies for subject access, and propagating metadata and metadata guidelines beyond local environments. We will address the following research questions:

1. Which metadata schema(ta) and content standard(s) are employed in individual digital repositories and collections?
2. Which controlled vocabulary schema(ta) are used to facilitate subject access?
3. What criteria are applied in selecting metadata and controlled-vocabulary schema(ta)?
4. To what extent are mechanisms for exposing and sharing metadata integrated into current metadata-creation practices?

In this article, we first review recent studies relating to current metadata-creation practices across digital collections. Then we present the survey method employed to conduct this study, the general characteristics of survey participants, and the validity of the collected data, followed by the study results. We report on how metadata and controlled vocabulary schema(ta) are being used across institutions, and we present a data analysis of current metadata-creation practices. The final section summarizes the study and presents some suggestions for future studies.

possible increase in the use of locally developed schemata as many projects added new types of nontextual digital objects that could not be adequately described by existing metadata schemata.6 There is a lack of research concerning the current use of content standards; however, it is reasonable to suspect that content-standards use exhibits patterns similar to that of metadata because of their often close association with particular metadata schemata. The OCLC RLG survey reveals that Anglo-American Cataloguing Rules, 2nd edition (AACR2)—the traditional cataloging rule that has most often been used in conjunction with MARC—is the most widely used content standard (81 percent).
AACR2 is followed by Describing Archives: A Content Standard (DACS) with 42 percent; Descriptive Cataloging of Rare Materials with 33 percent; Archives, Personal Papers, Manuscripts (APPM) with 25 percent; and Cataloging Cultural Objects (CCO) with 21 percent.7 In the same way as metadata schemata, there appears to be a concentration of a few controlled vocabulary schemata at research institutions. Ma’s ARL survey, for example, shows that the Library of Congress Subject Headings (LCSH) and Name Authority File (NAF) were used by most survey respondents (96 percent and 88 percent, respectively). These two predominantly adopted vocabularies are followed by several domain-specific vocabularies, such as Art and Architecture Thesaurus (AAT), Library of Congress Thesaurus for Graphical Materials (TGM) I and II, Getty Thesaurus of Geographic Names (TGN), and the Getty Union List of Artists Names (ULAN), which were used by between 30 percent to more than 60 percent of respondents.8 The OCLC RLG survey reports similar results; however, nearly half of the OCLC RLG survey respondents (N = 9) indicated that they had also built and maintained one or more locally developed thesauri.9 While creating and sharing information about local metadata implementations is an important step toward increased interoperability, recent studies tend to paint a grim picture of current local documentation practices and open accessibility. In a nationwide study of institutional repositories in U.S. academic libraries, Markey et al. found that only 61.3 percent of the 446 survey participants with operational institutional repositories had imple- mented policies for metadata schemata and authorized metadata creators.10 The OCLC RLG survey also high- lights limited collaboration and sharing of the metadata guidelines both within and across the institutions. It finds that even when there are multiple units creating metadata within the same institution, metadata-creation guidelines often are unlikely to be shared (28 percent do not share; 53 percent sometimes share).11 A mixed result is reported on the exposure of meta- data to outside service providers. In an ARL survey, the University of Houston Libraries Institutional Repository ■■ Literature Review As evinced by the principles and practices of bib- liographic control through shared cataloging, successful resource access and sharing in the networked envi- ronment demands semantic interoperability based on accurate, complete, and consistent resource description. The recent survey by Ma finds that the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and metadata crosswalks have been adopted by 83 percent and 73 percent of respondents, respectively. Even though the sample comes only from sixty-eight Association of Research Libraries (ARL) member librar- ies, and the figures thus may be skewed higher than those of the entire population of academic libraries, there is little doubt that interoperability is a critical issue given the rapid proliferation of metadata schemata throughout digital libraries.3 While there is a variety of metadata schemata cur- rently in use for organizing digital collections, only a few of them are widely used in digital repositories. 
In her ARL survey, Ma reports that the MARC format is the most widely used metadata schema (91 percent), followed by Encoded Archival Description (EAD) (84 percent), Unqualified Dublin Core (DC) (78 percent), and Qualified DC (67 percent).4 Similarly, a 2007 member survey by OCLC Research Libraries Group (RLG) programs gath- ered information from eighteen major research libraries and cultural heritage institutions and also found that MARC is the most widely used scheme (65 percent), fol- lowed by EAD (43 percent), Unqualified DC (30 percent), and Qualified DC (29 percent). The different levels of use reported by these studies are probably due to different sample sizes and compositions, but results nonetheless suggest that metadata use at research institutions tends to rely on a small number of major schemata.5 There may in fact be much greater diversity in meta- data use patterns when the scope is expanded to include both research and nonresearch institutions. Palmer, Zavalina, and Mustafoff, for example, tracked trends from 2003 through 2006 in metadata selection and application practices at more than 160 digital collections developed through Institute of Museum and Library Services grants. They found that despite perceived limitations, use of DC is the most widespread, with more than half of the digital collections using it alone or in combination with other schemata. MARC ranks second, with nearly 30 percent using it alone or in combination. The authors found that the choice of metadata schema is largely influenced by practices at peer institutions and compatibility with a content management system. What is most striking, how- ever, is the finding that locally developed schemata are used as often as MARC. There is a decline in the percent- age of digital projects using multiple metadata schemata (from 53 percent to 38 percent). Yet the authors also saw a 106 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 ■■ Method The objective of the research reported in this paper is to examine the current state of metadata-creation practices in terms of the creation of descriptive metadata elements, the use of controlled vocabularies for subject access, and the exposure of metadata and metadata guidelines beyond local environments. We conducted a Web survey using WebSurveyor (now Vovici: http://www.vovici .com). The survey included both structured and open- ended questions. It was extensively reviewed by members of an advisory board—a group of three experts in the field—and it was pilot-tested prior to being officially launched. The survey included many multiple-response questions that called for respondents to check all appli- cable answers. We recruited participants through survey invitation messages and subsequent reminders to the electronic mailing lists of communities of metadata and cataloging professionals. Table 1 shows the mailing lists employed for the study. We also sent out individual invitations and distributed flyers to selected metadata and cataloging ses- sions during the 2008 ALA Midwinter Meeting, held that year in Philadelphia. The survey attracted a large number of initial par- ticipants (N = 1,371), but during the sixty-two days from August 6 to October 6, 2008, we only received 303 com- pleted responses via the survey management system. We suspect that the high incompletion rate (77.9 percent) stems from the fact that the subject matter may have been outside the scope of many participants’ job responsibili- ties. 
The length of the survey may also have been a factor in the incompletion rate. The profiles of respondents’ job titles (see table 2) Task Force found that exposing metadata to OAI-PMH service providers is an established practice used by nearly 90 percent of the respondents.12 Ma’s ARL survey also reports the wide adoption of OAI-PMH (83 per- cent). These results underscore the virtual consensus on the critical importance of exposing metadata to achieve interoperability and make locally created metadata useful across distributed digital repositories and collections.13 By contrast, the OCLC RLG survey shows that only one-tenth of the respondents stated that all non-MARC metadata is exposed to OAI harvesters, while 30 percent indicated that only some of it was available. The prominent theme revealed by the OCLC RLG survey is an “inward focus” in current metadata practices, marked by the “use of local tools to reach a generally local audience.”14 In summary, recent studies show that the current practice of metadata creation is problematic due to the lack of a mechanism for integrating various types of metadata schemata, content standards, and controlled vocabularies in ways that promote an optimal level of interoperability across digital collections and repositories. The problems are exacerbated in an environment where many institutions lack local documentation delineating the metadata-creation process. At the same time, researchers have only recently begun studying these issues, and the body of literature is at an incipient stage. The research that was done often targeted different populations, and sample sizes were different (some very small). In some cases the literature exhibits contradictory findings about issues surrounding metadata practices, increasing the difficulty in understanding the current state of metadata creation. This points out the need for further research of current metadata-creation practice. Table 1. Electronic mailing lists for the survey Electronic Mailing Lists E-mail Address Autocat Dublin Core Listserv Metadata Librarians Listserv Library and Information Technology Association Listserv Online Audiovisual Catalogers Electronic Discussion List Subject Authority Cooperative Program Listserv Serialist Text Encoding Initiative Listserv Electronic Resources in Libraries Listserv Encoded Archival Description Listserv autocat@listserv.syr.edu dc-libraries@jiscmail.ac.uk metadatalibrarians@lists.monarchos.com lita-l@ala.org olac-list@listserv.acsu.buffalo.edu sacolist@listserv.loc.gov serialst@list.uvm.edu tei-l@listserv.brown.edu eril-l@listserv.binghamton.edu ead@listserv.loc.gov MetADAtA creAtioN prActices iN DiGitAl repositories AND collectioNs | pArK AND tosAKA 107 and job responsibilities (see table 3) clearly show that most of the individuals who completed the survey engage professionally in activities directly relevant to the research objectives, such as descriptive and subject cataloging, metadata creation and management, author- ity control, nonprint and special material cataloging, electronic resource and digital project management, and integrated library system (ILS) management. Although the largest number of participants (135, or 44.6 percent) chose the “Other” category regarding their job title (see table 2), it is reasonable to assume that the vast majority can be categorized as cataloging and meta- data professionals.15 Most job titles given as “Other” are associated with one of the professional activities listed in table 4. 
Thus it is reasonable to assume that the respondents are in an appropriate position to provide first-hand, accurate information about the current state of metadata creation in their institutions. Concerning the institutional background of partici- pants, of the 303 survey participants, fewer than half (121, or 39.9 percent) provided institutional information. We believe that this is mostly due to the fact that the question was optional, following a suggestion from the Institutional Review Board at Drexel University. Of those that provided their institutional background, the majority (75.2 percent) are from academic libraries, followed by participants from public libraries (17.4 percent) and from other institutions (7.4 percent). Table 3. Participants’ job responsibilities (multiple responses) Job Responsibilities Number of Participants General cataloging (e.g., descriptive and subject cataloging) 171 (56.4%) Metadata creation and management 153 (50.5%) Authority control 147 (48.5%) Nonprint cataloging (e.g., microform, music scores, photographs, video recordings) 133 (43.9%) Special material cataloging (e.g., rare books, foreign language materials, government documents) 126 (41.6%) Digital project management 101 (33.3%) Electronic resource management 62 (20.5%) ILS management 59 (19.5%) Other 51 (16.8%) Survey question: what are your primary job responsibilities? (please check all that apply) Table 2. Job titles of participants (multiple responses) Job Titles Number of Participants Other 135 (44.6%) Cataloger/cataloging librarian/ catalog librarian 99 (32.7%) Metadata librarian 29 (9.6%) Catalog & metadata librarian 26 (8.6%) Head, cataloging 26 (8.6%) Electronic resources cataloger 17 (5.6%) Cataloging coordinator 15 (5.0%) Head, cataloging & metadata services 15 (5.0%) N = 227. Survey question: what is your working job title? (please check all that apply) Table 4. Professional activities specified in “Other” category in table 2 Professional Activities Number of Participants Cataloging & metadata creation 31 (10.2%) Digital projects management 23 (7.6%) Technical services 17 (5.6%) Archiving 16 (5.3%) Electronic resources and serials management 6 (2.0%) Library system administration/ other 6 (2.0%) N = 99. Survey question: If you selected other, please specify. 108 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 It is noteworthy that use of Qualified DC was higher than that of Unqualified DC. This result is different from the ARL survey and a member survey conducted ■■ Results In this section, we will present the findings of this study in the following three areas: (1) metadata and controlled vocabulary schemata and metadata tools used, (2) criteria for selecting metadata and controlled vocabulary schemata, and (3) exposing metadata and metadata guidelines beyond local environments. Metadata and controlled Vocabulary schemata and Metadata tools used A great variety of digital objects were handled by the survey participants, as figure 1 shows. The most frequently han- dled object was text, cited by 86.5 percent of the respondents. About three-fourths of the respondents described audiovi- sual materials (75.2 percent), while 60.1 percent described images and 51.8 per- cent described archival materials. More than 65 percent of the respondents han- dled electronic resources (68.3 percent) and digitized resources (66.7 percent), while approximately half handled born- digital resources (52.5 percent). 
The types of materials described in digital collections were diverse, encompassing both digitized and born-digital materi- als; however, digitization accounted for a slightly greater percentage of meta- data creation. To handle these diverse digital objects, the respondents’ institutions employed a wide range of metadata schemata, as figure 2 shows. Yet there were a few schemata that were widely used by cataloging and metadata pro- fessionals. Specifically, 84.2 percent of the respondents’ institutions used MARC; DC was also popular, with 25.4 percent using Unqualified DC and 40.6 percent using Qualified DC to create metadata. EAD also was frequently cited (31.7 percent). In addition to these major types of metadata schemata, the respondents’ institutions also employed Metadata Object Description Schema (MODS) (17.8 percent), Visual Resource Association (VRA) Core (14.9 percent), and Text Encoding Initiative (TEI) (12.5 percent). Figure 1. Materials/resources handled (multiple responses) Survey question: what type of materials/resources do you and your fellow catalogers/metadata librar- ians handle? (please check all that apply) Figure 2. Metadata schemata used (multiple responses) Survey question: which metadata schema(s) do you and your fellow catalogers/metadata librarians use? (please check all that apply) MetADAtA creAtioN prActices iN DiGitAl repositories AND collectioNs | pArK AND tosAKA 109 custom metadata elements derives from the imperative to accommodate the perceived needs of local collec- tions and users, as indicated by the two most common responses: (1) “to reflect the nature of local collec- tions/resources” (76.9 percent) and (2) “to reflect the characteristics of target audience/community of local collections” (58.3 percent). Local conditions were also cited from institutional and technical standpoints. Many institutions (34.3 percent) follow existing local practices for cataloging and metadata creation while other insti- tutions (18.5 percent) are making homegrown metadata additions because of constraints imposed by their local systems. Table 6 summarizes the most frequently used con- trolled vocabulary schematas by resource type. By far the most widely used schema across all resource types was LCSH. The preeminence of LCSH evinces the criti- cal role that it plays as the de facto form of controlled vocabulary for subject description. Library of Congress Classification (LCC) was the second choice for all resource types other than images, cultural objects, and archives. For digital collections of these resource types and digitized resources, AAT was the second most used controlled vocabulary, a fact that reflects its purpose as a domain-specific terminology used for describing works of art, architecture, visual resources, material culture, and archival materials. While traditional metadata schemata, content stan- dards, and controlled vocabularies such as MARC, AACR2, and LCSH clearly were preeminent in the majority of the respondents’ institutions, current meta- data creation in digital repositories and collections faces new challenges from the enormous volume of online and digital resources.19 Approximately one-third of the respondents’ institutions (33.8 percent) were meeting this challenge with tools for semiautomatic metadata generation. Yet a majority of respondents (52.5 percent) indicated that their institutions did not use any such tools for metadata creation and management. 
This result seems to contrast with Ma’s finding that automatic meta- data generation was used in some capacity in nearly by OCLC RLG programs (as described in “Literature Review” on page 105).16 In these surveys, Unqualified DC was more frequently cited than Qualified DC. One pos- sible explanation of this less frequent use of Unqualified DC may lie in the limitations of Unqualified DC meta- data semantics. Survey respondents also reported on problems using DC metadata, which were mostly caused by semantic ambiguities and semantic overlaps of cer- tain DC metadata elements.17 Limitations and issues of Unqualified DC metadata semantics are discussed in depth in Park’s study.18 In light of these results, examin- ing trends of Qualified DC use in a future study would be interesting. Despite the wide variety of schemata reported in use, there seemed to be an inclination to use only one or two metadata schemata for resource description. As shown in table 5, the majority of the respondents’ institutions (53.6 percent) used only one schema for metadata creation, while approximately 37 percent used two or three sche- mata (26.2 percent and 10.3 percent, respectively). The institutions using more than three schemata during the metadata-creation processes comprised only 9.9 percent of the respondents. Turning to content standards (see figure 3), we found that AACR2 was the most widely used standard, indi- cated by 84.5 percent of respondents. This high percentage clearly reflects the continuing preeminence of MARC as the metadata schema of choice for digital collections. DC application profiles also showed a large user base, indicated by more than one-third of respondents (37.0 percent). More than one quarter of the respondents (28.4 percent) used EAD application guidelines as developed by the Society of American Archivists and the Library of Congress, while 10.6 percent used RLG Best Practice Guidelines for Encoded Archival Description (2002). About one quarter (25.7 percent) indicated DACS as their content standard. Homegrown standards and guidelines are local appli- cation profiles that clarify existing content standards and specify how values for metadata elements are selected and represented to meet the requirements of a particular context. As shown in the results on metadata schemata, it is noteworthy that homegrown content standards and guidelines constituted one of the major choices of partici- pants, indicated by more than one-fifth of the institutions (22.1 percent). Almost two-fifths of the survey partici- pants (38 percent) also reported that they add homegrown metadata elements to a given metadata schema. Slightly less than half of the participants (47.2 percent) indicated otherwise. The local practice of creating homegrown content guidelines and metadata elements during the metadata- creation process deserves a separate study; this study only briefly touches on the basis for locally added custom metadata elements. The motivation to create Table 5. Number of metadata schemata in use Number of Metadata Schemata in Use Number of Participants 1 141 (53.6%) 2 69 (26.2%) 3 27 (10.3%) 4 or more 26 (9.9%) N=263. Survey question: which metadata schema(s) do you and your fellow catalogers/metadata librarians use the most? (please check all that apply) 110 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 criteria for selecting Metadata and controlled Vocabulary schemata What are the factors that have shaped the current state of metadata-creation practices reported thus far? 
In this section, we turn our attention to constraints that affect decision making at institutions in the selection of meta- data and controlled vocabulary schemata for subject description. Figure 4 presents the percentage of different metadata schemata selection criteria described by survey par- ticipants. First, collection-specific considerations clearly played a major role in the selection. The most frequently cited reason was “types of resources” (60.4 percent). This response reflects the fact that a large number of metadata schemata have been developed, often with wide varia- tion in content and format, to better handle particular two-thirds of ARL libraries.20 Because semiautomatic metadata application is reported in-depth in a separate study, we only briefly sketch the topic here.21 The semiautomatic metadata application tools used in the respondents’ digital reposi- tories and collections can be classified into five categories of common characteristics: (1) metadata format conver- sion, (2) templates and editors for metadata creation, (3) automatic metadata creation, (4) library system for bibliographic and authority control, and (5) metadata harvesting and importing tools. As table 7 illustrates, among those institutions that have introduced semiautomatic metadata generation tools, “metadata format conversion” (38.6 percent) and “templates and editors for metadata creation” (27 per- cent) are the two most frequently cited tools. Figure 3. Content standards used (multiple responses) Survey question: what content standard(s) and/or guidelines do you and your fellow catalogers/metadata librarians use? (please check all that apply) MetADAtA creAtioN prActices iN DiGitAl repositories AND collectioNs | pArK AND tosAKA 111 job responsibility, “expertise of staff” (44.2 percent) and “integrated library system” (39.9 percent) appeared to highlight the key role that MARC continues to play in the metadata-creation process for digital collections (see fig- ure 2). “Budget” also appeared to be an important factor in metadata selection (17.2 percent), showing that funding levels played a considerable role in metadata decisions. types of information resources. The primary factor in selecting metadata schemata is their suit- ability for describing the most common type of resources han- dled by the survey participants. The second and third most common criteria, “target users/ audience” (49.8 percent) and “subject matters of resources” (46.9 percent), also seem to reflect how domain-specific metadata schemata are applied. In making decisions on meta- data schemata, respondents weighed materials in particular subject areas (e.g., art, education, and geography) and the needs of particu- lar communities of practice as their primary users and audiences. However, existing technological infrastructure and resource constraints also determine options. Given the prominence of general library cataloging as a primary Table 6. 
The most frequently used controlled vocabulary schema(s) by resource type (multiple responses) LCSH LCC DDC AAT TGM ULAN TGN Other Text 79.5% (241) 35.6% (108) 16.8% (51) 10.2% (31) 6.9% (21) 3.6% (11) 5.0% (15) 14.2% (43) Audiovisual materials 67.3% (204) 25.1% (76) 12.9% (39) 9.2% (28) 8.6% (26) 4.0% (12) 5.0% (15) 14.5% (44) Cartographic materials 44.9% (136) 17.5% (53) 7.3% (22) 5.0% (15) 4.3% (13) 1.3% (4) 4.3% (13) 6.3% (19) Images 43.2% (131) 11.9% (36) 5.6% (17) 25.7% (78) 20.1% (61) 9.9% (30) 10.6% (32) 11.2% (34) Cultural objects (e.g., museum objects) 20.1% (61) 7.3% (22) 4.3% (13) 13.2% (40) 6.3% (19) 4.6% (14) 3.0% (9) 7.9% (24) Archives 44.2% (134) 11.6% (35) 6.3% (19) 11.9% (36) 6.6% (20) 3.0% (9) 2.6% (8) 12.2% (37) Electronic resources 60.7% (184) 23.4% (71) 8.6% (26) 5.3% (16) 3.6% (11) 1.7% (5) 3.0% (9) 14.2% (43) Digitized resources 51.8% (157) 15.5% (47) 5.0% (15) 15.5% (47) 10.2% (31) 6.6% (20) 7.6% (23) 15.2% (46) Born-digital resources 43.9% (133) 13.5% (41) 5.6% (17) 8.3% (25) 7.3% (22) 4.3% (13) 4.6% (14) 13.9% (42) Survey question: which controlled vocabulary schema(s) do you and your fellow catalogers/metadata librarians use most? (Please check all that apply) Table 7. Types of semi-automatic metadata generation tools in use Types Response Rating Metadata format conversion 38 (38.6%) Templates and editors for metadata creation 26 (27.0%) Automatic metadata creation 16 (16.7%) Library system for bibliographic and authority control 15 (15.6%) Metadata harvesting and importing tools 8 (8.3%) N = 96. Survey question: Please describe the (semi)automatic metadata generation tools you use. 112 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 the software used by their insti- tutions—i.e., “integrated library system” (39.9 percent), “digital collection or asset management software” (25.4 percent), “institu- tional repository software” (19.8 percent), “union catalogs” (14.9 percent), and “archival manage- ment software” (5.6 percent)—as a reason for their selection of meta- data schemata. Metadata decisions thus seem to be driven by a vari- ety of local technology choices for developing digital repositories and collections. As shown in figure 5, similar patterns are observed with regard to selection criteria for controlled vocabulary schemata. Three of the four selection criteria receiv- ing majority responses—“target users/audience” (55.4 percent), “type of resources” (54.8 percent), and “nature of the collection” (50.2 percent)—suggest that controlled vocabulary decisions are influ- enced primarily by the substantive purpose and scope of controlled vocabularies for local collections. A major consideration seems to be whether particular controlled vocabularies are suitable for rep- resenting standard data values to improve access and retrieval for target audiences. “Metadata standards,” another selection criteria frequently cited in the survey (54.1 percent), reflects how some domain-spe- cific metadata schemata tend to dictate the use of particular con- trolled vocabularies. At the same time, the results also suggest that resources and technological infra- structure available to institutions were also important reasons for their selections. “Expertise of staff” (38.3 percent) seems to be a straightforward practical rea- son: the application of controlled vocabularies is highly dependent on the width and depth of staff expertise available. 
Likewise, when implementing controlled vocabularies in the digital environment, some institutions also took into account existing system features for authority control and controlled vocabulary searching, as exhibited by 17.2 percent of responses for "digital collection or asset management software." At the same time, it is noteworthy that while responses were not mutually exclusive, many respondents cited

Figure 4. Criteria for selecting metadata schemata (multiple responses). Question: Which criteria were applied in selecting metadata schemata? (Please check all that apply.)

Figure 5. Criteria for selecting controlled vocabulary schemata (multiple responses). Question: Which criteria are applied in selecting controlled vocabulary schemata? (Please check all that apply.)

Exposing Metadata and Metadata Guidelines beyond Local Environments

Metadata interoperability across distributed digital repositories and collections is fast becoming a major issue.22 The proliferation of open-source and commercial digital library platforms using a variety of metadata schemata has implications for librarians' ability to create shareable and interoperable metadata beyond the local environment. To what extent are mechanisms for sharing metadata integrated into the current metadata-creation practices described by the respondents?

Figure 6 summarizes the responses concerning the uses of three major mechanisms for metadata exposure. Approximately half of respondents exposed at least some of their metadata to search engines (52.8 percent) and union catalogs such as OCLC WorldCat (50.6 percent). More than one-third of the respondents exposed all or some of their metadata through OAI harvesters (36.8 percent). About half or more of the respondents either did not expose their metadata or were not sure about the current operations at their institutions (e.g., 47.2 percent for search engines and 63.2 percent for OAI harvesters), a result that may be interpreted as a tendency to create metadata primarily for local audiences.

Figure 6. Mechanism to expose metadata (multiple responses). Survey question: Do you/your organization expose your metadata to OAI (Open Archives Initiative) harvesters, union catalogs, or search engines?

Why do many institutions fail to make their locally created metadata available to other institutions despite wide consensus on the importance of metadata sharing in a networked world? Responses from those institutions exposing none or not all of their metadata (see table 8) reveal that financial, personnel, and technical issues are major hindrances in promoting the exposure of metadata outside the immediate local environment. Some institutions are not confident that their current metadata practices are able to satisfy the technical requirements for producing standards-based interoperable metadata. Another reason frequently mentioned is copyright concerns about limited-access materials. Yet some respondents simply do not see any merit to exposing their item-level metadata, citing its relative uselessness for resource discovery outside their institutions.

Table 8. Sample reasons for not exposing metadata
- Not all our metadata conforms to standards required
- Not all our metadata is OAI compliant
- Lack of expertise and time and money to develop it
- IT restrictions
- Security concerns on the part of our information technology department
- Some collections/records are limited access and not open to the general public
- We think that having WorldCat available for traditional library materials that many libraries have is a better service to people than having each library dump our catalog out on the web
- Varies by tool and collection, but usually a restriction on the material, a technical barrier, or a feeling that for some collections the data is not yet sufficiently robust ("still a work in progress")
Survey question: If you selected "some, but not all" or "no" in question 13 [see figure 6], please tell why you do not expose your metadata.

As stated earlier, the practice of adding homegrown metadata elements seems common among many institutions. While locally created metadata elements accommodate local needs and requirements, they may also hinder metadata interoperability across digital repositories and collections if mechanisms for finding information about such locally defined extensions and variants are absent. Homegrown metadata guidelines document local data models and function as an essential mechanism for metadata creation and quality assurance within and across digital repositories and collections.23 In this regard, it is essential to examine locally created metadata guidelines and best practices.24 However, the results of the survey analysis evince that the vast majority of institutions (72.0 percent) provided no public access to local application profiles on their websites, while only 19.6 percent of respondents' institutions made them available online to the public.

■■ Conclusion

Metadata plays an essential role in managing, organizing, and searching for information resources. In the networked environment, the enormous volume of online and digital resources creates an impending research need to evaluate the issues surrounding the metadata-creation process and the employment of controlled vocabulary schemata across ever-growing distributed digital repositories and collections. In this paper we explored the current status of metadata-creation practices through an examination of survey responses drawn mostly from cataloging and metadata professionals (see tables 2, 3, and 4). The results of the study indicate that current metadata practices still do not create conditions for interoperability.

Despite the proliferation of newer metadata schemata, the survey responses showed that MARC currently remains the most widely used schema for providing resource description and access in digital repositories, collections, and libraries. The continuing predominance of MARC goes hand-in-hand with the use of AACR2 as the primary content standard for selecting and representing data values for descriptive metadata elements. LCSH is used as the de facto controlled vocabulary schema for providing subject access in all types of digital repositories and collections, while domain-specific subject terminologies such as AAT are applied at significantly higher rates in digital repositories handling nonprint resources such as images, cultural objects, and archival materials.

The DC metadata schema is the second most widely employed according to this study, with Qualified DC used by 40.6 percent of responding institutions and Unqualified DC used by 25.4 percent. EAD is another frequently cited schema (31.7 percent), followed by MODS (17.8 percent), VRA (14.9 percent), and TEI (12.5 percent). A trend of Qualified DC being used (40.6 percent) more often than Unqualified DC (25.4 percent) is noteworthy. One possible explanation of this trend may be derived from the fact that semantic ambiguities and overlaps in some of the Unqualified DC elements interfere with use in resource description.25 Given the earlier surveys reporting the higher use of Unqualified DC over Qualified DC, more in-depth examination of their use trends may be an important avenue for future studies.

Despite active research and promising results obtained from some experimental tools, practical applications of semiautomatic metadata generation have been incorporated into the metadata-creation processes by only one-third of survey participants.

The leading criteria in selecting metadata and controlled vocabulary schemata are derived from collection-specific considerations of the type of resources, the nature of the collections, and the needs of primary users and communities. Existing technological infrastructure, encompassing digital collection or asset management software, archival management software, institutional repository software, integrated library systems, and union catalogs, also greatly influences the selection process. The skills and knowledge of metadata professionals and the expertise of staff also are significant factors in understanding current practices in the use of metadata schemata and controlled vocabularies for subject access across distributed digital repositories and collections.

The survey responses reveal that metadata interoperability remains a challenge in the current networked environment despite growing awareness of its importance. For half of the survey respondents, exposing metadata to service providers, such as OAI harvesters, union catalogs, and search engines, does not seem to be a high priority because of local financial, personnel, and technical constraints. Locally created metadata elements are added in many digital repositories and collections in large part to meet local descriptive needs and serve the target user community. While locally created metadata elements accommodate local needs, they may also hinder metadata interoperability across digital repositories and collections when shareable mechanisms are not in place for such locally defined extensions and variants.

Locally created metadata guidelines and application profiles are essential for metadata creation and quality assurance; however, most custom content guidelines and best practices (72 percent) are not made publicly available. The lack of a mechanism to facilitate public access to local application profiles and metadata guidelines may hinder cross-checking for quality metadata and creating shareable metadata that can be harvested for a high level of consistency and interoperability across distributed digital collections and repositories. Development of a searchable registry for publicly available metadata guidelines has the potential to enhance metadata interoperability.

A constraining factor of this study derives from the participant population; thus we have not attempted to generalize the findings of the study. However, results indicate a pressing need for a common data model that is shareable and interoperable across ever-growing distributed digital repositories and collections. Development of such a common data model demands future research into a practical and interoperable mediation mechanism underlying local implementation of metadata elements, semantics, content standards, and controlled vocabularies in a world where metadata can be distributed and shared widely beyond the immediate local environment and user community. (Other issues such as semiautomatic metadata application, DC metadata semantics, custom metadata elements, and the professional development of cataloging and metadata professionals are explained in depth in separate studies.)26 For future studies, incorporation of other research methods (such as follow-up telephone surveys and face-to-face focus group interviews) could be used to better understand the current status of metadata-creation practices. Institutional variation also needs to be taken into account in the design of future studies.

■■ Acknowledgments

This study is supported through an early career development research award from the Institute of Museum and Library Services. We would like to express our appreciation to the reviewers for their invaluable comments.

References

1. Jung-ran Park, "Semantic Interoperability and Metadata Quality: An Analysis of Metadata Item Records of Digital Image Collections," Knowledge Organization 33 (2006): 20–34; Rachel Heery, "Metadata Futures: Steps toward Semantic Interoperability," in Metadata in Practice, ed. Diane I. Hillman and Elaine L. Westbrooks, 257–71 (Chicago: ALA, 2004); Jung-ran Park, "Semantic Interoperability across Digital Image Collections: A Pilot Study on Metadata Mapping" (paper presented at the Canadian Association for Information Science 2005 Annual Conference, London, Ontario, June 2–4, 2005), http://www.cais-acsi.ca/proceedings/2005/park_J_2005.pdf (accessed Mar. 24, 2009).

2. Jane Barton, Sarah Currier, and Jessie M. N. Hey, "Building Quality Assurance into Metadata Creation: An Analysis Based on the Learning Objects and E-Prints Communities of Practice" (paper presented at 2003 Dublin Core Conference: Supporting Communities of Discourse and Practice—Metadata Research & Applications, Seattle, Wash., Sept. 28–Oct. 2, 2003), http://dcpapers.dublincore.org/ojs/pubs/article/view/732/728 (accessed Mar. 24, 2009); Sarah Currier et al., "Quality Assurance for Digital Learning Object Repositories: Issues for the Metadata-Creation Process," ALT-J 12 (2004): 5–20.

3. Jin Ma, Metadata, SPEC Kit 298 (Washington, D.C.: Association of Research Libraries, 2007): 13, 28.

4. Ibid., 12, 21–22.

5. Karen Smith-Yoshimura, RLG Programs Descriptive Metadata Practices Survey Results (Dublin, Ohio: OCLC, 2007): 6–7, http://www.oclc.org/programs/publications/reports/2007-03.pdf (accessed Mar. 24, 2009); Karen Smith-Yoshimura and Diane Cellentani, RLG Programs Descriptive Metadata Practices Survey Results: Data Supplement (Dublin, Ohio: OCLC, 2007): 16, http://www.oclc.org/programs/publications/reports/2007-04.pdf (accessed Mar. 24, 2009).

6. Carole Palmer, Oksana Zavalina, and Megan Mustafoff, "Trends in Metadata Practices: A Longitudinal Study of Collection Federation" (paper presented at the Seventh ACM/IEEE-CS Joint Conference on Digital Libraries, Vancouver, British Columbia, Canada, June 18–23, 2007), http://hdl.handle.net/2142/8984 (accessed Mar. 24, 2009).

7. Smith-Yoshimura, RLG Programs Descriptive Metadata Practices Survey Results, 7; Smith-Yoshimura and Cellentani, RLG Programs Descriptive Metadata Practices Survey Results, 17.

8. Ma, Metadata, 12, 22–23.

9. Smith-Yoshimura, RLG Programs Descriptive Metadata Practices Survey Results, 7; Smith-Yoshimura and Cellentani, RLG Programs Descriptive Metadata Practices Survey Results, 18–21.

10. Karen Markey et al., Census of Institutional Repositories in the United States: MIRACLE Project Research Findings (Washington, D.C.: Council on Library & Information Resources, 2007): 3, 46–50, http://www.clir.org/pubs/reports/pub140/pub140.pdf (accessed Mar. 24, 2009).

11. Smith-Yoshimura and Cellentani, RLG Programs Descriptive Metadata Practices Survey Results, 24.

12. University of Houston Libraries Institutional Repository Task Force, Institutional Repositories, SPEC Kit 292 (Washington, D.C.: Association of Research Libraries, 2006): 18, 78.

13. Ma, Metadata, 13, 28.

14. Smith-Yoshimura, RLG Programs Descriptive Metadata Practices Survey Results, 9, 11; Smith-Yoshimura and Cellentani, RLG Programs Descriptive Metadata Practices Survey Results, 27–29.

15. For the metrics of job responsibilities used to analyze job descriptions and competencies of cataloging and metadata professionals, see Jung-ran Park, Caimei Lu, and Linda Marion, "Cataloging Professionals in the Digital Environment: A Content Analysis of Job Descriptions," Journal of the American Society for Information Science & Technology 60 (2009): 844–57; Jung-ran Park and Caimei Lu, "Metadata Professionals: Roles and Competencies as Reflected in Job Announcements, 2003–2006," Cataloging & Classification Quarterly 47 (2009): 145–60.

16. Ma, Metadata; Smith-Yoshimura, RLG Programs Descriptive Metadata Practices Survey Results.

17. Jung-ran Park and Eric Childress, "Dublin Core Metadata Semantics: An Analysis of the Perspectives of Information Professionals," Journal of Information Science 35, no. 6 (2009): 727–39.

18. Park, "Semantic Interoperability."

19. Jung-ran Park, "Metadata Quality in Digital Repositories: A Survey of the Current State of the Art," in "Metadata and Open Access Repositories," ed. Michael S. Babinec and Holly Mercer, special issue, Cataloging & Classification Quarterly 47, no. 3/4 (2009): 213–38.

20. Ma, Metadata, 12, 24. The OCLC RLG survey found that about 40 percent of the respondents were able to generate some metadata automatically. See Smith-Yoshimura, RLG Programs Descriptive Metadata Practices Survey Results, 6; Smith-Yoshimura and Cellentani, RLG Programs Descriptive Metadata Practices Survey Results, 35.

21. Jung-ran Park and Caimei Lu, "Application of Semi-Automatic Metadata Generation in Libraries: Types, Tools, and Techniques," Library & Information Science Research 31, no. 4 (2009): 225–31.

22. Park, "Semantic Interoperability"; Sarah L. Shreeves et al., "Is 'Quality' Metadata 'Shareable' Metadata? The Implications of Local Metadata Practices for Federated Collections" (paper presented at the 12th National Conference of the Association of College and Research Libraries, Apr. 7–10, 2005, Minneapolis, Minnesota), https://www.ideals.uiuc.edu/handle/2142/145 (accessed Mar. 24, 2009); Amy S. Jackson et al., "Dublin Core Metadata Harvested through OAI-PMH," Journal of Library Metadata 8, no. 1 (2008): 5–21; Lois Mai Chan and Marcia Lei Zeng, "Metadata Interoperability and Standardization—A Study of Methodology Part I: Achieving Interoperability at the Schema Level," D-Lib Magazine 12, no.
6 (2006), http://www.dlib.org/ dlib/june06/chan/06chan.html (accessed Mar. 24, 2009); Marcia Lei Zeng and Lois Mai Chan, “Metadata Interoperability and Standardization—A Study of Methodology Part II: Achieving Interoperability at the Record and Repository Levels,” D-Lib Magazine 12, no. 6 (2006), http://www.dlib.org/dlib/june06/ zeng/06zeng.html (accessed Mar. 24, 2009). 23. Thomas R. Bruce and Diane I. Hillmann, “The Con- tinuum of Metadata Quality: Defining, Expressing, Exploiting,” in Metadata in Practice, ed. Hillman and Westbrooks, 238–56; Heery, “Metadata Futures”; Park, “Metadata Quality in Digital Repositories.” 24. Jung-ran Park, ed., “Metadata Best Practices: Current Issues and Future Trends,” special issue, Journal of Library Meta- data 9, no. 3/4 (2009). 25. See Park, “Semantic Interoperability”; Park and Childress, “Dublin Core Metadata Semantics.” 26. Park and Childress, “Dublin Core Metadata Semantics”; Park and Lu, “Application of Semi-Automatic Metadata Genera- tion in Libraries.” 3137 ---- Batch Loading coLLections into dspace | WaLsh 117 Maureen P. Walsh Batch Loading Collections into DSpace: Using Perl Scripts for Automation and Quality Control colleagues briefly described batch loading MARC meta- data crosswalked to DSpace Dublin Core (DC) in a poster session.2 Mishra and others developed a Perl script to create the DSpace archive directory for batch import of electronic theses and dissertations (ETDs) extracted with a Java program from an in-house bibliographic database.3 Mundle used Perl scripts to batch process ETDs for import into DSpace with MARC catalog records or Excel spreadsheets as the source metadata.4 Brownlee used Python scripts to batch process comma-separated values (CSV) files exported from Filemaker database software for ingest via the DSpace item importer.5 More in-depth descriptions of batch loading are pro- vided by Thomas; Kim, Dong, and Durden; Proudfoot et al.; Witt and Newton; Drysdale; Ribaric; Floyd; and Averkamp and Lee. However, irrespective of reposi- tory software, each describes a process to populate their repositories dissimilar to the workflows developed for the Knowledge Bank in approach or source data. Thomas describes the Perl scripts used to convert MARC catalog records into DC and to create the archive directory for DSpace batch import.6 Kim, Dong, and Durden used Perl scripts to semiauto- mate the preparation of files for batch loading a University of Texas Harry Ransom Humanities Research Center (HRC) collection into DSpace. The XML source metadata they used was generated by the National Library of New Zealand Metadata Extraction Tool.7 Two subsequent proj- ects for the HRC revisited the workflow described by Kim, Dong, and Durden.8 Proudfoot and her colleagues discuss importing meta- data-only records from departmental RefBase, Thomson Reuters EndNote, and Microsoft Access databases into ePrints. They also describe an experimental Perl script written to scrape lists of publications from personal web- sites to populate ePrints.9 Two additional workflow examples used citation databases as the data source for batch loading into repositories. 
Witt and Newton provide a tutorial on trans- forming EndNote metadata for Digital Commons with XSLT (Extensible Stylesheet Language Transformations).10 Drysdale describes the Perl scripts used to convert Thomson Reuters Reference Manager files into XML for the batch loading of metadata-only records into the University of Glascow’s ePrints repository.11 The Glascow ePrints batch workflow is additionally described by Robertson and Nixon and Greig.12 Several workflows were designed for batch loading ETDs into repositories. Ribaric describes the automatic This paper describes batch loading workflows developed for the Knowledge Bank, The Ohio State University’s institutional repository. In the five years since the incep- tion of the repository approximately 80 percent of the items added to the Knowledge Bank, a DSpace repository, have been batch loaded. Most of the batch loads utilized Perl scripts to automate the process of importing meta- data and content files. Custom Perl scripts were used to migrate data from spreadsheets or comma-separated values files into the DSpace archive directory format, to build collections and tables of contents, and to provide data quality control. Two projects are described to illus- trate the process and workflows. T he mission of the Knowledge Bank, The Ohio State University’s (OSU) institutional repository, is to col- lect, preserve, and distribute the digital intellectual output of OSU’s faculty, staff, and students.1 The staff working with the Knowledge Bank have sought from its inception to be as efficient as possible in adding content to DSpace. Using batch loading workflows to populate the repository has been integral to that efficiency. The first batch load into the Knowledge Bank was August 29, 2005. Over the next four years, 698 collections con- taining 32,188 items were batch loaded, representing 79 percent of the items and 58 percent of the collections in the Knowledge Bank. These batch loaded collections vary from journal issues to photo albums. The items include articles, images, abstracts, and transcripts. The majority of the batch loads, including the first, used custom Perl scripts to migrate data from Microsoft Excel spreadsheets into the DSpace batch import format for descriptive meta- data and content files. Perl scripts have been used for data cleanup and quality control as part of the batch load pro- cess. Perl scripts, in combination with shell scripts, have also been used to build collections and tables of contents in the Knowledge Bank. The workflows using Perl scripts to automate batch import into DSpace have evolved through an iterative process of continual refinement and improvement. Two Knowledge Bank projects are pre- sented as case studies to illustrate a successful approach that may be applicable to other institutional repositories. ■■ Literature Review Batch ingesting is acknowledged in the literature as a means of populating institutional repositories. There are examples of specific batch loading processes mini- mally discussed in the literature. Branschofsky and her Maureen p. Walsh (walsh.260@osu.edu) is Metadata Librarian/ Assistant Professor, The Ohio State University Libraries, Colum- bus, Ohio. 118 inFoRMation technoLogY and LiBRaRies | septeMBeR 2010 relational database PostgreSQL 8.1.11 on the Red Hat Enterprise Linux 5 operating system. The structure of the Knowledge Bank follows the hierarchical arrangement of DSpace. Communities are at the highest level and can be divided into subcommunities. 
Each community or subcommunity contains one or more collections. All items—the basic archival elements in DSpace—are con- tained within collections. Items consist of metadata and bundles of bitstreams (files). DSpace supports two user interfaces: the original interface based on JavaServer Pages (JSPUI) and the newer Manakin (XMLUI) interface based on the Apache Cocoon framework. At this writing, the Knowledge Bank continues to use the JSPUI interface. The default metadata used by DSpace is a Qualified DC schema derived from the DC library application profile.18 The Knowledge Bank uses a locally defined extended version of the default DSpace Qualified DC schema, which includes several additional element quali- fiers. The metadata management for the Knowledge Bank is guided by a Knowledge Bank application profile and a core element set for each collection within the reposi- tory derived from the application profile.19 The metadata librarians at OSUL create the collection core element sets in consultation with the community representatives. The core element sets serve as metadata guidelines for sub- mitting items to the Knowledge Bank regardless of the method of ingest. The primary means of adding items to collections in DSpace, and the two ways used for Knowledge Bank ingest, are (1) direct (or intermediated) author entry via the DSpace Web item submission user inter- face and (2) in batch via the DSpace item importer. Recent enhancements to DSpace, not yet fully explored for use with the Knowledge Bank, include new ingest options using Simple Web-service Offering Repository Deposit (SWORD), Open Archives Initiative Object Reuse and Exchange (OAI-ORE), and DSpace package import- ers such as the Metadata Encoding and Transmission Standard Submission Information Package (METS SIP) preparation of ETDs from the Internet Archive (http:// www.archive.org/) for ingest into DSpace using PHP utilities.13 Floyd describes the processor developed to automate the ingest of ProQuest ETDs via the DSpace item importer.14 Also using ProQuest ETDs as the source data, Averkamp and Lee described using XSLT to transform the ProQuest data to Bepress’ (The Berkeley Electronic Press) schema for batch loading into a Digital Commons repository.15 The Knowledge Bank workflows described in this paper use Perl scripts to generate DC XML and create the archive directory for batch loading metadata records and content files into DSpace using Excel spreadsheets or CSV files as the source metadata. ■■ Background The Knowledge Bank, a joint initiative of the OSU Libraries (OSUL) and the OSU Office of the Chief Information Officer, was first registered in the Registry of Open Access Repositories (ROAR) on September 28, 2004.16 As of December 2009 the repository held 40,686 items in 1,192 collections. The Knowledge Bank uses DSpace, the open-source Java-based repository software jointly developed by the Massachusetts Institute of Technology Libraries and Hewlett-Packard.17 As a DSpace reposi- tory, the Knowledge Bank is organized by communities. The fifty-two communities currently in the Knowledge Bank include administrative units, colleges, departments, journals, library special collections, research centers, symposiums, and undergraduate honors theses. The com- monality of the varied Knowledge Bank communities is their affiliation with OSU and their production of knowl- edge in a digital format that they wish to store, preserve, and distribute. 
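As the literature review above notes, the Knowledge Bank workflows use Perl scripts to generate DC XML and create the archive directory from Excel or CSV source metadata. The sketch below illustrates the general shape of such a conversion; it is a minimal illustration only, not the Knowledge Bank's production code, and the script name, CSV column names (title, creator, date, file), item-directory naming, and Dublin Core element qualifiers are assumptions made for the example.

-- csv2archive.pl (illustrative sketch) --

#!/usr/bin/perl
# Sketch: convert a CSV of item metadata into the DSpace simple archive
# format (one subdirectory per item containing dublin_core.xml, a
# contents file, and the bitstream to be loaded).
use strict;
use warnings;
use Text::CSV;
use File::Copy qw(copy);
use File::Path qw(make_path);

my ($csv_file, $file_dir, $archive_dir) = ("items.csv", "files", "archive");
my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });
open my $in, "<:encoding(UTF-8)", $csv_file or die "Cannot open $csv_file: $!";
my $header = $csv->getline($in);          # first row holds the column names
$csv->column_names(@$header);             # e.g., title,creator,date,file

my $n = 0;
while (my $row = $csv->getline_hr($in)) {
    my $item = sprintf("%s/item_%03d", $archive_dir, $n++);
    make_path($item);

    # Escape characters that are illegal in XML element content.
    my %dc = map { $_ => escape($row->{$_}) } qw(title creator date);

    open my $xml, ">:encoding(UTF-8)", "$item/dublin_core.xml" or die $!;
    print $xml qq{<?xml version="1.0" encoding="UTF-8"?>\n<dublin_core>\n};
    print $xml qq{   <dcvalue element="title" qualifier="none">$dc{title}</dcvalue>\n};
    print $xml qq{   <dcvalue element="creator" qualifier="none">$dc{creator}</dcvalue>\n};
    print $xml qq{   <dcvalue element="date" qualifier="issued">$dc{date}</dcvalue>\n};
    print $xml qq{</dublin_core>\n};
    close $xml;

    # Copy the bitstream into the item directory and list it in 'contents'.
    copy("$file_dir/$row->{file}", "$item/$row->{file}") or die "copy failed: $!";
    open my $contents, ">", "$item/contents" or die $!;
    print $contents "$row->{file}\n";
    close $contents;
}
close $in;

sub escape {
    my $s = shift // "";
    $s =~ s/&/&amp;/g;  $s =~ s/</&lt;/g;  $s =~ s/>/&gt;/g;
    return $s;
}

Run against a small CSV, a sketch of this kind produces one item directory per row, ready to be validated with the item importer's test mode described below.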
The staff working with the Knowledge Bank includes a team of people from three OSUL areas—Technical Services, Information Technology, and Preservation—and the contracted hours of one systems developer from the OSU Office of Information Technology (OIT). The OSUL team members are not individually assigned full-time to the repository. The current OSUL team includes a librarian reposi- tory manager, two metadata librarians, one systems librarian, one systems developer, two technical services staff members, one preservation staff mem- ber, and one graduate assistant. The Knowledge Bank is cur- rently running DSpace 1.5.2 and the Figure 1. DSpace simple archive format archive_directory/ item_000/ dublin_core.xml--qualified Dublin Core metadata contents --text file containing one line per filename file_l.pdf --files to be added as bitstreams to the item file_2.pdf item_001/ dublin_core.xml file_1.pdf ... Batch Loading coLLections into dspace | WaLsh 119 ■■ Case Studies the issues of the Ohio Journal of Science OJS was jointly published by OSU and the Ohio Academy of Science (OAS) until 1974, when OAS took over sole control of the journal. The issues of OJS are archived in the Knowledge Bank with a two year rolling wall embargo. The issues for 1900 through 2003, a total of 639 issues containing 6,429 articles, were batch loaded into the Knowledge Bank. Due to rights issues, the retrospec- tive batch loading project had two phases. The project to digitize OJS began with the 1900–1972 issues that OSU had the rights to digitize and make publicly available. OSU later acquired the rights for 1973–present, and (accounting for the embargo period) 1973–2003 became phase 2 of the project. The two phases of batch loads were the most complicated automated batch loading processes developed to date for the Knowledge Bank. To batch load phase 1 in 2005 and phase 2 in 2006, the systems devel- opers working with the Knowledge Bank wrote scripts to build collections, generate DC XML from the source metadata, create the archive directory, load the metadata and content files, create tables of contents, and load the tables of contents into DSpace. The OJS community in the Knowledge Bank is orga- nized by collections representing each issue of the journal. The systems developers used scripts to automate the building of the collections in DSpace because of the number needed as part of the retrospective project. The individual articles within the issues are items within the collections. There is a table of contents for the articles in each issue as part of the collection homepages.21 Again, due to the number required for the retrospective project, the systems developers used scripts to automate the cre- ation and loading of the tables of contents. The tables of contents are contained in the HTML introductory text sec- tion of the collection pages. The tables of contents list title, authors, and pages. They also include a link to the item record and a direct link to the article PDF that includes the file size. For each phase of the OJS project, a vendor con- tracted by OSUL supplied the article PDFs and an Excel spreadsheet with the article-level metadata. The metadata format. This paper describes ingest via the DSpace batch item importer. The DSpace item importer is a command-line tool for batch ingesting items. The importer uses a simple archive format diagramed in figure 1. 
The archive is a directory of items that contain a subdirectory of item metadata, item files, and a contents file listing the bitstream file names. Each item's descriptive metadata is contained in a DC XML file. The format used by DSpace for the DC XML files is illustrated in figure 2. Automating the process of creating the Unix archive directory has been the main function of the Perl scripts written for the Knowledge Bank batch loading workflows. A systems developer uses the test mode of the DSpace item importer tool to validate the item directories before doing a batch load. Any significant errors are corrected and the process is repeated. After a successful test, the batch is loaded into the staging instance of the Knowledge Bank and quality checked by a metadata librarian to identify any unexpected results and script or data problems that need to be corrected. After a successful load into the staging instance the batch is loaded into the production instance of the Knowledge Bank.

Most of the Knowledge Bank batch loading workflows use Excel spreadsheets or CSV files as the source for the descriptive item metadata. The creation of the metadata contained in the spreadsheets or files has varied by project. In some cases the metadata is created by OSUL staff. In other cases the metadata is supplied by Knowledge Bank communities in consultation with a metadata librarian or by a vendor contracted by OSUL. Whether the source metadata is created in-house or externally supplied, OSUL staff are involved in the quality control of the metadata.

Several of the first communities to join the Knowledge Bank had very large retrospective collection sets to archive. The collection sets of two of those early adopters, the journal issues of the Ohio Journal of Science (OJS) and the abstracts of the OSU International Symposium on Molecular Spectroscopy, currently account for 59 percent of the items in the Knowledge Bank.20 The successful batch loading workflows developed for these two communities—which continue to be active content suppliers to the repository—are presented as case studies.

Figure 2. DSpace Qualified Dublin Core XML
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
   <dcvalue element="title" qualifier="none">Notes on the Bird Life of Cedar Point</dcvalue>
   <dcvalue element="date" qualifier="issued">1901-04</dcvalue>
   <dcvalue element="creator" qualifier="none">Griggs, Robert F.</dcvalue>
</dublin_core>

article-level metadata to Knowledge Bank DC, as illustrated in table 1. The systems developers used the mapping as a guide to write Perl scripts to transform the vendor metadata into the DSpace schema of DC. The workflow for the two phases was nearly identical, except each phase had its own batch loading scripts. Due to a staff change between the two phases of the project, a former OSUL systems developer was responsible for batch loading phase 1 and the OIT systems developer was responsible for phase 2. The phase 1 scripts were all written in Perl. The four scripts written for phase 1 created the archive directory, performed database operations to build the collections, generated the HTML introduction table of contents for each collection, and loaded the tables of contents into DSpace via the database. For phase 2, the OIT systems developer modified and added to the phase 1 batch processing scripts. This case study focuses on phase 2 of the project.

Batch Processing for Phase 2 of OJS

The annotated scripts the OIT systems developer used for phase 2 of the OJS project are included in appendix A, available on the ITALica weblog (http://ital-ica.blogspot.com/). A shell script (mkcol.sh) added collections based on a listing of the journal issues.
The script performed a login as a selected user ID to the DSpace Web interface using the Web access tool Curl. A subsequent simple looping Perl script (mkallcol.pl) used the stored credentials to submit data via this channel to build the collections in the Knowledge Bank. The metadata.pl script created the archive directory for each collection. The OIT systems developer added the PDF file for each item to Unix. The vendor-supplied meta- data was saved as Unicode text format and transferred to Unix for further processing. The developer used vi com- mands to manually modify metadata for characters illegal in XML (e.g., “<” and “&”). (Although manual steps were used for this project, the OIT systems developer improved the Perl scripts for subsequent projects by add- ing code for automated transformation of the input data to help ensure XML validity.) The metadata.pl script then processed each line of the metadata along with the cor- responding data file. For each item, the script created the DC XML file and the contents file and moved them and the PDF file to the proper directory. Load sets for each col- lection (issue) were placed in their own subdirectory, and a load was done for each subdirectory. The items for each collection were loaded by a small Perl script (loaditems. pl) that used the list of issues and their collection IDs and called a shell script (import.sh) for the actual load. The tables of contents for the issues were added to the Knowledge Bank after the items were loaded. A Perl script (intro.pl) created the tables of contents using the meta- data and the DSpace map file, a stored mapping of item received from the vendor had not been customized for the Knowledge Bank. The OJS issues were sent to a vendor for digitization and metadata creation before the Knowledge Bank was chosen as the hosting site of the digitized jour- nal. The OSU Digital Initiatives Steering Committee 2002 proposal for the OJS digitization project had predated the Knowledge Bank DSpace instance. OSUL staff performed quality-control checks of the vendor-supplied metadata and standardized the author names. The vendor supplied the author names as they appeared in the articles—in direct order, comma separated, and including any “and” that appeared. In addition to other quality checks per- formed, OSUL staff edited the author names in the spreadsheet to conform to DSpace author-entry conven- tion (surname first). Semicolons were added to separate author names, and the extraneous ands were removed. A former metadata librarian mapped the vendor-supplied Table 1. Mapping of vendor metadata to Qualified Dublin Core Vendor-Supplied Metadata Knowledge Bank Dublin Core File [n/a: PDF file name] Cover Title dc.identifier.citation* ISSN dc.identifier.issn Vol. dc.identifier.citation* Iss. dc.identifier.citation* Cover Date dc.identifier.citation* Year dc.date.issued Month dc.date.issued Fpage dc.identifier.citation* Lpage dc.identifier.citation* Article Title dc.title Author Names dc.creator Institution dc.description Abstract dc.description.abstract n/a dc.language.iso n/a dc.rights n/a dc.type *format: [Cover Title]. v[Vol.], n[Iss.] ([Cover Date]), [Fpage]-[Lpage] Batch Loading coLLections into dspace | WaLsh 121 directories to item handles created during the load. The tables of contents were added to the Knowledge Bank using a shell script (installintro.sh) similar to what was used to create the collections. 
Installintro.sh used Curl to simulate a user adding the data to DSpace by performing a login as a selected user ID to the DSpace Web interface. A simple looping Perl script (ldallintro.pl) called installintro.sh and used the stored credentials to submit the data for the tables of contents. the abstracts of the osU international symposium on Molecular spectroscopy The Knowledge Bank contains the abstracts of the papers presented at the OSU International Symposium on Molecular Spectroscopy (MSS), which has met annually since 1946. Beginning with the 2005 Symposium, the complete presentations from authors who have autho- rized their inclusion are archived along with the abstracts. The MSS community in the Knowledge Bank currently contains 17,714 items grouped by decade into six col- lections. The six collections were created “manually” via the DSpace Web interface prior to the batch loading of the items. The retrospective years of the Symposium (1946–2004) were batch loaded in three phases in 2006. Each Symposium year following the retrospective loads was batch loaded individually. Retrospective Mss Batch Loads The majority of the abstracts for the retrospective loads were digitized by OSUL. A vendor was contracted by OSUL to digitize the remainder and to supply the meta- data for the retrospective batch loads. The files digitized by OSUL were sent to the vendor for metadata capture. OSUL provided the vendor a metadata template derived from the MSS core element set. The metadata taken from the abstracts comprised author, affiliation, title, year, session number, sponsorship (if applicable), and a full transcription of the abstract. To facilitate searching, the formulas and special characters appearing in the titles and abstracts were encoded using LaTeX, a document prepara- tion system used for scientific data. The vendor delivered the metadata in Excel spreadsheets as per the spreadsheet template provided by OSUL. Quality-checking the meta- data was an essential step in the workflow for OSUL. The metadata received for the project required revisions and data cleanup. The vendor originally supplied incomplete files and spreadsheets that contained data errors, includ- ing incorrect numbering, data in the wrong fields, and inconsistency with the LaTeX encoding. The three Knowledge Bank batch load phases for the retrospective MSS project corresponded to the staged receipt of metadata and digitized files from the vendor. The annotated scripts used for phase 2 of the project, which included twenty years of the OSU International Symposium between 1951 and 1999, are included in appendix B, available on the ITALica weblog. The OIT systems developer saved the metadata as a tab-separated file and added it to Unix along with the abstract files. A Perl script (mkxml2.pl) transformed the metadata into DC XML and created the archive directories for load- ing the metadata and abstract files into the Knowledge Bank. The script divided the directories into separate load sets for each of the six collections and accounted for the inconsistent naming of the abstract files. The script added the constant data for type and language that was not included in the vendor-supplied metadata. Unlike the OJS project, where multiple authors were on the same line of the metadata file, the MSS phase 2 script had to code for authors and their affiliations on separate lines. Once the load sets were made, the OIT systems devel- oper ran a shell script to load them. 
The script (import_ collections.sh) was used to run the load for each set so that the DSpace item import command did not need to be constructed each time. annual Mss Batch Loads A new workflow was developed for batch loading the annual MSS collection additions. The metadata and item files for the annual collection additions are supplied by the MSS community. The community provides the Symposium metadata in a CSV file and the item files in a Tar archive file. The Symposium uses a Web form for LaTeX–formatted abstract submissions. The community processes the electronic Symposium submissions with a Perl script to create the CSV file. The metadata delivered in the CSV file is based on the template created by the author, which details the metadata requirements for the project. The OIT systems developer borrowed from and modi- fied earlier Perl scripts to create a new script for batch processing the metadata and files for the annual Symposium collection additions. To assist with the development of the new script, I provided the developer a mapping of the community CSV headings to the Knowledge Bank DC fields. I also provided a sample DC XML file to illustrate the desired result of the Perl transformation of the com- munity metadata into DC XML. For each new year of the Symposium, I create a sample DC XML result for an item to check the accuracy of the script. A DC XML example from a 2009 MSS item is included in appendix C, available on the ITALica weblog. Unlike the previous retrospective MSS loads in which the script processed multiple years of the Symposium, the new script processes one year at a time. The annual Symposiums are batch loaded indi- vidually into one existing MSS decade collection. The new script for the annual loads was tested and refined by load- ing the 2005 Symposium into the staging instance of the 122 inFoRMation technoLogY and LiBRaRies | septeMBeR 2010 ■■ Summary and Conclusion Each of the batch loads that used Perl scripts had its own unique features. The format of content and associ- ated metadata varied considerably, and custom scripts to convert the content and metadata into the DSpace import format were created on a case-by-case basis. The differ- ences between batch loads included the delivery format of the metadata, the fields of metadata supplied, how metadata values were delimited, the character set used for the metadata, the data used to uniquely identify the files to be loaded, and how repeating metadata fields were identi- fied. Because of the differences in supplied metadata, a separate Perl script for generating the DC XML and archive directory for batch loading was written for each project. Each new Perl script borrowed from and modified earlier scripts. Many of the early batch loads were firsts for the Knowledge Bank and the staff working with the reposi- tory, both in terms of content and in terms of metadata. Dealing with community- and vendor-supplied metadata and various encodings (including LaTeX), each of the early loads encountered different data obstacles, and in each case solutions were written in Perl. The batch loading code has matured over time, and the progression of improvements is evident in the example scripts included in the appendixes. Batch loading can greatly reduce the time it takes to add content and metadata to a repository, but successful Knowledge Bank. Problems encountered with character encoding and file types were resolved by modifying the script. 
The metadata and files for the Symposium years 2005, 2006, and 2007 were made available to OSUL in 2007, and each year was individually loaded into the existing Knowledge Bank col- lection for that decade. These first three years of community-supplied CSV files contained author metadata inconsistent with Knowledge Bank author entries. The names were in direct order, upper- case, split by either a semicolon or “and,” and included extraneous data, such as an address. The OIT systems developer wrote a Perl script to correct the author metadata as part of the batch loading workflow. An annotated section of that script illustrating the author modifica- tions is included in appendix D, available on the ITALica weblog. The MSS com- munity revised the Perl script they used to generate the CSV files by including an edited version of this author entry cor- rection script and were able to provide the expected author data for 2008 and 2009. The author entries received for these years were in inverted order (surname first) and mixed case, were semicolon separated, and included no extraneous data. The receipt of consistent data from the community for the last two years has facilitated the stan- dardized workflow for the annual MSS loads. The scripts used to batch load the 2009 Symposium year are included in appendix E, which appears at the end of this text. The OIT systems developer unpacked the Tar file of abstracts and presentations into a directory named for the year of the Symposium on Unix. The Perl script written for the annual MSS loads (mkxml. pl) was saved on Unix and renamed mkxml2009.pl. The script was edited for 2009 (including the name of the CSV file and the location of the directories for the unpacked files and generated XML). The CSV headings used by the community in the new file were checked and verified against the extract list in the script. Once the Perl script was up-to-date and the base directory was created, the OIT systems developer ran the Perl script to gener- ate the archive directory set for import. The import.sh script was then edited for 2009 and run to import the new Symposium year into the staging instance of the Knowledge Bank as a quality check prior to loading into the live repository. The brief item view of an example MSS 2009 item archived in the Knowledge Bank is shown in figure 3. Figure 3. MSS 2009 archived item example Batch Loading coLLections into dspace | WaLsh 123 Proceedings of the 2003 International Conference on Dublin Core and Metadata Applications: Supporting Com- munities of Discourse and Practice—Metadata Research & Applications, Seattle, Washington, 2003, http://dcpapers .dublincore.org/ojs/pubs/article/view/753/749 (accessed Dec. 21, 2009). 3. R. Mishra et al., “Development of ETD Repository at IITK Library using DSpace,” in International Conference on Semantic Web and Digital Libraries (ICSD-2007), ed. A. R. D. Prasad and Devika P. Madalli (2007), 249–59. http://hdl.handle .net/1849/321 (accessed Dec. 21, 2009). 4. Todd M. Mundle, “Digital Retrospective Conversion of Theses and Dissertations: An In House Project” (paper presented to the 8th International Symposium on Electronic Theses & Dis- sertations, Sydney, Australia, Sept. 28–30, 2005), http://adt.caul .edu.au/etd2005/papers/080Mundle.pdf (accessed Dec. 21, 2009). 5. Rowan Brownlee, “Research Data and Repository Meta- data: Policy and Technical Issues at the University of Sydney Library,” Cataloging & Classification Quarterly 47, no. 3/4 (2009): 370–79. 6. 
Steve Thomas, “Importing MARC Data into DSpace,” 2006, http://hdl.handle.net/2440/14784 (accessed Dec. 21, 2009). 7. Sarah Kim, Lorraine A. Dong, and Megan Durden, “Auto- mated Batch Archival Processing: Preserving Arnold Wesker’s Digital Manuscripts,” Archival Issues 30, no. 2 (2006): 91–106. 8. Elspeth Healey, Samantha Mueller, and Sarah Ticer, “The Paul N. Banks Papers: Archiving the Electronic Records of a Digitally-Adventurous Conservator,” 2009, https://pacer .ischool.utexas.edu/bitstream/2081/20150/1/Paul_Banks_ Final_Report.pdf (accessed Dec. 21, 2009); Lisa Schmidt, “Pres- ervation of a Born Digital Literary Genre: Archiving Legacy Macintosh Hypertext Files in DSpace,” 2007, https://pacer .ischool.utexas.edu/bitstream/2081/9007/1/MJ%20WBO%20 Capstone%20Report.pdf (accessed Dec. 21, 2009). 9. Rachel E. Proudfoot et al., “JISC Final Report: IncReASe (Increasing Repository Content through Automation and Ser- vices),” 2009, http://eprints.whiterose.ac.uk/9160/ (accessed Dec. 21, 2009). 10. Michael Witt and Mark P. Newton, “Preparing Batch Deposits for Digital Commons Repositories,” 2008, http://docs .lib.purdue.edu/lib_research/96/ (accessed Dec. 21, 2009). 11. Lesley Drysdale, “Importing Records from Reference Man- ager into GNU EPrints,” 2004, http://hdl.handle.net/1905/175 (accessed Dec. 21, 2009). 12. R. John Robertson, “Evaluation of Metadata Workflows for the Glasgow ePrints and DSpace Services,” 2006, http://hdl .handle.net/1905/615 (accessed Dec. 21, 2009); William J. Nixon and Morag Greig, “Populating the Glasgow ePrints Service: A Mediated Model and Workflow,” 2005, http://hdl.handle .net/1905/387 (accessed Dec. 21, 2009). 13. Tim Ribaric, “Automatic Preparation of ETD Material from the Internet Archive for the DSpace Repository Platform,” Code4Lib Journal no. 8 (Nov. 23, 2009), http://journal.code4lib.org/ articles/2152 (accessed Dec. 21, 2009). 14. Randall Floyd, “Automated Electronic Thesis and Disser- tations Ingest,” (Mar. 30, 2009), http://wiki.dlib.indiana.edu/ confluence/x/01Y (accessed Dec. 21, 2009). 15. Shawn Averkamp and Joanna Lee, “Repurposing Pro- batch loading workflows are dependent upon the quality of data and metadata loaded. Along with testing scripts and checking imported metadata by first batch loading to a development or staging environment, quality control of the supplied metadata is an integral step. The flexibility of Perl allowed testing and revising to accommodate prob- lems encountered with how the metadata was supplied for the heterogeneous collections batch loaded into the Knowledge Bank. However, toward the goal of standard- izing batch loading workflows, the staff working with the Knowledge Bank iteratively refined not only the scripts but also the metadata requirements for each project and how those were communicated to the data suppliers with mappings, explicit metadata examples, and sample desired results. The efficiency of batch loading workflows is greatly enhanced by consistent data and basic stan- dards for how metadata is supplied. Batch loading is not only an extremely efficient means of populating an institutional repository, it is also a value- added service that can increase buy-in from the wider campus community. It is hoped that by openly sharing examples of our batch loading scripts we are contributing to the development of an open library of code that can be borrowed and adapted by the library community toward future institutional repository success stories. 
■■ Acknowledgments I would like to thank Conrad Gratz, of OSU OIT, and Andrew Wang, formerly of OSUL. Gratz wrote the shell scripts and the majority of the Perl scripts used for auto- mating the Knowledge Bank item import process and ran the corresponding batch loads. The early Perl scripts used for batch loading into the Knowledge Bank, including the first phase of OJS and MSS, were written by Wang. Parts of those early Perl scripts written by Wang were borrowed for subsequent scripts written by Gratz. Gratz provided the annotated scripts appearing in the appendixes and consulted with the author regarding the description of the scripts. I would also like to thank Amanda J. Wilson, a for- mer metadata librarian for OSUL, who was instrumental to the success of many of the batch loading workflows created for the Knowledge Bank. References and Notes 1. The Ohio State University Knowledge Bank, “Institu- tional Repository Policies,” 2007, http://library.osu.edu/sites/ kbinfo/policies.html (accessed Dec. 21, 2009). The Knowledge Bank homepage can be found at https://kb.osu.edu/dspace/ (accessed Dec. 21, 2009). 2. Margret Branschofsky et al., “Evolving Meta- data Needs for an Institutional Repository: MIT’s DSpace,” 124 inFoRMation technoLogY and LiBRaRies | septeMBeR 2010 Appendix E. MSS 2009 Batch Loading Scripts -- mkxml2009.pl -- #!/usr/bin/perl use Encode; # Routines for UTF encoding use Text::xSV; # Routines to process CSV files. use File::Basename; # Open and read the comma separated metadata file. my $csv = new Text::xSV; #$csv->set_sep(' '); # Use for tab separated files. $csv->open_file("MSS2009.csv"); $csv->read_header(); # Process the CSV column headers. # Constants for file and directory names. $basedir = "/common/batch/input/mss/"; $indir = "$basedir/2009"; $xmldir= "./2009xml"; $imagesubdir= "processed_images”; $filename = "dublin_core.xml"; # Process each line of metadata, one line per item. $linenum = 1; while ($csv->get_row()) { # This divides the item's metadata into fields, each in its own variable. my ( $identifier, $title, $creators, $description_abstract, $issuedate, $description, $description2, Appendixes A–D available at http://ital-ica.blogspot.com/ Quest Metadata for Batch Ingesting ETDs into an Institutional Repository,” Code4Lib Journal no. 7 (June 26, 2009), http://journal .code4lib.org/articles/1647 (accessed Dec. 21, 2009). 16. Tim Brody, Registry of Open Access Repositories (ROAR), http://roar.eprints.org/ (accessed Dec. 21, 2009). 17. DuraSpace, DSpace, http://www.dspace.org/ (accessed Dec. 21, 2009). 18. Dublin Core Metadata Initiative Libraries Working Group, “DC-Library Application Profile (DC-Lib),” http://dublincore .org/documents/2004/09/10/library-application-profile/ (accessed Dec. 21, 2009). 19. The Ohio State University Knowledge Bank Policy Com- mittee, “OSU Knowledge Bank Metadata Application Profile,” http://library.osu.edu/sites/techservices/KBAppProfile.php (accessed Dec. 21, 2009). 20. Ohio Journal of Science (Ohio Academy of Sci- ence), Knowledge Bank community, http://hdl.handle .net/1811/686 (accessed Dec. 21, 2009); OSU International Sym- posium on Molecular Spectroscopy, Knowledge Bank commu- nity, http://hdl.handle.net/1811/5850 (accessed Dec. 21, 2009). 21. Ohio Journal of Science (Ohio Academy of Science), Ohio Journal of Science: Volume 74, Issue 3 (May, 1974), Knowledge Bank collection, http://hdl.handle.net/1811/22017 (accessed Dec. 21, 2009). 
    $abstract,
    $gif,
    $ppt,
) = $csv->extract(
    "Talk_id",
    "Title",
    "Creators",
    "Abstract",
    "IssueDate",
    "Description",
    "AuthorInstitution",
    "Image_file_name",
    "Talk_gifs_file",
    "Talk_ppt_file"
);

$creatorxml = "";
# Multiple creators are separated by ';' in the metadata.
if (length($creators) > 0) {
    # Create XML for each creator.
    @creatorlist = split(/;/,$creators);
    foreach $creator (@creatorlist) {
        if (length($creator) > 0) {
            $creatorxml .= '<dcvalue element="creator" qualifier="none">'
                .$creator.'</dcvalue>'."\n    ";
        }
    }
} # Done processing creators for this item.

# Create the XML string for the Abstract.
$abstractxml = "";
if (length($description_abstract) > 0) {
    # Convert special metadata characters for use in xml/html.
    $description_abstract =~ s/\&/&amp;/g;
    $description_abstract =~ s/\>/&gt;/g;
    $description_abstract =~ s/\</&lt;/g;
    $abstractxml = '<dcvalue element="description" qualifier="abstract">'
        .$description_abstract.'</dcvalue>';
}

# Create the XML string for the Description.
$descriptionxml = "";
if (length($description) > 0) {
    # Convert special metadata characters for use in xml/html.
    $description =~ s/\&/&amp;/g;
    $description =~ s/\>/&gt;/g;
    $description =~ s/\</&lt;/g;
    $descriptionxml = '<dcvalue element="description" qualifier="none">'
        .$description.'</dcvalue>';
}

# Create the XML string for the Author Institution.
$description2xml = "";
if (length($description2) > 0) {
    # Convert special metadata characters for use in xml/html.
    $description2 =~ s/\&/&amp;/g;
    $description2 =~ s/\>/&gt;/g;
    $description2 =~ s/\</&lt;/g;
    $description2xml = '<dcvalue element="description" qualifier="none">'
        .'Author Institution: '.$description2.'</dcvalue>';
}

# Convert special characters in title.
$title =~ s/\&/&amp;/g;
$title =~ s/\>/&gt;/g;
$title =~ s/\</&lt;/g;

# Create the item subdirectory and write its dublin_core.xml file.
$subdir = sprintf("item_%03d", $linenum);
mkdir("$basedir/$subdir");
open($fh, ">:encoding(UTF-8)", "$basedir/$subdir/$filename");
print $fh <<"XML";
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
    <dcvalue element="identifier" qualifier="none">$identifier</dcvalue>
    <dcvalue element="title" qualifier="none">$title</dcvalue>
    <dcvalue element="date" qualifier="issued">$issuedate</dcvalue>
    $abstractxml
    $descriptionxml
    $description2xml
    <dcvalue element="type" qualifier="none">Article</dcvalue>
    <dcvalue element="language" qualifier="iso">en</dcvalue>
    $creatorxml
</dublin_core>
XML
close($fh);

# Create contents file and move files to the load set.
# Copy item files into the load set.
if (defined($abstract) && length($abstract) > 0) {
    system "cp $indir/$abstract $basedir/$subdir";
}
$sourcedir = substr($abstract, 0, 5);
if (defined($ppt) && length($ppt) > 0 ) {
    system "cp $indir/$sourcedir/$sourcedir/*.* $basedir/$subdir/";
}
if (defined($gif) && length($gif) > 0 ) {
    system "cp $indir/$sourcedir/$imagesubdir/*.* $basedir/$subdir/";
}

# Make the 'contents' file and fill it with the file names.
system "touch $basedir/$subdir/contents";
if (defined($gif) && length($gif) > 0
    && -d "$indir/$sourcedir/$imagesubdir" ) {
    # Sort items in reverse order so they show up right in DSpace.
    # This is a hack that depends on how the DB returns items
    # in unsorted (physical) order. There are better ways to do this.
    system "cd $indir/$sourcedir/$imagesubdir/;"
        . " ls *[0-9][0-9].* | sort -r >> $basedir/$subdir/contents";
    system "cd $indir/$sourcedir/$imagesubdir/;"
        . " ls *[a-zA-Z][0-9].* | sort -r >> $basedir/$subdir/contents";
}
if (defined($ppt) && length($ppt) > 0
    && -d "$indir/$sourcedir/$sourcedir" ) {
    system "cd $indir/$sourcedir/$sourcedir/;"
        . " ls *.* >> $basedir/$subdir/contents";
}

# Put the Abstract in last, so it displays first.
system "cd $basedir/$subdir; basename $abstract >>"
    . " $basedir/$subdir/contents";

$linenum++;
} # Done processing an item.
-------------------------------------------------------------------------------------------------- -- import.sh –- #!/bin/sh # # Import a collection from files generated on dspace # COLLECTION_ID=1811/6635 EPERSON=[name removed]@osu.edu SOURCE_DIR=./2009xml BASE_ID=`basename $COLLECTION_ID` MAPFILE=./map-dspace03-mss2009.$BASE_ID /dspace/bin/dsrun org.dspace.app.itemimport.ItemImport --add --eperson=$EPERSON --collection=$COLLECTION_ID --source=$SOURCE_DIR --mapfile=$MAPFILE Appendix E. MSS 2009 Batch Loading Scripts (cont.) 3139 ---- tHe Next GeNerAtioN liBrArY cAtAloG | YANG AND HoFMANN 141 Sharon Q. Yang and Melissa A. Hofmann The Next Generation Library Catalog: A Comparative Study of the OPACs of Koha, Evergreen, and Voyager Open source has been the center of attention in the library world for the past several years. Koha and Evergreen are the two major open-source integrated library sys- tems (ILSs), and they continue to grow in maturity and popularity. The question remains as to how much we have achieved in open-source development toward the next-generation catalog compared to commercial systems. Little has been written in the library literature to answer this question. This paper intends to answer this question by comparing the next-generation features of the OPACs of two open-source ILSs (Koha and Evergreen) and one proprietary ILS (Voyager’s WebVoyage). M uch discussion has occurred lately on the next- generation library catalog, sometimes referred to as the Library 2.0 catalog or “the third generation catalog.”1 Different and even conflicting expectations exist as to what the next-generation library catalog comprises: In two sentences, this catalog is not really a catalog at all but more like a tool designed to make it easier for students to learn, teachers to instruct, and scholars to do research. It provides its intended audience with a more effective means for finding and using data and information.2 Such expectations, despite their vagueness, eventually took concrete form in 2007.3 Among the most prominent features of the next-generation catalog are a simple keyword search box, enhanced browsing possibilities, spelling corrections, relevance ranking, faceted naviga- tion, federated search, user contribution, and enriched content, just to mention a few. Over the past three years, libraries, vendors, and open-source communities have intensified their efforts to develop OPACs with advanced features. The next-generation catalog is becoming the cur- rent catalog. The library community welcomes open-source integrated library systems (ILSs) with open arms, as evi- denced by the increasing number of libraries and library consortia that have adopted or are considering open- source options, such as Koha, Evergreen, and the Open Library Environment Project (OLE Project). Librarians see a golden opportunity to add features to a system that will take years for a proprietary vendor to develop. Open-source OPACs, especially that of Koha, seem to be more innovative than their long-established propri- etary counterparts, as our investigation shows in this paper. Threatened by this phenomenon, ILS vendors have rushed to improve their OPACs, modeling them after the next-generation catalog. For example, Ex Libris pushed out its new OPAC, WebVoyage 7.0, in August of 2008 to give its OPAC a modern touch. One interesting question remains. In a competition for a modernized OPAC, which OPAC is closest to our visions for the next-generation library catalog: open- source or proprietary? 
The comparative study described in this article was conducted in the hope of yielding some information on this topic. For libraries facing options between open-source and proprietary systems, "a thorough process of evaluating an integrated library system (ILS) today would not be complete without also weighing the open source ILS products against their proprietary counterparts."3

Sharon Q. Yang (yangs@rider.edu) is Systems Librarian and Melissa A. Hofmann (mhofmann@rider.edu) is Bibliographic Control Librarian, Rider University.

■■ Scope and Purpose of the Study

The purpose of the study is to determine which OPAC of the three ILSs—Koha, Evergreen, or WebVoyage—offers more in terms of services and is more comparable to the next-generation library catalog. The three systems include two open-source ILSs and one proprietary ILS. Koha and Evergreen were chosen because they are the two most popular and fully developed open-source ILSs in North America. At the time of the study, Koha had 936 implementations worldwide; Evergreen had 543 library users.4 We chose WebVoyage for comparison because it is the OPAC of the Voyager ILS by Ex Libris, the biggest ILS vendor in terms of personnel and marketplace.5 It also is one of the more popular ILSs in North America, with a customer base of 1,424 libraries, most of which are academic.6 As the sample only includes three ILSs, the study is very limited in scope, and the findings cannot be extrapolated to all open-source and proprietary catalogs. But, hopefully, readers will gain some insight into how much progress libraries, vendors, and open-source communities have achieved toward the next-generation catalog.

■■ Literature Review

A review of the library literature found two relevant studies on the comparison of OPACs in recent years. The first study was conducted by two librarians in Slovenia investigating how much progress libraries had made toward the next-generation catalog.7 Six online catalogs were examined and evaluated, including WorldCat, the Slovene union catalog COBISS, and those of four public libraries in the United States. The study also compared services provided by the library catalogs in the sample with those offered by Amazon. The comparison took place primarily in six areas: search, presentation of results, enriched content, user participation, personalization, and Web 2.0 technologies applied in OPACs. The authors gave a detailed description of the research results supplemented by tables and snapshots of the catalogs in comparison. The findings indicated that "the progress of library catalogues has really been substantial in the last few years." Specifically, the library catalogues have made "the best progress on the content field and the least in user participation and personalization." When compared to services offered by Amazon, the authors concluded that "none of the six chosen catalogues offers the complete package of examined options that Amazon does."8 In other words, library catalogs in the sample still lacked features compared to Amazon.

The other comparative study was conducted by Linda Riewe, a library school student, in fulfillment of her master's degree from San José State University. The research described in her thesis is a questionnaire survey targeted at 361 libraries that compares open-source (specifically, Koha and Evergreen) and proprietary ILSs in North America.
More than twenty proprietary systems were covered, including Horizon, Voyager, Millennium, Polaris, Innopac, and Unicorn.9 Only a small part of her study was related to OPACs. It involved three questions about OPACs and asked librarians to evaluate the ease of use of their ILS OPAC’s search engines, their OPAC search engine’s completeness of features, and their per- ception of how easy it is for patrons to make self-service requests online for renewals and holds. A scale of 1 to 5 was used (1 = least satisfied; 5= very satisfied) regarding the three aspects of OPACs. The mean and medium satis- faction ratings for open-source OPACs were higher than those of proprietary ones. Koha’s OPAC was ranked 4.3, 3.9, and 3.9, respectively in mean, the highest on the scale in all three categories, while the proprietary OPACs were ranked 3.9, 3.6, and 3.6.10 Evergreen fell in the middle, still ahead of proprietary OPACs. The findings reinforced the perception that open-source catalogs, especially Koha, offer more advanced features than proprietary ones. As Riewe’s study focused more on the cost and user satisfac- tion with ILSs, it yielded limited information about the connected OPACs. No comparative research has measured the progress of open-source versus proprietary catalogs toward the next-generation library catalog. Therefore the comparison described in this paper is the first of its kind. As only Koha, Everygreen, and Voyager’s OPACs are examined in this paper, the results cannot be extrapolated. Studies on a larger scale are needed to shed light on the progress librarians have made toward the next-generation catalog. ■■ Method The first step of the study was identifing and defin- ing of a set of measurements by which to compare the three OPACs. A review of library literature on the next-generation library catalog revealed different and somewhat conflicting points of views as to what the next- generation catalog should be. As Marshall Breeding put it, “There isn’t one single answer. We will see a number of approaches, each attacking the problem somewhat dif- ferently.”11 This study decided to use the most commonly held visions, which are summarized well by Breeding and by Morgan’s LITA executive summary.12 The ten parameters identified and used in the comparison were taken primarily from Breeding’s introduction to the July/ August 2007 issue of Library Technology Reports, “Next- Generation Library Catalogs.”13 The ten features reflect some librarians’ visions for a modern catalog. They serve as additions to, rather than replacements of, the feature sets commonly found in legacy catalogs. The following are the definitions of each measurement: ■■ A single point of entry to all library information: “Information” refers to all library resources. The next-generation catalog contains not only biblio- graphical information about printed books, video tapes, and journal titles but also leads to the full text of all electronic databases, digital archives, and any other library resources. It is a federated search engine for one-stop searching. It not only allows for one search leading to a federation of results, it also links to full-text electronic books and journal articles and directs users to printed materials. ■■ State-of-the-art Web interface: Library catalogs should be “intuitive interfaces” and “visually appealing sites” that compare well with other Internet search engines.14 A library’s OPAC can be intimidating and complex. 
To attract users, the next-generation catalog looks and feels similar to Google, Amazon, and other popular websites. This criterion is highly subjective, however, because some users may find Google and Amazon anything but intuitive or appealing. The underlying assumption is that some Internet search engines are popular, and a library catalog should be similar to be popular themselves. ■■ Enriched content: Breeding writes, “Legacy catalogs tend to offer text-only displays, drawing only on the MARC record. A next-generation catalog might bring in content from different sources to strengthen the visual appeal and increase the amount of informa- tion presented to the user.”15 The enriched content tHe Next GeNerAtioN liBrArY cAtAloG | YANG AND HoFMANN 143 includes images of book covers, CD and movie cases, tables of contents, summaries, reviews, and photos of items that traditionally are not present in legacy catalogs. ■■ Faceted navigation: Faceted navigation allows users to narrow their search results by facets. The types of facets may include subjects, authors, dates, types of materials, locations, series, and more. Many dis- covery tools and federated search engines, such as Villanova University’s VuFind and Innovative Interface’s Encore, have used this technology in searches.16 Auto-Graphics also applied this feature in their OPAC, AGent Iluminar.17 ■■ Simple keyword search box: The next-generation catalog looks and feels like popular Internet search engines. The best example is Google’s simple user interface. That means that a simple keyword search box, instead of a controlled vocabulary or specific-field search box, should be presented to the user on the opening page with a link to an advanced search for user in need of more complex searching options. ■■ Relevancy: Traditional ranking of search results is based on the frequency and positions of terms in bibliographical records during keyword searches. Relevancy has not worked well in OPACs. In addi- tion, popularity is another factor that has not been taken into consideration in relevancy ranking. For instance, “When ranking results from the library’s book collection, the number of times that an item has been checked out could be considered an indicator of popularity.”18 By the same token, the size and font of tags in a tag cloud or the number of comments users attach to an item may also be considered relevant in ranking search results. So far, almost no OPACs are capable of incorporating circulation statistics into relevancy ranking. ■■ “Did you mean . . . ?”: When a search term is not spelled correctly or nothing is found in the OPAC in a keyword search, the spell checker will kick in and suggest the correct spelling or recommend a term that may match the user’s intended search term. For exam- ple, a modern catalog may generate a statement such as “Did you mean . . . ?” or “Maybe you meant . . . .” This may be a very popular and useful service in modern OPACs. ■■ Recommendations and related materials: The next- generation catalog is envisioned as promoting read- ing and learning by making recommendations of additional related materials to patrons. This feature is an imitation of Amazon and websites that promote selling by stating “Customers who bought this item also bought . . . .” Likewise, after a search in the OPAC, a statement such as “Patrons who borrowed this book also borrowed the following books . . .” may appear. 
■■ User contribution—ratings, reviews, comments, and tag- ging: Legacy catalogs only allow catalogers to add content. In the next-generation catalog, users can be active contributors to the content of the OPAC. They can rate, write reviews, tag, and comment on items. User contribution is an important indicator for use and can be used in relevancy ranking. ■■ RSS feeds: The next-generation catalog is dynamic because it delivers lists of new acquisitions and search updates to users through RSS feeds. Modern catalogs are service-oriented; they do more than pro- vide a simple display search results. The second step is to apply these ten visions to the OPACs of Koha, Evergreen, and WebVoyage to determine if they are present or absent. The OPACs used in this study included three examples from each system. They may have been product demos and live catalogs ran- domly chosen from the user list on the product websites. The latest releases at the time of the study was Koha 3.0, Evergreen 2.0, WebVoyage 7.1. In case of discrepancies between product descriptions and reality, we gave pre- cedence to reality over claims. In other words, even if the product documentation lists and describes a feature, this study does not include it if the feature is not in action either in the demo or live catalogs. Despite the fact that a planned future release of one of those investigated OPACs may add a feature, this study only recorded what existed at the time of the comparison. The following are the OPACs examined in this paper. Koha ■■ Koho Demo for Academic Libraries: http://academic .demo.kohalibrary.com/ ■■ Wagner College: http://wagner.waldo.kohalibrary .com/ ■■ Clearwater Christian College: http://ccc.kohalibrary .com/ evergreen ■■ Evergreen Demo: http://demo.gapines.org/opac/ en-US/skin/default/xml/index.xml ■■ Georgia PINES: http://gapines.org/opac/en-US/ skin/default/xml/index.xml ■■ Columbia Bible College at http://columbiabc .evergreencatalog.com/opac/en-CA/skin/default/ xml/index.xml webVoyage ■■ Rider University Libraries: http://voyager.rider.edu ■■ Renton College library: http://renton.library.ctc .edu/vwebv/searchBasic 144 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 ■■ Shoreline College library: http://shoreline.library .ctc.edu/vwebv/searchBasic The final step includes data collection and compila- tion. A discussion of findings follows. The study draws conclusions about which OPAC is more advanced and has more features of the next-generation library catalog. ■■ Findings Each of the OPACs of Koha, Evergreen, and WebVoyage are examined for the presence of the ten features of the next-generation catalog. single point of entry for All library information None of the OPACs of the three ILSs provides true fed- erated searching. To varying degrees, each is limited in access, showing an absence of contents from elec- tronic databases, digital archives, and other sources that generally are not located in the legacy catalog. Of the three, Koha is more advanced. While WebVoyage and Evergreen only display journal-holdings information in their OPACs, Koha links journal titles from its catalog to ProQuest’s Serials Solutions, thus leading users to full- text journals in the electronic databases. The example in figure 1 (Koha demo) shows the journal title Unix Update with an active link to the full-text journal in the availabil- ity field. The link takes patrons to Serials Solutions, where full text at the journal-title level is listed for each database (see figure 2). 
Each link will take you into the full text in each database. state-of-the-Art web interface As beauty is in the eye of the beholder, the interface of a catalog can be appealing to one user but prohibitive to another. With this limitation in mind, the out-of-the- box user interface at the demo sites was considered for each OPAC. All the three catalogs have the Google-like simplicity in presentation. All of the user interfaces are highly customizable. It largely depends on the library to make the user interface appealing and welcoming to users. Figures 3–5 show snapshots from each ILSs demo sites and have not been customized. However, there are a few differences in the “state of the art.” For one, Koha’s navigation between screens relies solely on the browser’s Forward and Back buttons, while WebVoyage and Evergreen have internal naviga- tion buttons that more efficiently take the user between title lists, headings lists, and record displays, and between records in a result set. While all three OPACs offer an advanced search page with multiple boxes for entering search terms, only WebVoyage makes the relationship between the terms in different boxes clear. By the use of a drop-down box, it makes explicit that the search terms are by default ANDed and also allows for the selection of OR and NOT. In Koha’s and Evergreen’s advanced search, however, the terms are ANDed only, a fact that is not at all obvious to the user. In the demo OPACs examined, there is no option to choose OR or NOT between rows, nor is there any indication that the search is ANDed. The point of providing multiple search boxes is to guide users in constructing a Boolean search without their having to worry about operators and syntax. In Koha, however, users have to type an OR or NOT statement themselves within the text box, thus defeating the purpose of hav- ing multiple boxes. While Evergreen allows for a NOT construction within a row (“does not contain”), it does not provide an option for OR (“contains” and “matches exactly” are the other two options available). See figures Figure 1. Link to full-text journals in Serials Solutions in Koha Figure 2. Links to Serials Solutions from Koha tHe Next GeNerAtioN liBrArY cAtAloG | YANG AND HoFMANN 145 6–8. Thus Koha’s and Evergreen’s advanced search is less than intuitive for users and certainly less functional than WebVoyage’s. enriched content To varying degrees, enriched content is present in all three catalogs, with Koha providing the most. While all three catalogs have book covers and movie-container art, Koha has much more in its catalog. For instance, it displays tags, descriptions, comments, and Amazon reviews. WebVoyage displays links to Google Books for book reviews and content summaries but does not have tags, descriptions, and comments in the catalog. See fig- ures 9–11. Faceted Navigation The Koha OPAC is the only catalog of the three to offer faceted navigation. The “Refine your search” feature allows users to narrow search results by availability, places, libraries, authors, topics, and series. Clicking on a term within a facet adds that term to the search query and generates a narrower list of results. The user may then choose another facet to further refine the search. While Evergreen appears to have faceted navigation upon first glance, it actually does not possess this feature. 
The following facets appear after a search generates hits: “Relevant subjects,” “Relevant authors,” and “Relevant series.” But choosing a term within a facet does not nar- row down the previous search. Instead, it generates an entirely new search with the selected term; it does not add the new term to the previous query. Users must manually combine the terms in the simple search box or through the advanced search page. WebVoyage also does not offer faceted navigation—it only provides an option to “Filter your search” by format, language, and date when a set of results is returned. See figures 12–14. Keyword searching Koha, Evergreen, and WebVoyage all present a simple keyword search box with a link to the advanced search (see figures 3–5). relevancy Neither Koha, Evergreen, nor WebVoyage provide any evidence for meeting the criteria of the next-gener- ation catalog’s more inclusive vision of relevancy ranking, such as accounting for an item’s popularity or allowing user tags. Koha uses Index Data’s Zebra program for its relevance ranking, which “reads structured records in a variety of input formats . . . and allows access to them through exact boolean search Figure 3. Koha: state-of-the-art user interface Figure 5. Voyager: state-of-the-art user interface Figure 4. Evergreen: state-of-the-art user interface 146 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 user contributions Koha is the only system of the three that allows users to add tags, comments, descriptions, and reviews. In Koha’s OPAC, user-added tags form tag clouds, and the font and size of each keyword or tag indicate that keyword or Figure 6. Voyager advanced search Figure 7. Koha advanced search Figure 8. Evergreen advanced search expressions and relevance-ranked free-text queries.19 Evergreen’s DokuWiki states that the base relevancy score is determined by the cover density of the searched terms. After this base score is determined, items may receive score bumps based on word order, matching on the first word, and exact matches depending on the type of search performed.20 These statements do not indicate that either Koha or Evergreen go beyond the traditional relevancy-ranking methods of legacy systems, such as WebVoyage. Did You Mean . . . ? Only Evergreen has a true “Did you mean . . . ?” feature. When no hits are returned, Evergreen provides a sug- gested alternate spelling (“Maybe you meant . . . ?”) as well as a suggested additional search (“You may also like to try these related searches . . .”). Koha has a spell-check feature, but it automatically normalizes the search term and does not give the option of choosing different one. This is not the same as a “Did you mean . . . ?” feature as defined above. While the normalizing process may be seamless, it takes the power of choice away from the user and may be problematic if a particular alternative spelling or misspelling is searched purposefully, such as “womyn.” (When “womyn” is searched as a keyword in the Koha demo OPAC, 16,230 hits are returned. This catalog does not appear to contain the term as spelled, which is why it is normalized to women. The fact that the term does not appear as is may not be transparent to the searcher.) With normalization, the user may also be unaware that any mistake in spelling has occurred, and the number of hits may differ between the correct spelling and the normalized spelling, potentially affect- ing discovery. 
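For illustration only, the sketch below (not drawn from Koha, Evergreen, or WebVoyage; the vocabulary list is a stand-in for an index's indexed terms) shows how a "Did you mean . . . ?" prompt can offer a close match while leaving the original term untouched, avoiding the silent substitution just described:

# Illustrative sketch of a "Did you mean . . . ?" prompt: offer a close match as a
# suggestion but leave the user's original term (and the choice) intact, rather than
# silently normalizing the query.
import difflib

INDEXED_TERMS = ["women", "womyn", "homosexuality", "library", "catalog"]  # stand-in vocabulary

def did_you_mean(term, vocabulary=INDEXED_TERMS):
    if term.lower() in vocabulary:
        return None  # the term exists as spelled, so it is searched as entered
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None

print(did_you_mean("homoexuality"))  # suggests 'homosexuality' instead of substituting it
print(did_you_mean("womyn"))         # None: a purposeful spelling is left alone

A suggestion of this kind leaves the decision with the searcher, so a purposeful spelling such as "womyn" is searched as entered.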
The normalization feature also only works with particular combinations of misspellings, where let- ter order affects whether a match is found. Otherwise the system returns a “No result found!” message with no suggestions offered. (Try “homoexuality” vs. “homo- exsuality.” In Koha’s demo OPAC, the former, with a missing “s,” yields 553 hits, while the latter, with a mis- placed “s,” yields none.) However, Koha is a step ahead of WebVoyage, which has no built-in spell checker at all. If a search fails, the system returns the message “Search Resulted in No Hits.” See figures 15–17. recommendations/related Materials None of the three online catalogs can recommend materi- als for users. tHe Next GeNerAtioN liBrArY cAtAloG | YANG AND HoFMANN 147 Figure 9. Koha enriched content Figure 10. Evergreen enriched content Figure 11. Voyager enriched content Figure 12. Koha faceted navigation Figure 13. Evergreen faceted navigation Figure 14. Voyager faceted navigation 148 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 Nevertheless, the user contribution in the Koha OPAC is not easy to use. It may take many clicks before a user can figure out how to add or edit text. It requires user login, and the system cannot keep track of the search hits after a login takes place. Therefore the user contribution features of Koha need improvement. See figure 18. rss feeds Koha provides RSS feeds, while Evergreen and WebVoyage do not. ■■ Conclusion Table 1 is a summary of the comparisons in this paper. These comparisons show that the Koha OPAC has six out of the ten compared features for the next-generation catalog, plus two halves. Its full-fledged features include state-of-the-art Web interface, enriched content, faceted navigation, a simple keyword search box, user con- tribution, and RSS feeds. The two halves indicate the existence of a feature that is not fully developed. For instance, “Did you mean . . . ?” in Koha does not work the way the next-generation catalog is envisioned. In addition, Koha has the capability of linking journal titles to full text via Serials Solutions, while the other two OPACs only display holdings information. Evergreen falls into second place, providing four out of the ten compared features: state-of-the-art interface, enriched content, a keyword search box, and “Did you mean . . . ?” WebVoyage, the Voyager OPAC from Ex Libris, comes in third, providing only three out of the ten features for Figure 15. Evergreen: Did you mean . . . ? Figure 16. Koha: Did you mean . . . ? Figure 17. Voyager: Did you mean . . . ? Figure 18. Koha user contibutions tag’s frequency of use. All the tags in a tag cloud serve as hyperlinks to library materials. Users can write their own reviews to complement the Amazon reviews. All user-added reviews, descriptions, and comments have to be approved by a librarian before they are finalized for display in the OPAC. tHe Next GeNerAtioN liBrArY cAtAloG | YANG AND HoFMANN 149 the next-generation catalog. Based on the evidence, Koha’s OPAC is more advanced and innovative than Evergreen’s or Voyager’s. Among the three catalogs, the open-source OPACs compare more favorably to the ideal next-generation catalog then the proprietary OPAC. However, none of them is capable of federated searching. Only Koha offers faceted navigation. WebVoyage does not even provide a spell checker. The ILS OPAC still has a long way to go toward the next- generation catalog. 
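The tallies above can be read directly from table 1. As a small illustration (not part of the original study), the following sketch scores each OPAC by counting a fully present feature as 1 and a partially implemented feature as 0.5, using values transcribed from the table:

# Illustrative only: reproduce the feature tallies reported in the conclusion
# from the yes/partial/no matrix in table 1.
SCORES = {"yes": 1.0, "partial": 0.5, "no": 0.0}

# Columns: Koha, Evergreen, Voyager (transcribed from table 1).
FEATURES = {
    "Single point of entry":          ("partial", "no",  "no"),
    "State-of-the-art web interface": ("yes",     "yes", "yes"),
    "Enriched content":               ("yes",     "yes", "yes"),
    "Faceted navigation":             ("yes",     "no",  "no"),
    "Keyword search":                 ("yes",     "yes", "yes"),
    "Relevancy":                      ("no",      "no",  "no"),
    "Did you mean...?":               ("partial", "yes", "no"),
    "Recommended/related materials":  ("no",      "no",  "no"),
    "User contribution":              ("yes",     "no",  "no"),
    "RSS feeds":                      ("yes",     "no",  "no"),
}

def tally(column):
    """Sum the scores for one OPAC (column 0 = Koha, 1 = Evergreen, 2 = Voyager)."""
    return sum(SCORES[row[column]] for row in FEATURES.values())

for i, name in enumerate(("Koha", "Evergreen", "Voyager")):
    print(name, tally(i))

Run as is, the sketch reports 7.0 for Koha (six full features plus two halves), 4.0 for Evergreen, and 3.0 for WebVoyage, matching the counts given above.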
Though this study samples only three catalogs, hopefully the findings will provide a glimpse of the current state of open-source versus proprietary catalogs. ILS OPACs are not comparable in features and functions to stand-alone OPACs, also referred to as "discovery tools" or "layers." Some discovery tools, such as Ex Libris' Primo, also are federated search engines and are modeled after the next-generation catalog. Recently they have become increasingly popular because they are bolder and more innovative than ILS OPACs. Two of the best stand-alone open-source OPACs are Villanova University's VuFind and Oregon State University's LibraryFind.21 Both boast eight out of ten features of the next-generation catalog.22 Technically it is easier to develop a new stand-alone OPAC with all the next-generation catalog features than to mend old ILS OPACs. As more and more libraries are disappointed with their ILS OPACs, more discovery tools will be implemented. Vendors will stop improving ILS OPACs and concentrate on developing better discovery tools. The fact that ILS OPACs are falling behind current trends may eventually bear no significance for libraries—at least for the ones that can afford the purchase or implementation of a more sophisticated discovery tool or stand-alone OPAC. Certainly small and public libraries that cannot afford a discovery tool or a programmer for an open-source OPAC overlay will suffer, unless market conditions change.

References

1. Tanja Merčun and Maja Žumer, "New Generation of Catalogues for the New Generation of Users: A Comparison of Six Library Catalogues," Program: Electronic Library & Information Systems 42, no. 3 (July 2008): 243–61.
2. Eric Lease Morgan, "A 'Next-Generation' Library Catalog—Executive Summary (Part #1 of 5)," online posting, July 7, 2006, LITA Blog: Library Information Technology Association, http://litablog.org/2006/07/07/a-next-generation-library-catalog-executive-summary-part-1-of-5/ (accessed Nov. 10, 2008).
3. Marshall Breeding, introduction to "Next Generation Library Catalogs," Library Technology Reports 43, no. 4 (July/Aug. 2007): 5–14.
4. Ibid.
5. Marshall Breeding, "Library Technology Guides: Key Resources in the Field of Library Automation," http://www.librarytechnology.org/lwc-search-advanced.pl (accessed Jan. 23, 2010).
6. Marshall Breeding, "Investing in The Future: Automation Marketplace 2009," Library Journal (Apr. 1, 2009), http://www.libraryjournal.com/article/CA6645868.html (accessed Jan. 23, 2010).
7. Marshall Breeding, "Library Technology Guides: Company Directory," http://www.librarytechnology.org/exlibris.pl?SID=20100123734344482&code=vend (accessed Jan. 23, 2010).
8. Merčun and Žumer, "New Generation of Catalogues."
9. Ibid.
10. Linda Riewe, "Integrated Library System (ILS) Survey: Open Source vs. Proprietary-Tables" (master's thesis, San Jose University, 2008): 2–5, http://users.sfo.com/~lmr/ils-survey/tables-all.pdf (accessed Nov. 4, 2008).
11. Ibid., 26–27.
12. Breeding, introduction.
13. Ibid.; Morgan, "A 'Next-Generation' Library Catalog."
14. Breeding, introduction.
15. Ibid.
16. Ibid.
17. Villanova University, "VuFind," http://vufind.org/ (accessed June 10, 2010); Innovative Interfaces, "Encore," http://encoreforlibraries.com/ (accessed June 10, 2010).
18. Auto-Graphics, "AGent Illuminar," http://www4.auto-graphics.com/solutions/agentiluminar/agentiluminar.htm (accessed June 10, 2010).
19. Breeding, introduction; Morgan, "A 'Next-Generation' Library Catalog."
20. Index Data, "Zebra," http://www.indexdata.dk/zebra/ (accessed Jan. 3, 2009).
21. Evergreen DokuWiki, "Search Relevancy Ranking," http://open-ils.org/dokuwiki/doku.php?id=scratchpad:opac_demo&s=core (accessed Dec. 19, 2008).
22. Villanova University, "VuFind"; Oregon State University, "LibraryFind," http://libraryfind.org/ (accessed June 10, 2010).
23. Sharon Q. Yang and Kurt Wagner, "Open Source Stand-Alone OPACs" (Microsoft PowerPoint presentation, 2010 Virtual Academic Library Environment Annual Conference, Piscataway, New Jersey, Jan. 8, 2010).

Table 1. Summary

Feature of the next-generation catalog              Koha      Evergreen  Voyager
Single point of entry for all library information   Partial   No         No
State-of-the-art web interface                      Yes       Yes        Yes
Enriched content                                    Yes       Yes        Yes
Faceted navigation                                  Yes       No         No
Keyword search                                      Yes       Yes        Yes
Relevancy                                           No        No         No
Did you mean . . . ?                                Partial   Yes        No
Recommended/related materials                       No        No         No
User contribution                                   Yes       No         No
RSS feeds                                           Yes       No         No

3138 ----

Lynne Weber and Peg Lawrence

Authentication and Access: Accommodating Public Users in an Academic World

in Cook and Shelton's Managing Public Computing, which confirmed the lack of applicable guidelines on academic websites, had more up-to-date information but was not available to the researchers at the time the project was initiated.2 In the course of research, the authors developed the following questions:

■■ How many ARL libraries require affiliated users to log into public computer workstations within the library?
■■ How many ARL libraries provide the means to authenticate guest users and allow them to log on to the same computers used by affiliates?
■■ How many ARL libraries offer open-access computers for guests to use? Do these libraries provide both open-access computers and the means for guest user authentication?
■■ How do Federal Depository Library Program libraries balance their policy requiring computer authentication with the obligation to provide public access to government information?
■■ Do computers provided for guest use (open access or guest login) provide different software or capabilities than those provided to affiliated users?
■■ How many ARL libraries have written policies for the use of open-access computers? If a policy exists, what is it?
■■ How many ARL libraries have written policies for authenticating guest users? If a policy exists, what is it?

■■ Literature Review

Since the 1950s there has been considerable discussion within library literature about academic libraries serving "external," "secondary," or "outside" users. The subject has been approached from the viewpoint of access to the library facility and collections, reference assistance, interlibrary loan (ILL) service, borrowing privileges, and (more recently) access to computers and Internet privileges, including the use of proprietary databases. Deale emphasized the importance of public relations to the academic library.3 While he touched on creating bonds both on and off campus, he described the positive effect of offering "privilege cards" to community members.4 Josey described the variety of services that Savannah State College offered to the community.5 He concluded his essay with these words: Why cannot these tried methods of lending books to citizens of the community, story hours for children . . .
, a library lecture series or other forum, a great books discussion group and the use of the library staff In the fall of 2004, the Academic Computing Center, a division of the Information Technology Services Department (ITS) at Minnesota State University, Mankato took over responsibility for the computers in the public areas of Memorial Library. For the first time, affiliated Memorial Library users were required to authenticate using a campus username and password, a change that effectively eliminated computer access for anyone not part of the university community. This posed a dilemma for the librarians. Because of its Federal Depository status, the library had a responsibility to pro- vide general access to both print and online government publications for the general public. Furthermore, the library had a long tradition of providing guest access to most library resources, and there was reluctance to aban- don the practice. Therefore the librarians worked with ITS to retain a small group of six computers that did not require authentication and were clearly marked for community use, along with several standup, open-access computers on each floor used primarily for searching the library catalog. The additional need to provide computer access to high school students visiting the library for research and instruction led to more discussions with ITS and resulted in a means of generating temporary usernames and passwords through a Web form. These user accommodations were implemented in the library without creating a written policy governing the use of open-access computers. O ver time, library staff realized that guidelines for guests using the computers were needed because of misuse of the open-access computers. We were charged with the task of drafting these guidelines. In typical librarian fashion, we searched websites, including those of Association of Research Libraries (ARL) members for existing computer access policies in academic libraries. We obtained very little information through this search, so we turned to ARL publications for assistance. Library Public Access Workstation Authentication by Lori Driscoll, was of greater benefit and offered much of the needed information, but it was dated.1 A research result described lynne webber (lnweber@mnsu.edu) is access Services librar- ian and peg lawrence (peg.lawrence@mnsu.edu) is Systems librarian, Minnesota State university, Mankato. AutHeNticAtioN AND Access | weBer AND lAwreNce 129 providing service to the unaffiliated, his survey revealed 100 percent of responding libraries offered free in-house collection use for the general public, and many others offered additional services.16 Brenda Johnson described a one-day program in 1984 sponsored by Rutgers University Libraries Forum titled “A Case Study in Closing the University Library to the Public.” The participating librarians spent the day famil- iarizing themselves with the “facts” of the theoretical case and concluded that public access should be restricted but not completely eliminated. A few months later, consider- ation of closing Rutgers’ library to the public became a real debate. Although there were strong opposing view- points, the recommendation was to retain the open-door policy.17 Jansen discussed the division between those who wanted to provide the finest service to primary users and those who viewed the library’s mission as including all who requested assistance. 
Jansen suggested specific ways to balance the needs of affiliates and the public and referred to the dilemma the University of California, Berkeley, library that had been closed to unaffiliated users.18 Bobp and Richey determined that California undergraduate libraries were emphasizing service to pri- mary users at a time when it was no longer practical to offer the same level of service to primary and secondary users. They presented three courses of action: adherence to the status quo, adoption of a policy restricting access, or implementation of tiered service.19 Throughout the 1990s, the debate over the public’s right to use academic libraries continued, with increasing focus on computer use in public and private academic libraries. New authorization and authentication require- ments increased the control of internal computers, but the question remained of libraries providing access to government information and responding to community members who expected to use the libraries supported by their taxes. Morgan, who described himself as one who had spent his career encouraging equal access to information, con- cluded that it would be necessary to use authentication, authorization, and access control to continue offering information services readily available in the past.20 Martin acknowledged that library use was changing as a result of the Internet and that the public viewed the academic librarian as one who could deal with the explosion of information and offer service to the public.21 Johnson described unaffiliated users as a group who wanted all the privileges of the affiliates; she discussed the obliga- tion of the institution to develop policies managing these guest users.22 Still and Kassabian considered the dual responsi- bilities of the academic library to offer Internet access to public users and to control Internet material received and sent by primary and public users. Further, they weighed as consultants be employed toward the building of good relations between town and gown.6 Later, however, Deale indicated that the generosity common in the 1950s to outsiders was becoming unsus- tainable.7 Deale used Beloit College, with an “open door policy” extending more than 100 years, as an example of a school that had found it necessary to refuse out-of-library circulation to minors except through ILL by the 1960s.8 Also in 1964, Waggoner related the increasing difficulty of accommodating public use of the academic library. He encouraged a balance of responsibility to the public with the institution’s foremost obligation to the students and faculty.9 In October 1965, the ad hoc Committee on Community Use of Academic Libraries was formed by the College Library Section of the Association of College and Research Libraries (ACRL). This committee distributed a 13-ques- tion survey to 1,100 colleges and universities throughout the United States. The high rate of response (71 per- cent) was considered noteworthy, and the findings were explored in “Community Use of Academic Libraries: A Symposium,” published in 1967.10 The concluding article by Josey (the symposium’s moderator) summarized the lenient attitudes of academic libraries toward public users revealed through survey and symposium reports. 
In the same article, Josey followed up with his own arguments in favor of the public’s right to use academic libraries because of the state and federal support provided to those institutions.11 Similarly, in 1976 Tolliver reported the results of a survey of 28 Wisconsin libraries (public academic, private academic, and public), which indicated that respondents made a great effort to serve all patrons seeking service.12 Tolliver continued in a different vein from Josey, however, by reporting the current annual fiscal support for libraries in Wisconsin and commenting upon financial steward- ship. Tolliver concluded by asking, “How effective are our library systems and cooperative affiliations in meet- ing the information needs of the citizens of Wisconsin?”13 Much of the literature in the years following focused on serving unaffiliated users at a time when public and academic libraries suffered the strain of overuse and underfunding. The need for prioritization of primary users was discussed. In 1979, Russell asked, “Who are our legitimate clientele?” and countered the argument for publicly supported libraries serving the entire public by saying the public “cannot freely use the university lawn mowers, motor pool vehicles, computer center, or athletic facilities.”14 Ten years later, Russell, Robison, and Prather prefaced their report on a survey of policies and services for outside users at 12 consortia institutions by saying, “The issue of external users is of mounting concern to an institution whose income is student credit hour gen- erated.”15 Despite Russell’s concerns about the strain of 130 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 be aware of the issues and of the effects that licensing, networking, and collection development decisions have on access.”35 In “Unaffiliated Users’ Access to Academic Libraries: A Survey,” Courtney reported and analyzed data from her own comprehensive survey sent to 814 academic libraries in winter 2001.36 Of the 527 libraries responding to the survey, 72 libraries (13.6 percent) required all users to authenticate to use computers within the library, while 56 (12.4 percent) indicated that they planned to require authentication in the next twelve months.37 Courtney followed this with data from surveyed libraries that had canceled “most” of their indexes and abstracts (179 librar- ies, or 33.9 percent) and libraries that had cancelled “most” periodicals (46 libraries or 8.7 percent).38 She concluded that the extent to which the authentication requirement restricted unaffiliated users was not clear, and she asked, “As greater numbers of resources shift to electronic-only formats, is it desirable that they disappear from the view of the community user or the visiting scholar?”39 Courtney’s “Authentication and Library Public Access Computers: A Call for Discussion” described a follow-up with the academic libraries participating in her 2001 survey who had self-identified as using authentication or planning to employ authentication within the next twelve months. Her conclusion was the existence of ambivalence toward authentication among the libraries, since more than half of the respondents provided some sort of public access. 
She encouraged librarians to carefully consider the library’s commitment to service before entering into blanket license agreements with vendors or agreeing to campus computer restrictions.40 Several editions of the ARL SPEC Kit series showing trends of authentication and authorization for all users of ARL libraries have been an invaluable resource in this investigation. An examination of earlier SPEC Kits indicated that the definitions of “user authentication” and “authorization” have changed over the years. User Authentication, by Plum and Bleiler indicated that 98 per- cent of surveyed libraries authenticated users in some way, but at that time authentication would have been more precisely defined as authorization or permission to access personal records, such as circulation, e-mail, course regis- tration, and file space. As such, neither authentication nor authorization was related to basic computer access.41 By contrast, it is common for current library users authenti- cate to have any access to a public workstation. Driscoll’s Library Public Access Workstation Authentication sought information on how and why users were authenticated on public-access computers, who was driving the change, how it affected the ability of Federal Depository libraries to provide public information, and how it affected library ser- vices in general.42 But at the time of Driscoll’s survey, only 11 percent of surveyed libraries required authentication on all computers and 22 percent required it only on selected terminals. Cook and Shelton’s Managing Public Computing the reconciliation of material restrictions against “prin- ciples of freedom of speech, academic freedom, and the ALA’s condemnation of censorship.”23 Lynch discussed institutional use of authentication and authorization and the growing difficulty of verifying bona fide users of aca- demic library subscription databases and other electronic resources. He cautioned that future technical design choices must reflect basic library values of free speech, personal confidentiality, and trust between academic institution and publisher.24 Barsun specifically examined the webpages of one hundred ARL libraries in search of information pertinent to unaffiliated users. 
She included a historic overview of the changing attitudes of academics toward service to the unaffiliated population and described the difficult bal- ance of college community needs with those of outsiders in 2000 (the survey year).25 Barsun observed a consistent lack of information on library websites regarding library guest use of proprietary databases.26 Carlson discussed academic librarians’ concerns about “Internet-related crimes and hacking” leading to reconsideration of open computer use, and he described the need to compromise patron privacy by requiring authentication.27 In a chapter on the relationship of IT security to academic values, Oblinger said, “One possible interpretation of intellectual freedom is that individuals have the right to open and unfiltered access to the Internet.”28 This statement was followed later with “equal access to information can also be seen as a logical extension of fairness.”29 A short article in Library and Information Update alerted the authors to a UK project investigating improved online access to resources for library visitors not affili- ated with the host institution.30 Salotti described Higher Education Access to E-Resources in Visited Institutions (HAERVI) and its development of a toolkit to assist with the complexities of offering electronic resources to guest users.31 Salotti summarized existing resources for sharing within the United Kingdom and emphasized that “no single solution is likely to suit all universities and col- leges, so we hope that the toolkit will offer a number of options.”32 Launched by the Society of College, National and University Libraries (SCONUL), and Universities and Colleges Information Systems Association (UCISA), HAERVI has created a best-practice guide.33 By far the most useful articles for this investigation have been those by Nancy Courtney. “Barbarians at the Gates: A Half-Century of Unaffiliated Users in Academic Libraries,” a literature review on the topic of visitors in academic libraries, included a summary of trends in attitude and practice toward visiting users since the 1950s.34 The article concluded with a warning: “The shift from printed to elec- tronic formats . . . combined with the integration of library resources with campus computer networks and the Internet poses a distinct threat to the public’s access to information even onsite. It is incumbent upon academic librarians to AutHeNticAtioN AND Access | weBer AND lAwreNce 131 introductory letter with the invitation to participate and a forward containing definitions of terms used within the survey is in appendix A. In total, 61 (52 percent) of the 117 ARL libraries invited to participate in the survey responded. This is comparable with the response rate for similar surveys reported by Plum and Bleiler (52 of 121, or 43 percent), Driscoll (67 of 124, or 54 percent), and Cook and Shelton (69 of 123, or 56 percent).45 1. What is the name of your academic institution? The names of the 61 responding libraries are listed in appendix B. 2. Is your institution public or private? See figure 1. Respondents’ explanations of “other” are listed below. ■❏ State-related ■❏ Trust instrument of the U.S. people; quasi- government ■❏ Private state-aided ■❏ Federal government research library ■❏ Both—private foundation, public support 3. Are affiliated users required to authenticate in order to access computers in the public area of your library? See figure 2. 4. 
If you answered “yes” to the previous question, does your library provide the means for guest users to authenticate? See figure 3. Respondents’ explanations of “other” are listed below. All described open-access comput- ers. ■❏ “We have a few “open” terminals” ■❏ “4 computers don’t require authentication” ■❏ “Some workstations do not require authentica- tion” ■❏ “Open-access PCs for guests (limited number and function)” ■❏ “No—but we maintain several open PCs for guests” ■❏ “Some workstations do not require login” 5. Is your library a Federal Depository Library? See fig- ure 4. This question caused some confusion for the Canadian survey respondents because Canada has its own Depository Services Program corresponding to the U.S. Federal Depository Program. Consequently, 57 of the 61 respondents identified themselves as Federal Depository (including three Canadian librar- ies), although 5 of the 61 are more accurately mem- bers of the Canadian Depository Services Program. Only two responding libraries were neither a mem- ber of the U.S. Federal Depository Program nor of the Canadian Depository Services Program. 6. If you answered “yes” to the previous question, and com- puter authentication is required, what provisions have been made to accommodate use of online government documents by the general public in the library? Please check all that touched on every aspect of managing public computing, including public computer use, policy, and security.43 Even in 2007, only 25 percent of surveyed libraries required authentication on all computers, but 46 percent required authentication on some computers, showing the trend toward an ever increasing number of libraries requiring public workstation authentication. Most of the responding libraries had a computer-use policy, with 48 percent follow- ing an institution-wide policy developed by the university or central IT department.44 ■■ Method We constructed a survey designed to obtain current data about authentication in ARL libraries and to provide insight into how guest access is granted at various aca- demic institutions. It should be noted that the object of the survey was access to computers located in the public areas of the library for use by patrons, not access to staff computers. We constructed a simple, fourteen-question survey using the Zoomerang online tool (http://www .zoomerang.com/). A list of the deans, directors, and chief operating officers from the 123 ARL libraries was compiled from an Internet search. We eliminated the few library administrators whose addresses could not be readily found and sent the survey to 117 individuals with the request that it be forwarded to the appropriate respondent. The recipients were informed that the goal of the project was “determination of computer authentica- tion and current computer access practices within ARL libraries” and that the intention was “to reflect practices at the main or central library” on the respondent’s cam- pus. Recipients were further informed that the names of the participating libraries and the responses would be reported in the findings, but that there would be no link between responses given and the name of the participat- ing library. The survey introduction included the name and contact information of the institutional review board administrator for Minnesota State University, Mankato. Potential respondents were advised that the e-mail served as informed consent for the study. The survey was administered over approximately three weeks. 
We sent reminders three, five, and seven days after the survey was launched to those who had not already responded. ■■ Survey Questions, Responses, and Findings We administered the survey, titled “Authentication and Access: Academic Computers 2.0,” in late April 2008. Following is a copy of the fourteen-question survey with responses, interpretative data, and comments. The 132 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 ■❏ “some computers are open access and require no authentication” ■❏ “some workstations do not require login” 7. If your library has open-access computers, how many do you provide? (Supply number). See figure 6. A total of 61 institutions responded to this question, and 50 reported open-access computers. The number of open-access computers ranged from 2 to 3,000. As expected, the highest numbers were reported by libraries that did not require authentication for affili- ates. The mean number of open-access computers was 161.2, the median was 23, the mode was 30, and the range was 2,998. 8. Please indicate which online resources and services are available to authenticated users. Please check all that apply. See figure 7. ■❏ Online catalog ■❏ Government documents ■❏ Internet browser apply. See figure 5. ■❏ Temporary User ID and Password ■❏ Open Access Computers (Unlimited Access) ■❏ Open Access Computers (Access Limited to Government Documents) ■❏ Other Of the 57 libraries that responded “yes” to question 5, 30 required authentication for affiliates. These institutions offered the general public access to online government documents various ways. Explanations of “other” are listed below. Three of these responses indicate, by survey definition, that open-access computers were provided. ■❏ “catalog-only workstations” ■❏ “4 computers don’t require authentication” ■❏ “generic login and password” ■❏ “librarians login each guest individually” ■❏ “provision made for under-18 guests needing gov doc” ■❏ “staff in Gov Info also login user for quick use” ■❏ “restricted guest access on all public devices” Figure 3. Institutions with the means to authenticate guests Figure 4. Libraries with Federal Depository and/or Canadian Depository Services status Figure 2. Institutions requiring authentication Figure 1. Categories of responding institutions AutHeNticAtioN AND Access | weBer AND lAwreNce 133 11. Does your library have a written policy for use of open access computers in the public area of the library? Question 7 indicates that 50 of the 61 responding libraries did offer the public two or more open-access computers. Out of the 50, 28 responded that they had a written policy governing the use of computers. Conversely, open-access computers were reported at 22 libraries that had no reported written policy. 12. If you answered “yes” to the previous question, please give the link to the policy and/or summarize the policy. Twenty-eight libraries gave a URL, a URL plus a summary explanation, or a summary explanation with no URL. 13. Does your library have a written policy for authenticating guest users? Out of the 32 libraries that required their users to authenticate (see question 3), 23 also had the means to allow their guests to authenticate (see question 4). Fifteen of those libraries said they had a policy. 14. If you answered “yes” to the previous question, please give the link to the policy and/or summarize the policy. Eleven ■❏ Licensed electronic resources ■❏ Personal e-mail access ■❏ Microsoft Office software 9. 
Please indicate which online resources and services are available to authenticated guest users. Please check all that apply. See figure 8. ■❏ Online catalog ■❏ Government documents ■❏ Internet browser ■❏ Licensed electronic resources ■❏ Personal e-mail access ■❏ Microsoft Office software 10. Please indicate which online resources and services are available on open-access computers. Please check all that apply. See figure 9. ■❏ Online catalog ■❏ Government documents ■❏ Internet browser ■❏ Licensed electronic resources ■❏ Personal e-mail access ■❏ Microsoft Office software Figure 5. Provisions for the online use of government docu- ments where authentication is required Figure 6. Number of open-access computers offered Figure 7. Electronic resources for authenticated affiliated users (N = 32) Number of libraries Number of librariesNumber of libraries Number of libraries Figure 8. Resources for authenticating guest users (N = 23) 134 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 ■■ Respondents and authentication Figure 10 compares authentication practices of public, private, and other institutions described in response to question 2. Responses from public institutions outnum- bered those from private institutions, but within each group a similar percentage of libraries required their affiliated users to authenticate. Therefore no statistically significant difference was found between authenticating affiliates in public and private institutions. Of the 61 respondents, 32 (52 percent) required their affiliated users to authenticate (see question 3) and 23 of the 32 also had the means to authenticate guests (see question 4). The remaining 9 offered open-access comput- ers. Fourteen libraries had both the means to authenticate guests and had open-access computers (see questions 4 and 7). When we compare the results of the 2007 study by Cook and Shelton with the results of the current study (completed in 2008), the results are somewhat contradic- tory (see table 1).46 The differences in survey data seem to indicate that authentication requirements are decreasing; however, the literature review—specifically Cook and Shelton and the 2003 Courtney article—clearly indicate that authentica- tion is on the rise.47 This dichotomy may be explained, in part, by the fact that of the more than 60 ARL libraries responding to both surveys, there was an overlap of only 34 libraries. The 30 U.S. Federal Depository or Canadian Depository Services libraries that required their affiliated users to authenticate (see questions 3 and 5) provided guest access ranging from usernames and passwords, to open-access computers, to computers restricted to libraries gave the URL to their policy; 4 summarized their policies. ■■ Research questions answered The study resulted in answers to the questions we posed at the outset: ■■ Thirty-two (52 percent) of the responding ARL libraries required affiliated users to login to public computer workstations in the library. ■■ Twenty-three (72 percent) of the 32 ARL libraries requiring affiliated users to login to public computers provided the means for guest users to login to public computer workstations in the library. ■■ Fifty (82 percent) of 61 responding ARL libraries provided open-access computers for guest users; 14 (28 percent) of those 50 libraries provided both open-access computers and the means for guest authentication. ■■ Without exception, all U.S. 
Federal Depository or Canadian Depository Services Libraries that required their users to authenticate offered guest users some form of access to online information. ■■ Survey results indicated some differences between software provided to various users on differently accessed computers. Office software was less fre- quently provided on open-access computers. ■■ Twenty-eight responding ARL libraries had written policies relating to the use of open-access computers. ■■ Fifteen responding ARL libraries had written policies relating to the authorization of guests. Figure 9. Electronic resources on open access computers (N = 50) Figure 10. Comparison of library type and authentication requirement Number of libraries AutHeNticAtioN AND Access | weBer AND lAwreNce 135 ■■ One library had guidelines for use posted next to the workstations but did not give specifics. ■■ Fourteen of those requiring their users to authen- ticate had both open-access computers and guest authentication to offer to visitors of their libraries. Other policy information was obtained by an exami- nation of the 28 websites listed by respondents: ■■ Ten of the sites specifically stated that the open-access computers were for academic use only. ■■ Five of the sites specified time limits for use of open- access computers, ranging from 30 to 90 minutes. ■■ Four stated that time limits would be enforced when others were waiting to use computers. ■■ One library used a sign-in sheet to monitor time limits. ■■ One library mentioned a reservation system to moni- tor time limits. ■■ Two libraries prohibited online gambling. ■■ Six libraries prohibited viewing sexually explicit materials. ■■ Guest-authentication policies Of the 23 libraries that had the means to authenticate their guests, 15 had a policy for guests obtaining a username and password to authenticate, and 6 outlined their requirements of showing identification and issuing access. The other 9 had open-access computers that guests might use. The following are some of the varied approaches to guest authentication: ■■ Duration of the access (when mentioned) ranged from 30 days to 12 months. ■■ One library had a form of sponsored access where current faculty or staff could grant a temporary user- name and password to a visitor. ■■ One library had an online vouching system that allowed the visitor to issue his or her own username and password online. ■■ One library allowed guests to register themselves by swiping an ID or credit card. ■■ One library had open-access computers for local resources and only required authentication to leave the library domain. ■■ One library had the librarians log the users in as guests. ■■ One library described the privacy protection of col- lected personal information. ■■ No library mentioned charging a fee for allowing computer access. government documents, to librarians logging in for guests (see question 6). Numbers of open-access comput- ers ranged widely from 2 to more than 3,000 (see question 7). Eleven (19 percent) of the responding U.S. Federal Depository or Canadian Depository Services libraries that did not provide open-access computers issued a tempo- rary ID (nine libraries), provided open access limited to government documents (one library), or required librar- ian login for each guest (one library). All libraries with U.S. Federal Depository or Canadian Depository Services status provided a means of public access to information to fulfill their obligation to offer government documents to guests. 
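To illustrate the kind of temporary credential several of these policies describe, the sketch below (hypothetical; it is not drawn from any responding library's system) issues a guest username and password that expire after a stated period, in line with the 30-day to 12-month durations reported above:

# Hypothetical sketch: issue a temporary guest login with an expiration date, along
# the lines of the temporary user IDs and passwords described in the survey responses.
# A production system would store only a hashed password, not the plain text returned here.
import datetime
import secrets

def issue_guest_account(sponsor_id, days_valid=30):
    """Return a guest credential valid for days_valid days (surveyed policies ran 30 days to 12 months)."""
    if not 30 <= days_valid <= 365:
        raise ValueError("duration outside the 30-day to 12-month range reported in the survey")
    username = "guest-" + secrets.token_hex(4)   # e.g., guest-9f2a01bc
    password = secrets.token_urlsafe(12)         # random single-use password
    expires = datetime.date.today() + datetime.timedelta(days=days_valid)
    return {"username": username, "password": password,
            "expires": expires.isoformat(), "sponsor": sponsor_id}

# Example: a staff member sponsors a visiting class for 30 days of access.
print(issue_guest_account(sponsor_id="sponsoring-staff-member", days_valid=30))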
Figure 11 shows a comparison of resources available to authenticated users and authenticated guests and offered on open-access computers. As might be expected, almost all institutions provided access to online catalogs, government documents, and Internet browsers. Fewer allowed access to licensed electronic resources and e-mail. Access to Office software showed the most dramatic drop in availability, especially on open-access computers.

■■ Open-access computer policies

As mentioned earlier, 28 libraries had written policies for their open-access computers (see question 11), and 28 libraries gave a URL, a URL plus a summary explanation, or a summary explanation with no URL (see question 12). In most instances, the library policy included their campus's acceptable-use policy. Seven libraries cited their campus's acceptable-use policy and nothing else. Nearly all libraries applied the same acceptable-use policy to all users on all computers and made no distinction between policies for use of open-access computers or computers requiring authentication.

Following are some of the varied aspects of summarized policies pertaining to open-access computers:
■■ Eight libraries stated that the computers were for academic use and that users might be asked to give up their workstation if others were waiting.

Table 1. Comparison of findings from Cook and Shelton (2007) and the current survey (2008)

Authentication requirements     2007 (N = 69)     2008 (N = 61)
Some required                   28 (46%)          23 (38%)
Required for all                15 (25%)           9 (15%)
Not required                    18 (30%)          29 (48%)

■■ Further study

Although the survey answered many of our questions, other questions arose. While the number of libraries requiring affiliated users to log on to their public computers is increasing, this study does not explain why this is the case. Reasons could include reactions to the September 11 disaster, the USA PATRIOT Act, general security concerns, or the convenience of the personalized desktop and services for each authenticated user. Perhaps a future investigation could focus on reasons for more frequent requirement of authentication. Other subjects that arose in the examination of institutional policies were guest fees for services, age limits for younger users, computer time limits for guests, and collaboration between academic and public libraries.

■■ Policy developed as a result of the survey findings

As a result of what was learned in the survey, we drafted guidelines governing the use of open-access computers by visitors and other non-university users. The guidelines can be found at http://lib.mnsu.edu/about/libvisitors.html#access. These guidelines inform guests that open-access computers are available to support their research, study, and professional activities. The computers also are governed by the campus policy and the state university system acceptable-use policy. Guideline provisions enable staff to ask users to relinquish a computer when others are waiting or if the computer is not being used for academic purposes. While this library has the ability to generate temporary usernames and passwords, and does so for local schools coming to the library for research, no guidelines have yet been put in place for this function.

Figure 11.
Online resources available to authenticated affiliated users, guest users, open-access users AutHeNticAtioN AND Access | weBer AND lAwreNce 137 These practices depend on institutional missions and goals and are limited by reasonable considerations. In the past, accommodation at some level was generally offered to the community, but the complications of affili- ate authentication, guest registration, and vendor-license restrictions may effectively discourage or prevent outside users from accessing principal resources. On the other hand, open-access computers facilitate access to electronic resources. Those librarians who wish to provide the same level of commitment to guest users as in the past as well as protect the rights of all should advocate to campus policy-makers at every level to allow appropriate guest access to computers to fulfill the library’s mission. In this way, the needs and rights of guest users can be balanced with the responsibilities of using campus computers. In addition, librarians should consider ensuring that the licenses of all electronic resources accommodate walk-in users and developing guidelines to prevent incor- poration of electronic materials that restrict such use. This is essential if the library tradition of freedom of access to information is to continue. Finally, in regard to external or guest users, academic librarians are pulled in two directions; they are torn between serving primary users and fulfilling the prin- ciples of intellectual freedom and free, universal access to information along with their obligations as Federal Depository libraries. At the same time, academic librar- ians frequently struggle with the goals of the campus administration responsible for providing secure, reliable networks, sometimes at the expense of the needs of the outside community. The data gathered in this study, indicating that 82 percent of responding libraries con- tinue to provide at least some open-access computers, is encouraging news for guest users. Balancing public access and privacy with institutional security, while a current concern, may be resolved in the way of so many earlier preoccupations of the electronic age. Given the pervasiveness of the problem, however, fair and equitable treatment of all library users may continue to be a central concern for academic libraries for years to come. References 1. Lori Driscoll, Library Public Access Workstation Authentica- tion, SPEC Kit 277 (Washington, D.C.: Association of Research Libraries, 2003). 2. Martin Cook and Mark Shelton, Managing Public Comput- ing, SPEC Kit 302 (Washington, D.C.: Association of Research Libraries, 2007): 16. 3. H. Vail Deale, “Public Relations of Academic Libraries,” Library Trends 7 (Oct. 1958): 269–77. 4. Ibid., 275. 5. E. J. Josey, “The College Library and the Community,” Faculty Research Edition, Savannah State College Bulletin (Dec. 1962): 61–66. ■■ Conclusions While we were able to gather more than 50 years of litera- ture pertaining to unaffiliated users in academic libraries, it soon became apparent that the scope of consideration changed radically through the years. In the early years, there was discussion about the obligation to provide service and access for the community balanced with the challenge to serve two clienteles. Despite lengthy debate, there was little exception to offering the community some level of service within academic libraries. 
Early preoccupation with physical access, material loans, ILL, basic reference, and other services later became a discus- sion of the right to use computers, electronic resources, and other services without imposing undue difficulty to the guest. Current discussions related to guest users reflect obvious changes in public computer administration over the years. Authentication presently is used at a more fundamental level than in earlier years. In many librar- ies, users must be authorized to use the computer in any way whatsoever. As more and more institutions require authentication for their primary users, accommodation must be made if guests are to continue being served. In addition, as Courtney’s 2003 research indicates, an ever increasing number of electronic databases, indexes, and journals replace print resources in library collections. This multiplies the roadblocks for guest users and exacerbates the issue.48 Unless special provisions are made for com- puter access, community users are left without access to a major part of the library’s collections. Because 104 of the 123 ARL libraries (85 percent) are Federal Depository or Canadian Depository Services Libraries, the researchers hypothesized that most librar- ies responding to the survey would offer open-access computers for the use of nonaffiliated patrons. This study has shown that Federal Depository Libraries have remained true to their mission and obligation of provid- ing public access to government-generated documents. Every Federal Depository respondent indicated that some means was in place to continue providing visitor and guest access to the majority of their electronic resources— whether through open-access computers, temporary or guest logins, or even librarians logging on for users. While access to government resources is required for the librar- ies housing government-document collections, libraries can use considerably more discretion when considering what other resources guest patrons may use. Despite the commitment of libraries to the dissemination of govern- ment documents, the increasing use of authentication may ultimately diminish the libraries’ ability and desire to accommodate the information needs of the public. This survey has provided insight into the various ways academic libraries serve guest users. Not all academic libraries provide public access to all library resources. 138 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 Identify Yourself,” Chronicle of Higher Education 50, no. 42 (June 25, 2004): A39, http://search.ebscohost.com/login.aspx?direct =true&db=aph&AN=13670316&site=ehost-live (accessed Mar. 2, 2009). 28. Diana Oblinger, “IT Security and Academic Values,” in Luker and Petersen, Computer & Network Security in Higher Edu- cation, 4, http://net.educause.edu/ir/library/pdf/pub7008e .pdf (accessed July 14, 2008). 29. Ibid., 5. 30. “Access for Non-Affiliated Users,” Library & Information Update 7, no. 4 (2008): 10. 31. Paul Salotti, “Introduction to HAERVI-HE Access to E-Resources in Visited Institutions,” SCONUL Focus no. 39 (Dec. 2006): 22–23, http://www.sconul.ac.uk/publications/ newsletter/39/8.pdf (accessed July 14, 2008). 32. Ibid., 23. 33. Universities and Colleges Information Systems Asso- ciation (UCISA), HAERVI: HE Access to E-Resources in Visited Institutions, (Oxford: UCISA, 2007), http://www.ucisa.ac.uk/ publications/~/media/Files/members/activities/haervi/ haerviguide%20pdf (accessed July 14, 2008). 34. 
Nancy Courtney, “Barbarians at the Gates: A Half-Century of Unaffiliated Users in Academic Libraries,” Journal of Academic Librarianship 27, no. 6 (Nov. 2001): 473–78, http://search.ebsco host.com/login.aspx?direct=true&db=aph&AN=5602739&site= ehost-live (accessed July 14, 2008). 35. Ibid., 478. 36. Nancy Courtney, “Unaffiliated Users’ Access to Academic Libraries: A Survey,” Journal of Academic Librarianship 29, no. 1 (Jan. 2003): 3–7, http://search.ebscohost.com/login.aspx?dire ct=true&db=aph&AN=9406155&site=ehost-live (accessed July 14, 2008). 37. Ibid., 5. 38. Ibid., 6. 39. Ibid., 7. 40. Nancy Courtney, “Authentication and Library Public Access Computers: A Call for Discussion,” College & Research Libraries News 65, no. 5 (May 2004): 269–70, 277, www.ala .org/ala/mgrps/divs/acrl/publications/crlnews/2004/may/ authentication.cfm (accessed July 14, 2008). 41. Terry Plum and Richard Bleiler, User Authentication, SPEC Kit 267 (Washington, D.C.: Association of Research Libraries, 2001): 9. 42. Lori Driscoll, Library Public Access Workstation Authentica- tion, SPEC Kit 277 (Washington, D.C.: Association of Research Libraries, 2003): 11. 43. Cook and Shelton, Managing Public Computing. 44. Ibid., 15. 45. Plum and Bleiler, User Authentication, 9; Driscoll, Library Public Access Workstation Authentication, 11; Cook and Shelton, Managing Public Computing, 11. 46. Cook and Shelton, Managing Public Computing, 15. 47. Ibid.; Courtney, Unaffiliated Users, 5–7. 48. Courtney, Unaffiliated Users, 6–7. 6. Ibid., 66. 7. H. Vail Deale, “Campus vs. Community,” Library Journal 89 (Apr. 15, 1964): 1695–97. 8. Ibid., 1696. 9. John Waggoner, “The Role of the Private University Library,” North Carolina Libraries 22 (Winter 1964): 55–57. 10. E. J. Josey, “Community Use of Academic Libraries: A Symposium,” College & Research Libraries 28, no. 3 (May 1967): 184–85. 11. E. J. Josey, “Implications for College Libraries,” in “Com- munity Use of Academic Libraries,” 198–202. 12. Don L. Tolliver, “Citizens May Use Any Tax-Supported Library?” Wisconsin Library Bulletin (Nov./Dec. 1976): 253. 13. Ibid., 254. 14. Ralph E. Russell, “Services for Whom: A Search for Iden- tity,” Tennessee Librarian: Quarterly Journal of the Tennessee Library Association 31, no. 4 (Fall 1979): 37, 39. 15. Ralph E. Russell, Carolyn L. Robison, and James E. Prather, “External User Access to Academic Libraries,” The Southeastern Librarian 39 (Winter 1989): 135. 16. Ibid., 136. 17. Brenda L. Johnson, “A Case Study in Closing the Univer- sity Library to the Public,” College & Research Library News 45, no. 8 (Sept. 1984): 404–7. 18. Lloyd M. Jansen, “Welcome or Not, Here They Come: Unaffiliated Users of Academic Libraries,” Reference Services Review 21, no. 1 (Spring 1993): 7–14. 19. Mary Ellen Bobp and Debora Richey, “Serving Secondary Users: Can It Continue?” College & Undergraduate Libraries 1, no. 2 (1994): 1–15. 20. Eric Lease Morgan, “Access Control in Libraries,” Com- puters in Libraries 18, no. 3 (Mar. 1, 1998): 38–40, http://search .ebscohost.com/login.aspx?direct=true&db=aph&AN=306709& site=ehost-live (accessed Aug. 1, 2008). 21. Susan K. Martin, “A New Kind of Audience,” Journal of Academic Librarianship 24, no. 6 (Nov. 1998): 469, Library, Infor- mation Science & Technology Abstracts, http://search.ebsco host.com/login.aspx?direct=true&db=aph&AN=1521445&site= ehost-live (accessed Aug. 8, 2008). 22. Peggy Johnson, “Serving Unaffiliated Users in Publicly Funded Academic Libraries,” Technicalities 18, no. 1 (Jan. 1998): 8–11. 23. 
Julie Still and Vibiana Kassabian, “The Mole’s Dilemma: Ethical Aspects of Public Internet Access in Academic Libraries,” Internet Reference Services Quarterly 4, no. 3 (1999): 9. 24. Clifford Lynch, “Authentication and Trust in a Networked World,” Educom Review 34, no. 4 (Jul./Aug. 1999), http://search .ebscohost.com/login.aspx?direct=true&db=aph&AN=2041418 &site=ehost-live (accessed July 16, 2008). 25. Rita Barsun, “Library Web Pages and Policies Toward ‘Outsiders’: Is the Information There?” Public Services Quarterly 1, no. 4 (2003): 11–27. 26. Ibid., 24. 27. Scott Carlson, “To Use That Library Computer, Please AutHeNticAtioN AND Access | weBer AND lAwreNce 139 Appendix A. The Survey Introduction, Invitation to Participate, and Forward Dear ARL Member Library, As part of a professional research project, we are attempting to determine computer authentication and current com- puter access practices within ARL libraries. We have developed a very brief survey to obtain this information which we ask one representative from your institution to complete before April 25, 2008. The survey is intended to reflect practices at the main or central library on your campus. Names of libraries responding to the survey may be listed but no identifying information will be linked to your responses in the analysis or publication of results. If you have any questions about your rights as a research participant, please contact Anne Blackhurst, Minnesota State University, Mankato IRB Administrator. Anne Blackhurst, IRB Administrator Minnesota State University, Mankato College of Graduate Studies & Research 115 Alumni Foundation Mankato, MN 56001 (507)389-2321 anne.blackhurst@mnsu.edu You may preview the survey by scrolling to the text below this message. If, after previewing you believe it should be handled by another member of your library team, please forward this message appropriately. Alternatively, you may print the survey, answer it manually and mail it to: Systems/ Access Services Survey Library Services Minnesota State University, Mankato ML 3097—PO Box 8419 Mankato, MN 56001-8419 (USA) We ask you or your representative to take 5 minutes to answer 14 questions about computer authentication practices in your main library. Participation is voluntary, but follow-up reminders will be sent. This e-mail serves as your informed consent for this study. Your participation in this study includes the completion of an online survey. Your name and iden- tity will not be linked in any way to the research reports. Clicking the link to take the survey shows that you understand you are participating in the project and you give consent to our group to use the information you provide. You have the right to refuse to complete the survey and can discontinue it at any time. To take part in the survey, please click the link at the bottom of this e-mail. Thank you in advance for your contribution to our project. If you have questions, please direct your inquiries to the contacts given below. Thank you for responding to our invitation to participate in the survey. This survey is intended to determine current academic library practices for computer authentication and open access. Your participation is greatly appreciated. Below are the definitions of terms used within this survey: ■■ “Authentication”: a username and password are required to verify the identity and status of the user in order to log on to computer workstations in the library. ■■ “Affiliated user”: a library user who is eligible for campus privileges. 
■■ “Non-affiliated user”: a library user who is not a member of the institutional community (an alumnus may be a non- affiliated user). This may be used interchangeably with “guest user.” ■■ “Guest user”: visitor, walk-in user, nonaffiliated user. ■■ “Open Access Computer”: Computer workstation that does not require authentication by user. 140 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 Appendix B. Responding Institutions 1. University at Albany State University of New York 2. University of Alabama 3. University of Alberta 4. University of Arizona 5. Arizona State University 6. Boston College 7. University of British Columbia 8. University at Buffalo, State University of NY 9. Case Western Reserve University 10. University of California Berkeley 11. University of California, Davis 12. University of California, Irvine 13. University of Chicago 14. University of Colorado at Boulder 15. University of Connecticut 16. Columbia University 17. Dartmouth College 18. University of Delaware 19. University of Florida 20. Florida State University 21. University of Georgia 22. Georgia Tech 23. University of Guelph 24. Howard University 25. University of Illinois at Urbana-Champaign 26. Indiana University Bloomington 27. Iowa State University 28. Johns Hopkins University 29. University of Kansas 30. University of Louisville 31. Louisiana State University 32. McGill University 33. University of Maryland 34. University of Massachusetts Amherst 35. University of Michigan 36. Michigan State University 37. University of Minnesota 38. University of Missouri 39. Massachusetts Institute of Technology 40. National Agricultural Library 41. University of Nebraska-Lincoln 42. New York Public Library 43. Northwestern University 44. Ohio State University 45. Oklahoma State University 46. University of Oregon 47. University of Pennsylvania 48. University of Pittsburgh 49. Purdue University 50. Rice University 51. Smithsonian Institution 52. University of Southern California 53. Southern Illinois University Carbondale 54. Syracuse University 55. Temple University 56. University of Tennessee 57. Texas A&M University 58. Texas Tech University 59. Tulane University 60. University of Toronto 61. Vanderbilt University 3140 ---- tHe Next GeNerAtioN liBrArY cAtAloG | ZHou 151Are Your DiGitAl DocuMeNts weB FrieNDlY? | ZHou 151 Are Your Digital Documents Web Friendly?: Making Scanned Documents Web Accessible The Internet has greatly changed how library users search and use library resources. Many of them prefer resources available in electronic format over tradi- tional print materials. While many docu- ments are now born digital, many more are only accessible in print and need to be digitized. This paper focuses on how the Colorado State University Libraries cre- ates and optimizes text-based and digitized PDF documents for easy access, download- ing, and printing. T o digitize print materials, we normally scan originals, save them in archival digital formats, and then make them Web- accessible. There are two types of print documents, graphic-based and text-based. If we apply the same tech- niques to digitize these two different types of materials, the documents produced will not be Web-friendly. Graphic-based materials include archival resources such as his- torical photographs, drawings, manuscripts, maps, slides, and post- ers. We normally scan them in color at a very high resolution to capture and present a reproduction that is as faithful to the original as possible. 
Then we save the scanned images in TIFF (Tagged Image File Format) for archival purposes and convert the TIFFs to JPEG (Joint Photographic Experts Group) 2000 or JPEG for Web access. However, the same practice is not suitable for modern text-based documents, such as reports, jour- nal articles, meeting minutes, and theses and dissertations. Many old text-based documents (e.g., aged newspapers and books), should be Yongli ZhouTutorial files for fast Web delivery as access files. For text-based files, access files normally are PDFs that are converted from scanned images. “BCR’s CDP Digital Imaging Best Practices Version 2.0” says that the master image should be the highest quality you can afford, it should not be edited or processed for any specific output, and it should be uncom- pressed.1 This statement applies to archival images, such as photographs, manuscripts, and other image-based materials. If we adopt the same approach for modern text documents, the result may be problematic. PDFs that are created from such master files may have the following drawbacks: ■■ Because of their large file size, they require a long download time or cannot be downloaded because of a timeout error. ■■ They may crash a user’s com- puter because they use more memory while viewing. ■■ They sometimes cannot be printed because of insufficient printer memory. ■■ Poor print and on-screen view- ing qualities can be caused by background noise and bleed- through of text. Background noise can be caused by stains, highlighter marks made by users, and yellowed paper from aged documents. ■■ The OCR process sometimes does not work for high-resolu- tion images. ■■ Content creators need to spend more time scanning images at a high resolution and converting them to PDF documents. Web-friendly files should be small, accessible by most users, full-text searchable, and have good treated as graphic-based material. These documents often have faded text, unusual fonts, stains, and col- ored background. If they are scanned using the same practice as modern text documents, the document cre- ated can be unreadable and contain incorrect information. This topic is covered in the section “Full-Text Searchable PDFs and Troubleshooting OCR Errors.” Currently, PDF is the file format used for most digitized text docu- ments. While PDFs that are created from high-resolution color images may be of excellent quality, they can have many drawbacks. For exam- ple, a multipage PDF may have a large file size, which increases down- load time and the memory required while viewing. Sometimes the down- load takes so long it fails because a time-out error occurs. Printers may have insufficient memory to print large documents. In addition, the Optical Character Recognition (OCR) process is not accurate for high- resolution images in either color or grayscale. As we know, users want the ability to easily download, view, print, and search online textual docu- ments. All of the drawbacks created by high-quality scanning defeat one of the most important purposes of digitizing text-based documents: making them accessible to more users. This paper addresses how Colorado State University Libraries (CSUL) manages these problems and others as staff create Web-friendly digitized textual documents. Topics include scanning, long-time archiving, full-text searchable PDFs and troubleshooting OCR problems, and optimizing PDF files for Web delivery. 
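The download and printing drawbacks listed above largely come down to raw file size per page. As a quick illustration only, and not part of the workflow described in this article, the short Python sketch below reports a PDF's total size and average size per page using the pypdf library; the file name is a placeholder.

import os
from pypdf import PdfReader

# Hypothetical access copy; oversized per-page averages suggest the PDF was
# built straight from uncompressed or color master images.
path = "access_copy.pdf"
size_kb = os.path.getsize(path) / 1024
page_count = len(PdfReader(path).pages)

print(f"{path}: {size_kb:.0f} KB, {page_count} pages, "
      f"{size_kb / page_count:.0f} KB per page on average")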
Preservation Master Files and Access Files For digitization projects, we normally refer to images in uncompressed TIFF format as master files and compressed Yongli Zhou is Digital repositories librarian, Colorado State university libraries, Colorado State university, fort Collins, Colorado 152 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010152 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 factors that determine PDF file size. Color images typically generate the largest PDFs and black-and-white images generate the smallest PDFs. Interestingly, an image of smaller file size does not necessarily generate a smaller PDF. Table 1 shows how file format and color mode affect PDF file size. The source file is a page contain- ing black-and-white text and line art drawings. Its physical dimensions are 8.047" by 10.893". All images were scanned at 300 dpi. CSUL uses Adobe Acrobat Professional to create PDFs from scanned images. The current ver- sion we use is Adobe Acrobat 9 Professional, but most of its features listed in this paper are available for other Acrobat versions. When Acrobat converts TIFF images to a PDF, it compresses images. Therefore a final PDF has a smaller file size than the total size of the original images. Acrobat compresses TIFF uncom- pressed, LZW, and Zip the same amount and produces PDFs of the same file size. Because our in-house scanning software does not support TIFF G4, we did not include TIFF G4 test data here. By comparing simi- lar pages, we concluded that TIFF G4 works the same as TIFF uncom- pressed, LZW, and Zip. For example, if we scan a text-based page as black- and-white and save it separately in TIFF uncompressed, LZW, Zip, or G4, then convert each page into a PDF, the final PDF will have the same file size without a noticeable quality difference. TIFF JPEG generates the smallest PDF, but it is a lossy format, so it is not recommended. Both JPEG and JPEG 2000 have smaller file sizes but generate larger PDFs than those converted from TIFF images. recommendations 1. Use TIFF uncompressed or LZW in 24 bits color for pages with color graphs or for historical doc- uments. 2. Use TIFF uncompressed or LZW compress an image up to 50 per- cent. Some vendors hesitate to use this format because it was proprietary; however, the patent expired on June 20, 2003. This format has been widely adopted by much software and is safe to use. CSUL saves all scanned text documents in this format. ■■ TIFF Zip: This is a lossless compression. Like LZW, ZIP compression is most effective for images that contain large areas of single color. 2 ■■ TIFF JPEG: This is a JPEG file stored inside a TIFF tag. It is a lossy compression, so CSUL does not use this file format. Other image formats: ■■ JPEG: This format is a lossy com- pression and can only be used for nonarchival purposes. A JPEG image can be converted to PDF or embedded in a PDF. However, a PDF created from JPEG images has a much larger file size com- pared to a PDF created from TIFF images. ■■ JPEG 2000: This format’s file extension is .jp2. This format offers superior compression per- formance and other advantages. JPEG 2000 normally is used for archival photographs, not for text-based documents. In short, scanned images should be saved as TIFF files, either with compression or without. We recom- mend saving text-only pages and pages containing text and/or line art as TIFF G4 or TIFF LZW. We also recommend saving pages with photo- graphs and illustrations as TIFF LZW. 
We also recommend saving pages with photographs and illustrations as TIFF uncompressed or TIFF LZW. How Image Format and Color Mode Affect PDF File Size Color mode and file format are two on-screen viewing and print quali- ties. In the following sections, we will discuss how to make scanned docu- ments Web-friendly. Scanning There are three main factors that affect the quality and file size of a digitized document: file format, color mode, and resolution of the source images. These factors should be kept in mind when scanning text documents. File Format and compression Most digitized documents are scanned and saved as TIFF files. However, there are many different formats of TIFF. Which one is appro- priate for your project? ■■ TIFF: Uncompressed format. This is a standard format for scanned images. However, an uncom- pressed TIFF file has the largest file size and requires more space to store. ■■ TIFF G3: TIFF with G3 compres- sion is the universal standard for faxs and multipage line-art documents. It is used for black- and-white documents only. ■■ TIFF G4: TIFF with G4 com- pression has been approved as a lossless archival file format for bitonal images. TIFF images saved in this compression have the smallest file size. It is a stan- dard file format used by many commercial scanning vendors. It should only be used for pages with text or line art. Many scan- ning programs do not provide this file format by default. ■■ TIFF Huffmann: A method for compressing bi-level data based on the CCITT Group 3 1D fac- simile compression schema. ■■ TIFF LZW: This format uses a lossless compression that does not discard details from images. It may be used for bitonal, gray- scale, and color images. It may tHe Next GeNerAtioN liBrArY cAtAloG | ZHou 153Are Your DiGitAl DocuMeNts weB FrieNDlY? | ZHou 153 to be scanned at no less than 600 dpi in color. Our experiments show that documents scanned at 300 or 400 dpi are sufficient for creating PDFs of good quality. Resolutions lower than 300 dpi are not recom- mended because they can degrade image quality and produce more OCR errors. Resolutions higher than 400 dpi also are not recommended because they generate large files with little improved on-screen viewing and print quality. We compared PDF files that were converted from images of resolutions at 300, 400, and 600 dpi. Viewed at 100 percent, the differ- ence in image quality both on screen and in print was negligible. If a page has text with very small font, it can be scanned at a higher resolution to improve OCR accuracy and viewing and print quality. Table 2 shows that high-resolu- tion images produce large files and require more time to be converted into PDFs. The time required to combine images is not significantly different compared to scanning time and OCR time, so it was omitted. Our example is a modern text docu- ment with text and a black-and-white chart. Most of our digitization projects do not require scanning at 600 dpi; 300 dpi is the minimum requirement. We use 400 dpi for most documents and choose a proper color mode for each page. For example, we scan our theses and dissertations in black-and- white at 400 dpi for bitonal pages. We scan pages containing photographs or illustrations in 8-bit grayscale or 24-bit color at 400 dpi. Other Factors that Affect PDF File Size In addition to the three main fac- tors we have discussed, unnecessary edges, bleed-through of text and graphs, background noise, and blank pages also increase PDF file sizes. 
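The format and color-mode guidance above (G4 or LZW for bitonal text pages, LZW or uncompressed for grayscale and color pages) can be compared outside of Acrobat as well. The following Python sketch is offered only as an illustration, using the Pillow imaging library and hypothetical file names: it saves one scan under several of the compressions discussed and prints the resulting file sizes.

import os
from PIL import Image

scan = Image.open("page_scan.tif")  # hypothetical 400 dpi master scan

# Derivatives to compare; convert("1") uses Pillow's default dithering,
# which is acceptable for a rough size comparison.
variants = {
    "text_g4.tif": (scan.convert("1"), "group4"),        # bitonal, CCITT G4: text or line art
    "text_lzw.tif": (scan.convert("1"), "tiff_lzw"),      # bitonal, LZW
    "gray_lzw.tif": (scan.convert("L"), "tiff_lzw"),      # 8-bit grayscale, LZW: photo pages
    "color_lzw.tif": (scan.convert("RGB"), "tiff_lzw"),   # 24-bit color, LZW: covers, illustrations
}

for name, (image, compression) in variants.items():
    image.save(name, compression=compression, dpi=(400, 400))
    print(f"{name}: {os.path.getsize(name) / 1024:.0f} KB")

Comparing the sizes of derivatives like these against the figures reported in this article is a reasonable sanity check before committing to a scanning profile for a collection.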
Figure 1 shows how a clean scan can largely reduce a PDF file size and cover. The updated file has a file size of 42.8 MB. The example can be accessed at http://hdl.handle .net/10217/3667. Sometimes we scan a page containing text and photo- graphs or illustrations twice, in color or grayscale and in black-and-white. When we create a PDF, we com- bine two images of the same page to reproduce the original appearance and to reduce file size. How to opti- mize PDFs using multiple scans will be discussed in a later section. How Image Resolution Affects PDF File Size Before we start scanning, we check with our project manager regarding project standards. For some funded projects, documents are required in grayscale 8 bits for pages with black-and-white photographs or grayscale illustrations. 3. Use TIFF uncompressed, LZW, or G4 in black-and-white for pages containing text or line art. To achieve the best result, each page should be scanned accordingly. For example, we had a document with a color cover, 790 pages containing text and line art, and 7 blank pages. We scanned the original document in color at 300 dpi. The PDF created from these images was 384 MB, so large that it exceeded the maximum file size that our repository software allows for uploading. To optimize the document, we deleted all blank pages, converted the 790 pages with text and line art from color to black- and-white, and retained the color Table 1. File format and color mode versus PDF file size File Format Scan Specifications TIFF Size (KB) PDF Size (KB) TIFF Color 24 bits 23,141 900 TIFF LZW Color 24 bits 5,773 900 TIFF ZIP Color 24 bits 4,892 900 TIFF JPEG Color 24 bits 4,854 873 JPEG 2000 Color 24 bits 5,361 5,366 JPEG Color 24 bits 4,849 5,066 TIFF Grayscale 8 bits 7,729 825 TIFF LZW Grayscale 8 bits 2,250 825 TIFF ZIP Grayscale 8 bits 1,832 825 TIFF JPEG Grayscale 8 bits 2,902 804 JPEG 2000 Grayscale 8 bits 2,266 2,270 JPEG Grayscale 8 bits 2,886 3,158 TIFF Black-and-white 994 116 TIFF LZW Black-and-white 242 116 TIFF ZIP Black-and-white 196 116 note: Black-and-white scans cannot be saved in JPEg, JPEg 2000, or TIff JPEg formats. 154 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010154 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 Many PDF files cannot be saved as PDF/A files. If an error occurs when saving a PDF to PDF/A, you may use Adobe Acrobat Preflight (Advanced > Preflight) to identify problems. See figure 2. Errors can be created by non- embedded fonts, embedded images with unsupported file compression, bookmarks, embedded video and audio, etc. By default, the Reduce File Size procedure in Acrobat Professional compresses color images using JPEG 2000 compression. After running the Reduce File Size pro- cedure, a PDF may not be saved as a PDF/A because of a “JPEG 2000 compression used” error. According to the PDF/A Competence Center, this problem will be eliminated in the second part of the PDF/A standard— PDF/A-2 is planned for 2008/2009. There are many other features in new PDFs; for example, transparency and layers will be allowed in PDF/A- 2.5 However, at the time this paper was written PDF/A-2 had not been announced.6 portable, which means the file cre- ated on one computer can be viewed with an Acrobat viewer on other computers, handheld devices, and on other platforms.3 A PDF/A document is basically a traditional PDF document that fulfills precisely defined specifications. 
The PDF/A standard aims to enable the creation of PDF documents whose visual appearance will remain the same over the course of time. These files should be software-independent and unrestricted by the systems used to create, store, and reproduce them.4 The goal of PDF/A is for long-term archiving. A PDF/A document has the same file extension as a regular PDF file and must be at least compat- ible with Acrobat Reader 4. There are many ways to cre- ate a PDF/A document. You can convert existing images and PDF files to PDF/A files, export a doc- ument to PDF/A format, scan to PDF/A, to name a few. There are many software programs you can use to create PDF/A, such as Adobe Acrobat Professional 8 and later ver- sions, Compart AG, PDFlib, and PDF Tools AG. simultaneously improve its viewing and print quality. Recommendations 1. Unnecessary edges: Crop out. 2. Bleed-through text or graphs: Place a piece of white or black card stock on the back of a page. If a page is single sided, use white card stock. If a page is double sided, use black card stock and increase contrast ratio when scanning. Often color or grayscale images have bleed- through problems. Scanning a page containing text or line art as black-and-white will eliminate bleed-through text and graphs. 3. Background noise: Scanning a page containing text or line art as black-and-white can elimi- nate background noise. Many aged documents have yellowed papers. If we scan them as color or grayscale, the result will be images with yellow or gray back- ground, which may increase PDF file sizes greatly. We also recom- mend increasing the contrast for better OCR results when scanning documents with background colors. 4. Blank pages: Do not include if they are not required. Blank pages scanned in grayscale or color can quickly increase file size. PDF and Long- Term Archiving PDF/A PDF vs. PDF/A PDF, short for Portable Document Format, was developed by Adobe as a unique format to be viewed through Adobe Acrobat view- ers. As the name implies, it is Table 2. Color Mode and Image Resolution vs. PDF File Size Color mode Resolution (DPI) Scanning time (sec.) OCR time (sec.) TIFF LZW (KB) PDF size (KB) color 600 100 N/A* 16,498 2,391 color 400 25 35 7,603 1,491 color 300 18 16 5,763 952 grayscale 600 36 33 6,097 2,220 grayscale 400 18 18 2,888 1370 grayscale 300 14 12 2,240 875 B/W 600 12 18 559 325 B/W 400 10 10 333 235 B/W 300 8 9 232 140 *n/a due to an oCr error tHe Next GeNerAtioN liBrArY cAtAloG | ZHou 155Are Your DiGitAl DocuMeNts weB FrieNDlY? | ZHou 155 able. This option keeps the origi- nal image and places an invisible text layer over it. Recommended for cases requiring maximum fidelity to the original image.8 This is the only option used by CSUL. 2. Searchable Image: Ensures that text is searchable and selectable. This option keeps the original image, de-skews it as needed, and places an invisible text layer over it. The selection for downs- ample images in this same dia- log box determines whether the image is downsampled and to what extent.9 The downsam- pling combines several pixels in an image to make a single larger pixel; thus some informa- tion is deleted from the image. However, downsampling does not affect the quality of text or line art. When a proper setting is used, the size of a PDF can be significantly reduced with little or no loss of detail and precision. 3. 
ClearScan: Synthesizes a new Type 3 font that closely approxi- mates the original, and preserves the page background using a low-resolution copy.10 The final PDF is the same as a born-dig- ital PDF. Because Acrobat can- not guarantee the accuracy of manipulate the PDF document for accessibility. Once OCR is properly applied to the scanned files, how- ever, the image becomes searchable text with selectable graphics, and one may apply other accessibility features to the document.7 Acrobat Professional provides three OCR options: 1. Searchable Image (Exact): Ensures that text is searchable and select- Full-Text Searchable PDFs and Trouble- shooting OCR Errors A PDF created from a scanned piece of paper is inherently inaccessible because the content of the docu- ment is an image, not searchable text. Assistive technology cannot read or extract the words, users cannot select or edit the text, and one cannot Figure 1. PDFs Converted from different images: (a) the original PDF converted from a grayscale image and with unnecessary edges; (b) updated PDF converted from a black- and-white image and with edges cropped out; (c) screen viewed at 100 percent of the PDF in grayscale; and (d) screen viewed at 100 percent of the PDF in black-and-white. Dimensions: 9.127” X 11.455” Color Mode: grayscale Resolution: 600 dpi TIFF LZW: 12.7 MB PDF: 1,051 KB Dimensions: 8” X 10.4” Color Mode: black-and-white Resolution: 400 dpi TIFF LZW: 153 KB PDF: 61 KB Figure 2. Example of Adobe Acrobat 9 Preflight 156 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010156 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 but at least users can read all text, while the black-and-white scan con- tains unreadable words. Troubleshoot OCR Error 3: Cannot OCR Image Based Text The search of a digitized PDF is actually performed on its invis- ible text layer. The automated OCR process inevitably produces some incorrectly recognized words. For example, Acrobat cannot recognize the Colorado State University Logo correctly (see figure 6). Unfortunately, Acrobat does not provide a function to edit a PDF file’s invisible text layer. To manu- ally edit or add OCR’d text, Adobe Acrobat Capture 3.0 (see figure 7) must be purchased. However, our tests show that Capture 3.0 has many drawbacks. This software is compli- cated and produces it’s own errors. Sometimes it consolidates words; other times it breaks them up. In addition, it is time-consuming to add or modify invisible text layers using Acrobat Capture 3.0. At CSUL, we manually add searchable text for title and abstract pages only if they cannot be OCR’d by Acrobat correctly. The example in Troubleshoot OCR Error 2: Could Not Perform Recognition (OCR) Sometimes Acrobat gener- ates an “Outside of the Allowed Specifications” error when process- ing OCR. This error is normally caused by color images scanned at 600 dpi or more. In the example in figure 4, the page only contains text but was scanned in color at 600 dpi. When we scanned this page as black- and-white at 400 dpi, we did not encounter this problem. We could also use a lower-resolution color scan to avoid this error. Our experiments also show that images scanned in black-and-white work best for the OCR process. In this article we mainly discuss running the OCR process on modern textual documents. Black-and-white scans do not work well for historical textual documents or aged newspa- pers. These documents may have faded text and background noise. 
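Acrobat is the tool used throughout this article for adding the invisible text layer. As a point of comparison only, the sketch below shows the same kind of image-plus-hidden-text page produced with the open-source Tesseract engine through the pytesseract wrapper; it assumes the tesseract binary is installed and uses placeholder file names.

from PIL import Image
import pytesseract

scan = Image.open("title_page.tif")  # hypothetical page scan, bitonal or grayscale

# Tesseract's PDF output keeps the page image and overlays invisible, searchable
# text, broadly comparable to Acrobat's "Searchable Image (Exact)" option.
pdf_bytes = pytesseract.image_to_pdf_or_hocr(scan, extension="pdf")

with open("title_page_searchable.pdf", "wb") as f:
    f.write(pdf_bytes)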
When they are scanned as black- and-white, broken letters may occur, and some text might become unread- able. For this reason they should be scanned in color or grayscale. In fig- ure 5, images scanned in color might not produce accurate OCR results, OCRed text at 100 percent, this option is not acceptable for us. For a tutorial on to how to make a full-text searchable PDF, please see appendix A. Troubleshoot OCR Error 1: Acrobat Crashes Occasionally Acrobat crashes during the OCR process. The error message does not indicate what causes the crash and where the problem occurs. Fortunately, the page number of the error can be found on the top short- cuts menu. In figure 3, we can see the error occurs on page 7. We discovered that errors are often caused by figures or diagrams. For a problem like this, the solution is to skip the error-causing page when running the OCR process. Our initial research was performed on Acrobat 8 Professional. Our recent study shows that this problem has been significantly improved in Acrobat 9 Professional. Figure 3. Adobe Acrobat 8 Professional crash window Figure 4. “Could not perform recognition (OCR)” error Figure 5. An aged newspaper scanned in color and black-and-white Aged Newspaper Scanned in Color Aged Newspaper Scanned in Black-and-White tHe Next GeNerAtioN liBrArY cAtAloG | ZHou 157Are Your DiGitAl DocuMeNts weB FrieNDlY? | ZHou 157 a very light yellow background. The undesirable marks and background contribute to its large file size and create ink waste when printed. Method 2: Running Acrobat’s Built-In Optimization Processes Acrobat provides three built-in pro- cesses to reduce file size. By default, Acrobat use JPEG compression for color and grayscale images and CCITT Group 4 compression for bitonal images. optimize scanned pDF Open a scanned PDF and select Documents > Optimize Scanned PDF. A number of settings, such as image quality and background removal, can be specified in the Optimize Scanned PDF dialog box. Our experiments show this process can noticably degrade images and sometimes even increase file size. Therefore we do not use this option. reduce File size Open a scanned PDF and select Documents > Reduce File Size. The Reduce File Size command resa- mples and recompresses images, removes embedded Base-14 fonts, and subset-embeds fonts that were left embedded. It also compresses document structure and cleans up elements such as invalid bookmarks. If the file size is already as small as possible, this command has no effect.11 After process, some files cannot be saved as PDF/A, as we discussed in a previous section. We also noticed that different versions of Acrobat can create files of different file sizes even if the same settings were used. pDF optimizer Open a scanned PDF and select Advanced > PDF Optimizer. Many settings can be specified in the PDF Optimizer dialog box. For example, we can downsample images from sections, we can greatly reduce a PDF’s size by using an appro- priate color mode and resolution. Figure 9 shows two different ver- sions of a digitized document. The source document has a color cover and 111 bitonal pages. The origi- nal PDF, shown in figure 9 on the left, was created by another univer- sity department. It was not scanned according to standards and pro- cedures adopted by CSUL. It was scanned in color at 300 dpi and has a file size of 66,265 KB. 
We exported the original PDF as TIFF images, batch-converted color TIFF images to black-and-white TIFF images, and then created a new PDF using black- and-white TIFF images. The updated PDF has a file size of 8,842 KB. The image on the right is much cleaner and has better print quality. The file on the left has unwanted marks and figure 8 is a book title page for which we used Acrobat Capture 3.0 to man- ually add searchable text. The entire book may be accessed at http://hdl .handle.net/10217/1553. Optimizing PDFs for Web Delivery A digitized PDF file with 400 color pages may be as large as 200 to 400 MB. Most of the time, optimizing processes may reduce files this large without a noticeable difference in quality. In some cases, quality may be improved. We will discuss three optimization methods we use. Method 1: Using an Appropriate Color Mode and Resolution As we have discussed in previous ~dO UniversitY Original Logo Text OCRed by Acrobat Figure 6. Incorrectly recognized text sample Figure 7. Adobe Acrobat capture interface Figure 8. Image-based text sample 158 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010158 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 grayscale. A PDF may contain pages that were scanned with different color modes and resolutions. A PDF may also have pages of mixed reso- lutions. One page may contain both bitonal images and color or grayscale images, but they must be of the same resolution. The following strategies were adopted by CSUL: 1. Combine bitmap, grayscale, and color images. We use gray- scale images for pages that con- tain grayscale graphs, such as black-and-white photos, color images for pages that contain color images, and bitmap images for text-only or text and line art pages. 2. If a page contains high-definition color or grayscale images, scan that page in a higher resolution and scan other pages at 400 dpi. 3. If a page contains a very small font and the OCR process does not work well, scan it at a higher resolution and the rest of docu- ment at 400 dpi. 4. If a page has both text, color, or grayscale graphs, we scan it twice. Then we modify images using Adobe Photoshop and combine two images in Acrobat. In figure 10, the grayscale image has a gray background and a true reproduction of the original photo- graph. The black-and-white scan has a white background and clean text, but details of the photograph are lost. The PDF converted from the grayscale image is 491 KB and has nine OCR errors. The PDF converted from the black-and-white image is 61KB and has no OCR errors. The PDF converted from a combination of the grayscale and black-and-white images is 283 KB and has no OCR errors. The following are the steps used to create a PDF in figure 10 using Acrobat: 1. Scan a page twice—grayscale Optimizer can be found at http:// www.acrobatusers.com/tutorials/ understanding-acrobats-optimizer. Method 3: Combining Different Scans Many documents have color covers and color or grayscale illustrations, but the majority of pages are text- only. It is not necessary to scan all pages of such documents in color or a higher resolution to a lower reso- lution and choose a different file compression. Different collections have different original sources, therefore different settings should be applied. We normally do sev- eral tests for each collection and choose the one that works best for it. We also make our PDFs compat- ible with Acrobat 6 to allow users with older versions of software to view our documents. 
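The kind of rebuild described in this section, exporting the page images, converting text-only pages to black-and-white, and writing a new, smaller PDF, can also be scripted outside Acrobat. The sketch below is only an illustration under assumed file names, using the Pillow library rather than the Acrobat workflow the article documents; the cover file name and the threshold value are arbitrary assumptions.

from pathlib import Path
from PIL import Image

pages = []
for tif in sorted(Path("exported_pages").glob("*.tif")):   # hypothetical per-page TIFF exports
    image = Image.open(tif)
    if tif.name == "0000_cover.tif":                        # keep the cover in 24-bit color
        pages.append(image.convert("RGB"))
    else:
        # Text and line-art pages: a fixed threshold to 1-bit avoids the speckle
        # that Pillow's default dithering would add; 160 is an arbitrary cutoff.
        gray = image.convert("L")
        pages.append(gray.point(lambda v: 255 if v > 160 else 0, mode="1"))

# Pillow writes every page into a single PDF; blank pages would be dropped
# from the list before this step.
pages[0].save("rebuilt.pdf", save_all=True, append_images=pages[1:], resolution=400.0)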
A detailed tutorial of how to use the PDF Figure 9. Reduce file size example Figure 10. Reduce file size example: combine images tHe Next GeNerAtioN liBrArY cAtAloG | ZHou 159Are Your DiGitAl DocuMeNts weB FrieNDlY? | ZHou 159 help.html?content=WSfd1234e1c4b69f30 ea53e41001031ab64-7757.html (accessed Mar. 3, 2010). 3. Ted Padova Adobe Acrobat 7 PDF Bible, 1st ed. (Indianapolis: Wiley, 2005). 4. Olaf Drümmer, Alexandra Oettler, and Dietrich von Seggern, PDF/A in a Nutshell—Long Term Archiving with PDF, (Berlin: Association for Digital Document Standards, 2007). 5. PDF/A Competence Center, “PDF/A: An ISO Standard—Future Development of PDF/A,” http://www. pdfa.org/doku.php?id=pdfa:en (accessed July 20, 2010). 6. PDF/A Competence Center, “PDF/A—A new Standard for Long- Term Archiving,” http://www.pdfa.org/ doku.php?id=pdfa:en:pdfa_whitepaper (accessed July 20, 2010). 7. Adobe, “Creating Accessible PDF Documents with Adobe Acrobat 7.0: A Guide for Publishing PDF Documents for Use by People with Disabilities,” 2005, http://www.adobe.com/enterprise/ a c c e s s i b i l i t y / p d f s / a c ro 7 _ p g _ u e . p d f (accessed Mar. 8, 2010). 8. Adobe, “Recognize Text in Scanned Documents,” 2010, http:// help.adobe.com/en_US/Acrobat/9.0/ S t a n d a rd / W S 2 A 3 D D 1 FA - C FA 5 - 4 c f 6 -B993-159299574AB8.w.html (accessed Mar. 8, 2010). 9. Ibid. 10. Ibid. 11. Adobe, “Reduce File Size by Saving,” 2010, http://help.adobe.com/en_US/ Acrobat/9.0/Standard/WS65C0A053 -BC7C-49a2-88F1-B1BCD2524B68.w.html (accessed Mar. 3, 2010). the other 76 pages as grayscale and black-and-white. Then we used the procedure described above to com- bine text pages and photographs. The final PDF has clear text and cor- rectly reproduced photographs. The example can be found at http://hdl .handle.net/10217/1553. Conclusion Our case study, as reported in this article, demonstrates the importance of investing the time and effort to apply the appropriate standards and techniques for scanning and optimiz- ing digitized documents. If proper techniques are used, the final result will be Web-friendly resources that are easy to download, view, search, and print. Users will be left with a posi- tive impression of the library and feel encouraged to use its materials and services again in the future. References 1. BCR’s CDP Digital Imaging Best Practices Working Group, “BCR’s CDP Digital Imaging Best Practices Version 2.0,” June 2008, http://www.bcr.org/ dps/cdp/best/digital-imaging-bp.pdf (accessed Mar. 3, 2010). 2. Adobe, “About File Formats and Compression,” 2010, http://livedocs .adobe.com/en_US/Photoshop/10.0/ and black-and-white. 2. Crop out text on the grayscale scan using Photoshop. 3. Delete the illustration on the black-and-white image using Photoshop. 4. Create a PDF using the black- and-white image. 5. Run the OCR process and save the file. 6. Insert the color graph. Select Tools > Advanced Editing > TouchUp Object Tool. Right- click on the page and select Place Image. Locate the color graph in the Open dialog, then click Open and move the color graph to its correct location. 7. Save the file and run the Reduce File Size or PDF Optimizer pro- cedure. 8. Save the file again. This method produces the small- est file size with the best quality, but it is very time-consuming. At CSUL we used this method for some important documents, such as one of our institutional repository’s show- case items, Agricultural Frontier to Electronic Frontier. 
The book has 220 pages, including a color cover, 76 pages with text and photographs, and 143 text-only pages. We used a color image for the cover page and 143 black-and-white images for the 143 text-only pages. We scanned Appendix A. Step-by-Step Creating a Full-Text Searchable PDF In this tutorial, we will show you how to create a full-text searchable PDF using Adobe Acrobat 9 Professional. Creating a PDF from a Scanner Adobe Acrobat Professional can create a PDF directly from a scanner. Acrobat 9 provides five options: Black and White Document, Grayscale Document, Color Document, Color Image, and Custom Scan. The custom scan option allows you to scan, run the OCR procedure, add metadata, combine multiple pages into one PDF, and also make it PDF/A compliant. To create a PDF from a scanner, go to File > Create PDF > From Scanner > Custom Scan. See figure 1. At CSUL, we do not directly create PDFs from scanners because our tests show that it can produce fuzzy text and it is not time efficient. Both scanning and running the OCR process can be very time consuming. If an error occurs during these processes, we would have to start over again. We normally scan images on scanning stations by student employees 160 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010160 iNForMAtioN tecHNoloGY AND liBrAries | septeMBer 2010 or outsource them to vendors. Then library staff will perform quality control and create PDFs on seperate machines. In this way, we can work on multiple documents at the same time and ensure that we provide high-quality PDFs. Creating a PDF from Scanned Images 1. From the task bar select Combine > Merge Files into a single PDF > From Multiple Files. See figure 2. 2. In the Combine Files dialog, make sure the Single PDF radio button is selected. From the Add Files dropdown menu select Add Files. See figure 3. 3. In the Add Files dialog, locate images and select multiple images by holding shift key, and then click Add Files button. 4. By default, Acrobat sorts files by file names. Use Move Up and Move Down buttons to change image orders and use the Remove button to delete images. Choose a target file size. The smallest icon will produce a file with a smaller file size but a lower image quality PDF, and the largest icon will produce a high image quality PDF but with a very large file size. We normally use the default file size setting, which is the middle icon. 5. Save the file. At this point, the PDF is not full-text searchable. Making a Full-Text Searchable PDF A PDF document created from a scanned piece of paper is inherently inaccessible because the content of the document is an image, not searchable text. Assistive technology cannot read or extract the words, users cannot select or edit the text, and one cannot manipulate the PDF document for accessibility. Once optical character recognition (OCR) is properly applied to the scanned files, however, the image becomes searchable text with selectable graphics, and one may apply other acces- sibility features to the document. Adobe Acrobat Professional provides three OCR options, Searchable Image (Exact), Searchable Image, and Clean Scan. Because Searchable Image (Exact) is the only option that keeps the original look, we only use this option. To run an OCR procedure using Acrobat 9 Professional: 1. Open a digitized PDF. 2. Select Document > OCR text recognition > Recognize text using OCR. 3. In the Recognize Text dialog, specify pages to be OCRed. 4. 
In the Recognize Text dialog, click the Edit button in the Settings sec- tion to choose OCR language and PDF Output Style. We recommend the Searchable Image (Exact) option. Click OK. The setting will be remembered by the program and will be used until a new setting is chosen. Sometimes a PDF’s file size increases greatly after an OCR process. If this happens, use the PDF optimizer to reduce its file size. Figure 2. Merge files into a single PDF Figure 3. Combine Files dialog Figure 1. Acrobat 9 Professional’s Create PDF from Scanner Dialog 3142 ---- editoriAl | truitt 55 A recent Library Journal (LJ) story referred to “the pal- pable hunger public librarians have for change . . . and, perhaps, a silver bullet to ensure their future” in the context of a presentation at the Public Library Association’s 2010 Annual Conference by staff members of the Rangeview (Colo.) Library District. Now, lest there be any doubt on this point, allow me to state clearly from the outset that none of the following ramblings are in any way intended as a specific critique of the measures under- taken by Rangeview. Far be it from me to second-guess the Rangeview staff’s judgment as to how best to serve the community there.1 Rather, what got my attention was LJ’s reference to a “palpable hunger”for magic ammunition, from whose presumed existence we in libraries seem to draw com- fort. In the last quarter century, it seems as though we’ve heard about and tried enough silver bullets to keep our collective six-shooters endlessly blazing away. Here are just a few examples that I can recall off the top of my head, and in no particular order: ■■ Library cafes and coffee shops. ■■ Libraries arranged along the lines of chain book- stores. ■■ General-use computers in libraries (including infor- mation/knowledge commons and what-have-you) ■■ Computer gaming in libraries. ■■ Lending laptops, digital cameras, mp3 players and iPods, e-book readers, and now iPads. ■■ Mobile technology (e.g., sites and services aimed at and optimized for iPhones, Blackberries, etc.) ■■ E-books and e-serials. ■■ Chat and instant-message reference. ■■ Libraries and social networking (e.g., Facebook, Twitter, Second Life, etc.). ■■ “Breaking down silos,” and “freeing”/exposing our bibliographic data to the Web, and reuse by others outside of the library milieu. ■■ Ditching our old and “outmoded” systems, whether the object of our scorn is AACR2, LCSH, LCC, Dewey, MARC, the ILS, etc. ■■ Library websites generally. Remember how every- one—including us—simply had to have a website in the 1990s? And ever since then, it’s been an endless treadmill race to find the perfect, user-centric library Web presence? If Sisyphus were to be incarnated today, I have little doubt that he would appear as a library Web manager and his boulder would be a library website. ■■ Oh, and as long as we’re at it, “user-centricity” gen- erally. The implication, of course, is that before the term came into vogue, libraries and librarians were not focused on users. ■■ “Next-gen” catalogs. I’m sure I’m forgetting a whole lot more. Anyway, you get the picture. Each of these has, at one time or another, been posi- tioned by some advocate as the necessary change—the “silver bullet”—that would save libraries from “irrel- evance” (or worse!), if we would but adopt it now, or better yet, yesterday. 
Well, to judge from the generally dismal state of libraries as depicted by some opinion-makers in our profession—or perhaps simply from our collective lack of self-esteem—we either have been misled about the potency of our ammunition, or else we've been very poor markspersons. Notwithstanding the fact that we seem to have been indiscriminately blasting away with shotguns rather than six-shooters, our shooting has neither reversed the trends of shrinking budgets and declining morale nor staunched the ceaseless dire warnings of some about "irrelevance" resulting from ebbing library use. To stretch the analogy a bit further still, one might even argue that all this shooting has done damage of its own, peppering our most valuable services with countless pellet-sized holes.

At the same time, we have in recent years shown ourselves to be remarkably susceptible to the marketing-focused hyperbole of those in and out of librarianship about technological change. Each new technology is labeled a "game-changer"; change in general is either—to use the now slightly-dated, oh-so-nineties term—a "paradigm shift" or, more recently, "transformational." When did we surrender our skepticism and awareness of a longer view? What's wrong with this picture?2

I'd like to suggest another way of viewing this. A couple of years ago, Alan Weisman published The World Without Us, a book that should be required reading for all who are interested in sustainability, our own hubris, and humankind's place in the world. The book begins with our total, overnight disappearance, and asks (1) What would the earth be like without us? and (2) What evidence of our works would remain, and for how long? The bottom line answers for Weisman are (1) In the long run, probably much better off, and (2) Not much and not for very long, really.

So, applying Weisman's first question to our own, much more modest domain, what might the world be like if tomorrow librarians all disappeared or went on to work doing something else—became consultants, perhaps?—and our physical and virtual collections were padlocked? Would everything be okay, because as some believe, it's all out there on the Web anyway, and Google will make it findable? Absent a few starry-eyed bibliophiles and newly out-of-work librarians—those who didn't make the grade as consultants—would anyone mourn our disappearance? Would anyone notice? If a tree falls in the woods . . . In short, would it matter? And if so, why and how much?

The answer to the preceding two questions, I think, can help to point the way to an approach for understanding and evaluating services and change in libraries that is both more realistic and less draining than our obsessive quest for the "silver bullet." What exactly is our "value-add"? What do we provide that is unique and valuable? We can't hope to compete with Barnes and Noble, Starbucks, or the Googleplex; seeking to do so simply diverts resources and energy from providing services and resources that are uniquely ours.

Instead, new and changed services and approaches should be evaluated in terms of our value-add: If they contribute positively and are within our abilities to do them, great. If they do not contribute positively, then trying to do them is wasteful, a distraction, and ultimately disillusioning to those who place their hopes in such panaceas. Some of the "bullets" I listed above may well qualify as contributing to our value-add, and that's fine. My point isn't to judge whether they are "bad" or "good." My argument is about process and how we decide what we should do and not do. Understanding what we contribute that is uniquely ours should be the reference standard by which proposed changes are evaluated, not some pie-in-the-sky expectation that pursuit of this or that vogue will magically solve our funding woes, contribute to higher (real or virtual) gate counts, make us more "relevant" to a particular user group, or even raise our flagging self-esteem. In other words, our value-add must stand on its own, regardless of whether it actually solves temporal problems. It is the "why" in "why are we here?"

If, at the end of the day, we cannot articulate that which makes us uniquely valuable—or if society as a whole finds that contribution not worth the cost—then I think we need to be prepared to turn off the lights, lock the doors, and go elsewhere, because I hope that what we're doing is about more than just our own job security. And if the far-fetched should actually happen, and we all disappear? I predict that at some future point, someone will reinvent libraries and librarians, just as others have reinvented cataloguing in the guise of metadata.

Marc Truitt Editorial: No More Silver Bullets, Please
Marc Truitt (marc.truitt@ualberta.ca) is Associate University Librarian, Bibliographic and Information Technology Services, University of Alberta Libraries, Edmonton, Alberta, Canada, and Editor of ITAL.
56 INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2010

Notes and references

1. Norman Oder, "PLA 2010 Conference: The Anythink Revolution is Ripe," Library Journal, Mar. 26, 2010, http://www.libraryjournal.com/article/CA6724258.html (accessed Mar. 30, 2010). There, I said it! A fairly innocuous disclaimer added to one of my columns last year seemed to garner more attention (http://freerangelibrarian.com/2009/06/13/marc-truitts-surprising-ital-editorial/) than did the content of the column itself. Will the present disclaimer be the subject of similar speculation?

2. One of my favorite antidotes to such bloated, short-term language is embodied in Michael Gorman's "Human Values in a Technological Age," ITAL 20, no. 1 (Mar. 2000): 4–11, http://www.ala.org/ala/mgrps/divs/lita/ital/2001gorman.cfm (accessed Apr. 12, 2010)—highly recommended. The following is but one of many calming and eminently sensible observations Gorman makes:

The key to understanding the past is the knowledge that people then did not live in the past—they lived in the present, just a different present from ours. The present we are living in will be the past sooner than we wish. What we perceive as its uniqueness will come to be seen as just a part of the past as viewed from the point of a future present that will, in turn, see itself as unique. People in history did not wear quaintly old-fashioned clothes—they wore modern clothes. They did not see themselves as comparing unfavorably with the people of the future, they compared themselves and their lives favorably with the people of their past. In the context of our area of interest, it is particularly interesting to note that people in history did not see themselves as technologically primitive. On the contrary, they saw themselves as they were—at the leading edge of technology in a time of unprecedented change.

3143 ---- Editorial Board Thoughts: ITAL 2.0 | Boze 57 litablog.org/) I see that there are occasional posts, but there are rarely comments and little in the way of real discussion.
It seems to be oriented toward announcements, so perhaps it’s not a good comparison with ITALica. Some ALA groups are using WordPress for their blogs, a few with user comments, but mostly without much apparent traffic (for example, the LL&M Online blog, http://www .lama.ala.orgLLandM). In general, blogs don’t seem to be a satisfactory platform for discussion. Wikis aren’t par- ticularly useful in this regard, either, so I think that rules out the LITA Wiki (http://wikis.ala.org/lita/index.php/ Main_Page). I’ve looked at ALA Connect (http://connect. ala.org/), which has a variety of Web 2.0 features, so it might be a good home for ITALica. We could also use a mailing list, either one that already exists, such as LITA-L, or a new one. The one advantage e-mail has is that it is delivered to the reader, so one doesn’t have to remember to visit a website. We already have RSS feeds for the ITALica blog, so maybe that works well enough as a notification for those who subscribe to them. I’ve also wondered whether a discussion forum (aka message board) would be useful. I frequent a few soft- ware-related forums, and I find them conducive to discussion. They have a degree of flexibility lacking in other platforms. It’s easy for any participant to start up a new topic rather than limiting discussion only to topics posted by the owner, as is usually the case with blogs. Frankly I’d like to encourage discussion on topics beyond only the articles published in ITAL. For example, we used to have columns devoted to book and software reviews. Even though they were discontinued, those could still be interesting topics for discussion between ITAL readers. In writing this, my hope is to get feedback from you, the reader, about what ITAL and ITALica could be doing for you. How can we use ALA Connect in ways that would be useful? Could we use other platforms to do things beyond simply discussing articles that appear in the print edition? What social Web technologies do you use, and how could we apply them to ITAL? After you read this, I hope you’ll join us at ITALica for a discussion. Let us know what you think. Editor’s note: Andy serves on the ITAL Editorial Board and as the ITAL website manager. He earns our gratitude every quarter with his timely and professional work to post the new issue online. T he title of this recurring column is “Editorial Board Thoughts,” so as I sit here in the middle of February, what am I thinking about? As I trudge to work each day through the snow and ice, I think about what a nuisance it is to have a broken foot (I broke the fifth metatarsal of my left foot at the Midwinter Meeting in Boston—not recommended) but most recently I’ve been thinking about ITAL. The March issue is due to be mailed in a couple of weeks, and I got the digITAL files a week or so ago. In a few days I’ll have to start separating the PDF into individ- ual articles, and then I’ll start up my Web editor to turn the RTF files for each article into nicely formatted HTML. All of this gets fed into ALA’s content management sys- tem, where you can view it online by pointing your Web browser to http://www.lita.org/ala/mgrps/divs/lita/ ITAL/ITALinformation.cfm. In case you didn’t realize it, the full text of each issue of ITAL is there, going back to early 2004. Selected full-text articles are available from earlier issues going back to 2001. The site is in need of a face lift, but we expect to work on that in the near future. 
Starting with the September 2008 issue of ITAL we launched ITALica, the ITAL blog at http://ITAL-ica .blogspot.com/, as a pilot. ITALica was conceived as a forum for readers, authors, and editors of ITAL to discuss each issue. For a year and a half we’ve been open for reader feedback, and our authors have been posting to the blog and responding to reader comments. What’s your opinion of ITALica? Is it useful? What could we be doing to enhance its usefulness? In reality we haven’t had a great deal of communica- tion via the blog. We are looking at moving ITALica from Blogger to a platform more integrated with existing ALA or LITA services. Is a blog format the best way to encour- age discussion? When I look at the LITA Blog (http:// Andy Boze (Boze.1@nd.edu) is Head, desktop Computing and network Services, University of notre dame Hesburgh Libraries, notre dame, indiana. Andy BozeEditorial Board Thoughts: ITAL 2.0 3141 ---- 54 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 tinuing education opportunities for library informa- tion technologists and all library staff who have an interest in technology. 2. Innovation: To serve the library community, LITA expert members will identify and demonstrate the value of new and existing technologies within ALA and beyond. 3. Advocacy and policy: LITA will advocate for and participate in the adoption of legislation, policies, technologies, and standards that promote equitable access to information and technology. 4. The organization: LITA will have a solid structure to support its members in accomplishing its mission, vision, and strategic plan. 5. Collaboration and outreach: LITA will reach out and col- laborate with other library organizations to increase the awareness of the importance of technology in libraries, improve services to existing members, and reach out to new members. The LITA Executive Committee is currently finalizing the strategies LITA will pursue to achieve success in each of the goal areas. It is my hope that the strategies for each goal are approved by the LITA board of directors before the 2010 ALA Annual Conference in Washington, D.C. That way the finalized version of the LITA Strategic Plan can be introduced to the Committee and Interest Group Chairs and the membership as a whole at that conference. This will allow us to start the next fiscal year with a clear road for the future. While I am excited about what is next, I have also been dreading the end of my presidency. I have truly enjoyed my experience as LITA president, and in some way wish it was not about to end. I have learned so much and have met so many wonderful people. Thank you for giving me this opportunity to serve you and for your support. I have truly appreciated it. A s I write this last column, the song “My Way” by Frank Sinatra keeps going through my head. While this is definitely not my final curtain, it is the final curtain of my presidency. Like Sinatra I have a few regrets, “but then again, too few to mention.” There was so much more I wanted to accomplish this year; however, as usual, my plans were more ambitious than the time I had available. Being LITA’s president was a big part of my life, but it was not the only part. Those other parts—like family, friends, work, and school—demanded my atten- tion as well. I have thought about what to say in this final column. Do I list my accomplishments of the last year? Nah, you can read all about that in the LITA Annual Report, which I will post in June. Tackle some controversial topic? 
While I can think of a few, I have not yet thought of any solutions, and I do not want to rant against something without proposing some type of solution or plan of attack. I thought instead I would talk about where I have devoted a large part of my LITA time over the last year. As I look back at the last year, I am also thinking ahead to the future of LITA. We are currently writing LITA's Strategic Plan. We have a lot of great ideas to work with. LITA members are always willing to share their thoughts both formally and informally. I have been charged with the task of taking all of those great ideas, gathered at conferences, board meetings, hallway conversations, surveys, e-mail, etc., to create a roadmap for the future. After reviewing all of the ideas gathered over the last three years, I was able to narrow that list down to six major goal areas. With the assistance of the LITA board of directors and the LITA Executive Committee, we whittled the list down to five major goal areas of the LITA Strategic Plan: 1. Training and continuing education: LITA will be nationally recognized as the leading source for con-

Michelle Frisque (mfrisque@northwestern.edu) is LITA President 2009–10 and Head, Information Systems, Northwestern University, Chicago.
Michelle Frisque President's Message: The End and New Beginnings

3144 ---- 58 INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2010

know its power, and facets can showcase metadata in new interfaces. According to McGuinness, facets perform several functions in an interface:

■■ vocabulary control
■■ site navigation and support
■■ overview provision and expectation setting
■■ browsing support
■■ searching support
■■ disambiguation support5

These functions offer several potential advantages to the user: The functions use category systems that are coherent and complete, they are predictable, they show previews of where to go next, they show how to return to previous states, they suggest logical alternatives, and they help the user avoid empty result sets as searches are narrowed.6 Disadvantages include the fact that categories of interest must be known in advance, important trends may not be shown, category structures may need to be built by hand, and automated assignment is only partly successful.7 Library catalog records, of course, already supply "categories of interest" and a category structure. Information science research has shown benefits to users from faceted search interfaces. But do these benefits hold true for systems as complex as library catalogs? This paper presents an extensive review of both information science and library literature related to faceted browsing.

■■ Method

To find articles in the library and information science literature related to faceted browsing, the author searched the Association for Computing Machinery (ACM) Digital Library, Scopus, and Library and Information Science and Technology Abstracts (LISTA) databases. In Scopus and the ACM Digital Library, the most successful searches included the following:

■■ (facet* or cluster*) and (usability or user stud*)
■■ facet* and usability

In LISTA, the most successful searches included combining product names such as "aquabrowser" with "usability." The search "catalog and usability" was also used. The author also searched Google and the Next Generation Catalogs for Libraries (NGC4LIB) electronic discussion list in an attempt to find unpublished studies.
Search terms initially included the concept of “clus- tering”; however, this was quickly shown to be a clearly defined, separate topic. According to Hearst, “Clustering refers to the grouping of items according to some measure Faceted browsing is a common feature of new library catalog interfaces. But to what extent does it improve user performance in searching within today’s library catalog systems? This article reviews the literature for user studies involving faceted browsing and user studies of “next-generation” library catalogs that incorporate faceted browsing. Both the results and the methods of these studies are analyzed by asking, What do we cur- rently know about faceted browsing? How can we design better studies of faceted browsing in library catalogs? The article proposes methodological considerations for prac- ticing librarians and provides examples of goals, tasks, and measurements for user studies of faceted browsing in library catalogs. M any libraries are now investigating possible new interfaces to their library catalogs. Sometimes called “next-generation library catalogs” or “dis- covery tools,” these new interfaces are often separate from existing integrated library systems. They seek to provide an improved experience for library patrons by offering a more modern look and feel, new features, and the potential to retrieve results from other major library systems such as article databases. One interesting feature these interfaces offer is called “faceted browsing.” Hearst defines facets as a “a set of meaningful labels organized in such a way as to reflect the concepts relevant to a domain.”1 LaBarre defines fac- ets as representing “the categories, properties, attributes, characteristics, relations, functions or concepts that are central to the set of documents or entities being organized and which are of particular interest to the user group.”2 Faceted browsing offers the user relevant subcategories by which they can see an overview of results, then nar- row their list. In library catalog interfaces, facets usually include authors, subjects, and formats, but may include any field that can be logically created from the MARC record (see figure 1 for an example). Using facets to structure information is not new to librarians and information scientists. As early as 1955, the Classification Research Group stated a desire to see faceted classification as the basis for all information retrieval.3 In 1960, Ranganathan introduced facet analysis to our profession.4 Librarians like metadata because they Jody condit Fagan (faganjc@jmu.edu) is Content interfaces Coordinator, James Madison University Library, Harrisonburg, Virginia. Jody Condit Fagan Usability Studies of Faceted Browsing: A Literature Review usABilitY studies oF FAceted BrowsiNG: A literAture review | FAGAN 59 doing so and performed a user study to inform their decision. results: empirical studies of faceted browsing The following summaries present selected empirical research studies that had significant findings related to faceted browsing or inter- esting methods for such studies. It is not an exhaustive list. Pratt, Hearst, and Fagan questioned whether faceted results were better than clustering or relevancy-ranked results.11 They studied fif- teen breast-cancer patients and families. Every subject used three tools: a faceted interface, a tool that clustered the search results, and a tool that ranked the search results according to relevance criteria. 
The subjects were given three simple queries related to breast cancer (e.g., “What are the ways to prevent breast cancer?”), asked to list answers to these before beginning, and to answer the same queries after using all the tools. In this study, sub- jects completed two timed tasks. First, subjects found as many answers as possible to the question in four minutes. Second, the researchers measured the time subjects took to find answers to two specific questions (e.g., “Can diet be used in the prevention of breast cancer?”) that related to the original, general query. For the first task, when the subjects used the faceted interface, they found more answers than they did with the other two tools. The mean number of answers found using the faceted interface was 7.80, for the cluster tool it was 4.53, and for the ranking tool it was 5.60. This difference was significant (p<0.05).12 For the second task, the researchers found no significant difference between the tools when comparing time on task. The researchers gave the subjects a user-satisfaction questionnaire at the end of the study. On thirteen of the fourteen quantitative questions, satisfaction scores for the faceted interface were much higher than they were for either the ranking tool or the cluster tool. This difference was statistically significant (p < 0.05). All fifteen users also affirmed that the faceted interface made sense, was help- ful, was useful, and had clear labels, and said they would use the faceted interface again for another search. Yee et al. studied the use of faceted metadata for image searching, and browsing using an interface they developed called Flamenco.13 They collected data from thirty-two participants who were regular users of the Internet, searching for information either every day or a few times a week. Their subjects performed four tasks (two structured and two unstructured) on each of two interfaces. An example of an unstructured task from their study was “search for images of interest.” An example of a structured task was to gather materials for an art history of similarity . . . typically computed using associations and commonalities among features where features are typically words and phrases.”8 Using library catalog key- words to generate word clouds would be an example of clustering, as opposed to using subject headings to group items. Clustering has some advantages according to Hearst. It is fully automated, it is easily applied to any text collection, it can reveal unexpected or new trends, and it can clarify or sharpen vague queries. Disadvantages to clustering include possible imperfections in the cluster- ing algorithm, similar items not always being grouped into one cluster, a lack of predictability, conflating many dimensions, difficulty labeling groups, and counterintui- tive subhierarchies.9 In user studies comparing clustering with facets, Pratt, Hearst, and Fagan showed that users find clustering difficult to interpret and prefer a predict- able organization of category hierarchies.10 ■■ Results The author grouped the literature into two categories: user studies of faceted browsing and user studies of library catalog interfaces that include faceted browsing as a feature. Generally speaking, the information science literature consisted of empirical studies of interfaces cre- ated by the researchers. 
In some cases, the researchers’ intent was to create and refine an interface intended for actual use; in others, the researchers created the interface only for the purposes of studying a specific aspect of user behavior. In the library literature, the studies found were generally qualitative usability studies of specific library catalog interface products. Libraries had either implemented a new product, or they were thinking about Figure 1. Faceted results from JMU’s VuFind implementation 60 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 Uddin and Janacek asked nineteen users (staff and students at the Asian Institute of Technology) to use a website search engine with both a traditional results list and a faceted results list.22 Tasks were as follows: (1) look for scholarship information for a masters program, (2) look for staff recruitment information, and (3) look for research and associated faculty member information within your interested area.23 They found that users were faster when using the faceted system, significantly so for two of the three tasks. Success in finding relevant results was higher with the faceted system. In the post–study questionnaire, participants rated the faceted system more highly, including significantly higher ratings for flexibil- ity, interest, understanding of information content, and more search results relevancy. Participants rated the most useful features to be the capability to switch from one facet to another, preview the result set, combine facets, and navigate via breadcrumbs. Capra et al. compared three interfaces in use by the Bureau of Labor Statistics website, using a between-sub- jects study with twenty-eight people and a within-subjects study with twelve people.24 Each set of participants per- formed three kinds of searches: simple lookup, complex lookup, and exploratory. The researchers used an interest- ing strategy to help control the variables in their study: Because the BLS website is a highly specialized corpus devoted to economic data in the United States orga- nized across very specific time periods (e.g., monthly releases of price or employment data), we decided to include the US as a geographic facet and a month or year as a temporal facet to provide context for all search tasks in our study. Thus, the simple lookup tasks were constructed around a single economic facet but also included the spatial and temporal facets to provide context for the searchers. The complex lookup tasks involve additional facets including genre (e.g. press release) and/or region.25 Capra et al. found that users preferred the familiarity afforded by the traditional website interface (hyperlinks + keyword search) but listed the facets on the two experi- mental interfaces as their best features. The researchers concluded, “If there is a predominant model of the infor- mation space, a well designed hierarchical organization might be preferred.”26 Zhang and Marchionini analyzed results from fifteen undergraduate and graduate students in a usability study of an interface that used facets to categorize results (Relation Browser ++).27 There were three types of tasks: ■■ Type 1: Simple look-up task (three tasks such as “check if the movie titled The Matrix is in the library movie collection”). ■■ Type 2: Data exploration and analysis tasks (six tasks essay on a topic given by the researchers and to complete four related subtasks. The researchers designed the struc- tured task so they knew exactly how many relevant results were in the system. 
They also gave a satisfaction survey. More participants were able to retrieve all relevant results with the faceted interface than with the baseline interface. During the structured tasks, participants received empty results with the baseline interface more than three times as often as with the faceted interface.14 The researchers found that participants constructed queries from multiple facets in the unstructured tasks 19 percent of the time and in the structured tasks 45 percent of the time.15 When given a post–test survey, participants identified the fac- eted interface as easier to use, more flexible, interesting, enjoyable, simple, and easy to browse. They also rated it as slightly more “overwhelming.” When asked to choose between the two, twenty-nine participants chose the fac- eted interface, compared with two who chose the baseline (N = 31). Thirty-one of the thirty-two participants said the faceted interface helped them learn more, and twenty- eight of them said it would be more useful for their usual tasks.16 The researchers concluded that even though their faceted interface was much slower than the other, it was strongly preferred by most study participants: “These results indicate that a category-based approach is a suc- cessful way to provide access to image collections.”17 In a related usability study on the Flamenco interface, English et al. compared two image browsing interfaces in a nineteen-participant study.18 After an initial search, the “Matrix View” interface showed a left column with facets, with the images in the result set placed in the main area of the screen. From this intermediary screen, the user could select multiple terms from facets in any order and have the items grouped under any facet. The “SingleTree” interface listed subcategories of the currently selected term at the top, with query previews underneath. The user could then only drill down to subcategories of the current category, and could not select terms from more than one facet. The researchers found that a majority of participants preferred the “power” and “flexibility” of Matrix to the simplicity of SingleTree. They found it easier to refine and expand searches, shift between searches, and troubleshoot research problems. They did prefer SingleTree for locating a specific image, but Matrix was preferred for browsing and exploring. Participants started over only 0.2 percent of the time for the Matrix compared to 4.5 percent for SingleTree.19 Yet the faceted interface, Matrix, was not “better” at everything. For specific image searching, participants found the correct image only 22.0 percent of the time in Matrix compared to 66.0 percent in SingleTree.20 Also, in Matrix, some participants drilled down in the wrong hierarchy with wrong assumptions. One interesting finding was that in both interfaces, more participants chose to begin by browsing (12.7 percent) than by searching (5.0 percent).21 usABilitY studies oF FAceted BrowsiNG: A literAture review | FAGAN 61 of the first two studies: The first study comprised one faculty member, five graduate students, and two under- graduate students; the second comprised two faculty members, four graduate students, and two undergradu- ate students. The third study did not report results related to faceted browsing and is not discussed here. The first study had seven scenarios; the second study had nine. 
The scenarios were complex: for example, one scenario began, “You want to borrow Shakespeare’s play, The Tempest, from the library,” but contained the following subtasks as well: 1. Find The Tempest. 2. Find multiple editions of this item. 3. Find a recent version. 4. See if at least one of the editions is available in the library. 5. What is the call number of the book? 6. You’d like to print the details of this edition of the book so you can refer to it later. Participants found the interface friendly, easy to use, and easy to learn. All the participants reported that fac- eted browsing was useful as a means of narrowing down the result lists, and they considered this tool one of the differentiating features between Primo and their library OPAC or other interfaces. Facets were clear, intuitive, and useful to all participants, including opening the “more” section.31 One specific result from the tests was that “online resources” and “available” limiters were moved from a separate location to the right with all other facets.32 In a study of Aquabrowser by Olson, twelve subjects— all graduate students in the humanities—participated in a comparative test in which they looked for additional sources for their dissertation.33 Aquabrowser was created by MediaLab but is distributed by Serials Solutions in North America. This study also had three pilot subjects. No relevance judgments were made by the researchers. Nine of the twelve subjects found relevant materials by using Aquabrowser that they had not found before.34 Olson’s subjects understood facets as a refinement tool (narrowing) and had a clear idea of which facets were useful and not useful for them. They gave overwhelm- ingly positive comments. Only two felt the faceted interface was not an improvement. Some participants wanted to limit to multiple languages or dates, and a few were confused about the location of facets in multiple places, for example, “music” under both format and topic. A team at Yale University, led by Bauer, recently conducted two tests on pilot VuFind installations: a subject-based presentation of e-books for the Cushing/ Whitney Medical Library and a pilot test of VuFind using undergraduate students with a sample of 400,000 records from the library system.35 VuFind is open-source software developed at Villanova University (http://vufind.org). that require users to understand and make sense of the information collection: “In which decade did Steven Spielberg direct the most movies?”). ■■ Type 3: (one free exploration task: “find five favorite videos without any time constraints”). The tasks assigned for the two interfaces were dif- ferent but comparable. For type 2 tasks, Zhang and Marchionini found that performance differences between the two interfaces were all statistically significant at the .05 level.28 No participants got wrong answers for any but one of the tasks using the faceted interface. With regard to satisfaction, on the exploratory tasks the researchers found statistically significant differences favoring the faceted interface on all three of the satisfaction ques- tions. Participants found the faceted interface not as aesthetically appealing nor as intuitive to use as the basic interface. Two participants were confused by the constant changing and updating of the faceted interface. The above studies are examples of empirical inves- tigations of experimental interfaces. 
Hearst recently concluded that facets are a “proven technique for sup- porting exploration and discovery” and summarized areas for further research in this area, such as applying facets to large “subject-oriented category systems,” facets on mobile interfaces, adding smart features like “auto- complete” to facets, allowing keyword search terms to affect order of facets, and visualizations of facets.29 In the following section, user studies of next-generation library catalog interfaces will be presented. results: library literature Understandably, most studies by practicing librarians focus on products their libraries are considering for eventual use. These studies all use real library catalog records, usually the entire catalog’s database. In most cases, these studies were not focused on investigating faceted browsing per se, but on the usability of the overall interface. In general, these studies used fewer participants than the information science studies above, followed less rigorous methods, and were not subjected to statistical tests. Nevertheless, they provide many insights into the user experience with the extremely complex datasets underneath next-generation library catalog interfaces that feature faceted browsing. In this review article, only results specifically relating to fac- eted browsing will be presented. Sadeh described a series of usability studies per- formed at the University of Minnesota (UM), a Primo development partner.30 Primo is the next-generation library catalog product sold by Ex Libris. The author also received additional information from the Usability Services lab at UM via e-mail. Three studies were con- ducted in August 2006, January 2007, and October 2007. Eight users from various disciplines participated in each 62 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 participants. The researchers measured task success, dura- tion, and difficulty, but did not measure user satisfaction. Their study consisted of four known-item tasks and six topic-searching tasks. The topic-searching tasks were geared toward the use of facets, for example, “Can you show me how would you find the most recently published book about nuclear energy policy in the United States?”45 All five participants using Endeca understood the idea of facets, and three used them. Students tried to limit their searches at the outset rather than search and then refine results. An interesting finding was that use of the facets did not directly follow the order in which facets were listed. The most heavily used facet was Library of Congress Classification (LCC), followed closely by topic, and then library, format, author, and genre.46 Results showed a sig- nificantly shorter average task duration for Endeca catalog users for most tasks.47 The researchers noted that none of the students understood that the LCC facet represented call-number ranges, but all of the students understood that these facets “could be used to learn about a topic from dif- ferent aspects—science, medicine, education.”48 The authors could find no published studies relating to the use of facets in some next-generation library cata- logs, including Encore and WorldCat Local. 
Although the University of Washington did publish results of a WorldCat Local usability study in a recent issue of Library Technology Reports, results from the second round of testing, which included an investigation of facets, were not yet ready.49 ■■ Discussion summary of empirical evidence related to faceted browsing Empirical studies in the information science literature support many positive findings related to faceted brows- ing and build a solid case for including facets in search interfaces: ■■ Facets are useful for creating navigation structures.50 ■■ Faceted categorization greatly facilitates efficient retrieval in database searching.51 ■■ Facets help avoid dead ends.52 ■■ Users are faster when using a faceted system.53 ■■ Success in finding relevant results is higher with a faceted system.54 ■■ Users find more results with a faceted system.55 ■■ Users also seem to like facets, although they do not always immediately have a positive reaction. ■■ Users prefer search results organized into predict- able, multidimensional hierarchies.56 ■■ Participants’ satisfaction is higher with a faceted system.57 The team drew test questions from user search logs in their current library system. Some questions targeted specific problems, such as incomplete spellings and incomplete title information. Bauer notes that some problems uncovered in the study may relate to the pecu- liarities of the Yale implementation. The medical library study contained eight partici- pants—a mix of medical and nursing students. Facets, reported Bauer, “worked well in several instances, although some participants did not think they were noticeable on the right side of the page.”36 The prompt for the faceted task in this study came after the user had done a search: “What if you wanted to look at a particular sub- set, say ‘xxx’ (determine by looking at the facets).”37 Half of the participants used facets, half used “search within” to narrow the topic by adding keywords. Sixty-two per- cent of the participants were successful at this task. The undergraduate study asked five participants faced with a results list, “What would you do now if you only wanted to see material written by John Adams?”38 On this task, only one of the five was successful, even though the author’s name was on the screen. Bauer noted that in general, “the use of the topic facet to narrow the search was not understood by most participants. . . . Even when participants tried to use topic facets the length of the list and extraneous topics rendered them less than useful.”39 The five undergraduates were also asked, “Could you find books in this set of results that are about health and illness in the United States population, or control of com- municable diseases during the era of the depression?”40 Again, only one of the five was successful. Bauer notes that “the overly broad search results made this difficult for participants. Again, topic facets were difficult to navi- gate and not particularly useful to this search.”41 Bauer’s team noted that when the search was configured to return more hits, “topic facets become a confusingly large set of unrelated items. 
These imprecise search results, combined with poor topic facet sets, seemed to result in confusion for test participants.”42 Participants were not aware that topics represented subsets, although learning occurred because the “narrow” header was helpful to some par- ticipants.43 Other results found by Bauer’s team were that participants were intrigued by facets, navigation tools are needed so that patrons may reorder large sets of topic fac- ets, format and era facets were useful to participants, and call-number facets were not used by anyone. Antelman, Pace, and Lynema studied North Carolina State University’s (NCSU) next-generation library catalog, which is driven by software from Endeca.44 Their study used ten undergraduate students in a between-subjects design where five used the Endeca catalog and five used the library’s traditional catalog. The researchers noted that their participants may have been experienced with the library’s old catalog, as log data shows most NCSU users enter one or two terms, which was not true of study usABilitY studies oF FAceted BrowsiNG: A literAture review | FAGAN 63 one product’s faceted system for a library catalog does not substitute for another, the size and scope of local collections may greatly affect results, and cataloging practices and metadata will affect results. Still, it is important for practic- ing librarians to determine if new features such as facets truly improve the user’s experience. methodological best practices After reading numerous empirical research studies (some of which critique their own methods) and library case studies, some suggestions for designing better studies of facets in library catalogs emerged. designing the study ■■ Consider reusing protocols from previous studies. This provides not only a tested method but also a possible point of comparison. ■■ Define clear goals for each study and focus on spe- cific research questions. It’s tempting to just throw the user into the interface and see what happens, but this makes it difficult, if not impossible, to analyze the results in a useful way. For example, one of Zhang and Marchionini’s hypotheses specifically describes what rich interaction would look like: “Typing in key- words and clicking visual bars to filter results would be used frequently and interchangeably by the users to finish complex search tasks, especially when large numbers of results are returned.”64 ■■ Develop the study for one type of user. Olson’s focus on graduate students in the dissertation process allowed the researchers to control for variables such as interest of and knowledge about the subject. ■■ Pilot test the study with a student worker or col- league to iron out potential wrinkles. ■■ Let users explore the system for a short time and pos- sibly complete one highly structured task to help the user become used to the test environment, interface, and facilitator.65 Unless you are truly interested in the very first experience users have with a system, the first use of a system is an artificial case. designing tasks ■■ Make sure user performance on each task is measur- able. Will you measure the time spent on a task? If “success” is important, define what that would look like. For example, English et al. defined success for one of their tasks as when “the participant indicated (within the allotted time) that he/she had reached an appropriate set of images/specific image in the collection.”66 ■■ Establish benchmarks for comparison. 
One can test for significant differences between interfaces, one can test for differences between research subjects and an expert user, and one can simply measure against ■■ Users are more confident with a faceted system.58 ■■ Users may prefer the familiarity afforded by tra- ditional website interface (hyperlinks + keyword search).59 ■■ Initial reactions to the faceted interface may be cau- tious, seeing it as different or unfamiliar.60 Users interact with specific characteristics of faceted interfaces, and they go beyond just one click with facets when it is permitted. English et al. found that 7 percent of their participants expanded facets by removing a term, and that facets were used more than “keyword search within”: 27.6 percent versus 9 percent.61 Yee et al. found that participants construct queries from multiple facets 19 percent of the time in unstructured tasks; in structured tasks they do so 45 percent of the time.62 The above studies did not use library catalogs; in most cases they used an experimental interface with record sets that were much smaller and less complicated than in a complete library collection. Domains included websites, information from one website, image collections, video collections, and a journal article collection. summary of practical user studies related to faceted browsing This review also included studies from practicing librar- ians at live library implementations. These studies generally had smaller numbers of users, were more likely to focus on the entire interface rather than a few features, and chose more widely divergent methods. Studies were usually linked to a specific product, and results varied widely between systems and studies. For this reason it is difficult to assemble a bulleted summary as with the previous section. The variety of results from these studies indicate that when faceted browsing is applied to a real- life situation, implementation details can greatly affect user performance and user preference. Some, like LaBarre, are skeptical about whether fac- ets are appropriate for library information. Descriptions of library materials, says LaBarre, include analyses of intellectual content that go beyond the descriptive terms assigned to commercial items such as a laptop: Now is the time to question the assumptions that are embedded in these commercial systems that were primarily designed to provide access to concrete items through descriptions in order to enhance profit.63 It is clear that an evaluation of commercial interfaces or experimental interfaces does not substitute for an OPAC evaluation. Yet it is a challenge for libraries to find expertise and resources to conduct user studies. The systems they want to test are large and complex. Collaborating with other libraries has its own challenges: An evaluation of 64 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 groups of participants, each of which tests a dif- ferent system. ■❏ A within-subjects design has one group of par- ticipants test both systems. It is hoped that if libraries use the suggestions above when designing future experiments, results across studies will be more comparable and useful. designing user studies of faceted browsing After examining both empirical research studies and case studies by practicing librarians, a key difference seems to be the specificity of research questions and design- ing tasks and measurements to test specific hypotheses. 
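For example, once task times have been collected from the two groups in a between-subjects comparison of a faceted and a traditional interface, the significance test reported in several of the studies above takes only a few lines of analysis code. The sketch below is not drawn from any of the reviewed studies; the timing values are hypothetical, and it assumes the SciPy library is available.

```python
# Hypothetical between-subjects comparison: one group completed a task with the
# faceted catalog interface, a second group with the traditional interface.
# Times are seconds per participant (made-up illustrative numbers).
from statistics import mean
from scipy import stats

faceted_times = [48, 62, 55, 71, 50, 66, 58, 64, 53, 60]
traditional_times = [80, 95, 72, 88, 101, 77, 90, 84, 79, 93]

# Welch's t-test: does mean task time differ significantly between the groups?
t_stat, p_value = stats.ttest_ind(faceted_times, traditional_times, equal_var=False)

print(f"faceted mean:     {mean(faceted_times):.1f} s")
print(f"traditional mean: {mean(traditional_times):.1f} s")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

A within-subjects design would instead pair each participant's times on the two interfaces (for example, with stats.ttest_rel) and counterbalance the order in which the interfaces are presented to control for learning effects.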
While describing a full user-study protocol for investi- gating faceted browsing in a library catalog is beyond the scope of this article, reviewing the literature and the study methods it describes provided insights into how hypotheses, tasks, and measurements could be written to provide more reliable and comparable evidence related to faceted browsing in library catalog systems. For example, one research question could surround the format facet: “Compared with our current interface, does our new faceted interface improve the user’s ability to find different formats of materials?” Hypotheses could include the following: 1. Users will be more accurate when identifying the formats of items from their result set when using the faceted interface than when using the traditional interface. 2. Users will be able to identify formats of items more quickly with the faceted interface than with the tradi- tional interface. Looking at these hypotheses, here is a prompt and some example tasks the participants would be asked to perform: “We will be asking you to find a variety of for- mats of materials. When we say formats of materials, we mean books, journal articles, videos, etc.” ■■ Task 1: Please use interface A to search on “interper- sonal communication.” Look at your results set. Please list as many different formats of material as you can. ■■ Task 2: How many items of each format are there? ■■ Task 3: Please use interface B to search on “family communication.” What formats of materials do you see in your results set? ■■ Task 4: How many items of each format are there?” We would choose the topics “interpersonal com- munication” and “family communication” because our local catalog has many material types for these topics and because these topics would be understood by most of our students. We would choose different topics to expectations or against previous iterations of the same study. For example, “75 percent of users completed the task within five minutes.” Zhang and Marchionini measured error rates, another possible benchmark.67 ■■ Consider looking at your existing OPAC logs for zero- results searches or other issues that might inspire interesting questions. ■■ Target tasks to avoid distracters. For example, if your catalog has a glut of government documents, consider running the test with a limit set to exclude them unless you are specifically interested in their impact. For example, Capra et al. decided to include the United States as a geographic facet and a month or year as a temporal facet to provide context for all search tasks in their study.68 ■■ For some tasks, give the subjects simple queries (e.g., “What are the ways to prevent breast cancer?”) as opposed to asking the subjects to come up with their own topic. This can help control for the potential challenges of formulating one’s own research ques- tion on the spot. As librarians know, formulating a good research question is its own challenge. ■■ If you are using any timed tasks, consider how the nature of your tasks could affect the result. For example, Pratt, Hearst, and Fagan noted that the time that it took subjects to read and understand abstracts most heavily influenced the time for them to find an answer.69 English et al. found that the system’s pro- cessing time influenced their results.70 ■■ Consider the implications of your local implementa- tion carefully when designing your study. 
At Yale, the team chose to point their VuFind instance at just 400,000 of their records, drew questions from prob- lems users were having (as shown in log files), and targeted questions to these problems.71 who to study? ■■ Try to study a larger set of users. It is better to create a short test with many users than a long test with a few users. Nielsen suggests that twenty users is suf- ficient.72 Consider collaborating with another library if necessary. ■■ If you test a small number, such as the typical four to eight users for a usability test, be sure you emphasize that your results are not generalizable. ■■ Use subjects who are already interested in the subject domain: for example, Pratt, Hearst, and Fagan used breast cancer patients,73 and Olson used graduate students currently writing their dissertations.74 ■■ Consider focusing on advanced or scholarly users. La Barre suggests that undergraduates may be over- studied.75 ■■ For comparative studies, consider having both between-subjects and within-subjects designs.76 ■❏ A between-subjects design involves creating two usABilitY studies oF FAceted BrowsiNG: A literAture review | FAGAN 65 these experimental studies. Previous case-study inves- tigations of library catalog interfaces with facets have proven inconclusive. By choosing more specific research questions, tasks, and measurements for user studies, libraries may be able to design more objective studies and compare results more effectively. References 1. Marti A. Hearst, “Clustering versus Faceted Categories for Information Exploration,” Communications of the ACM 49, no. 4 (2006): 60. 2. Kathryn La Barre, “Faceted Navigation and Browsing Fea- tures in New OPACS: Robust Support for Scholarly Information Seeking?” Knowledge Organization 34, no. 2 (2007): 82. 3. Vanda Broughton, “The Need for Faceted Classification as the Basis of All Methods of Information Retrieval,” Aslib Proceed- ings 58, no. 1/2 (2006): 49–71. 4. S. R. Ranganathan, Colon Classification Basic Classification, 6th ed. (New York: Asia, 1960). 5. Deborah L. McGuinness, “Ontologies Come of Age,” in Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential, ed. Dieter Fensel et al. (Cambridge, Mass.: MIT Pr., 2003): 179–84. 6. Hearst, “Clustering versus Faceted Categories,” 60. 7. Ibid., 61. 8. Ibid., 59. 9. Ibid.. 60. 10. Wanda Pratt, Marti A. Hearst, and Lawrence M. Fagan, “A Knowledge-Based Approach to Organizing Retrieved Docu- ments,” Proceedings of the Sixteenth National Conference on Artifi- cial Intelligence, July 18–22, 1999, Orlando, Florida (Menlo Park, Calif.: AAAI Pr., 1999): 80–85. 11. Ibid. 12. Ibid., 5. 13. Ka-Ping Yee et al., “Faceted Metadata for Image Search and Browsing,” 2003, http://flamenco.berkeley.edu/papers/ flamenco-chi03.pdf (accessed Oct. 6, 2008). 14. Ibid., 6. 15. Ibid., 7. 16. Ibid. 17. Ibid., 8. 18. Jennifer English et al., “Flexible Search and Navigation,” 2002, http://flamenco.berkeley.edu/papers/flamenco02.pdf (accessed Apr. 22, 2010). 19. Ibid., 7. 20. Ibid., 6. 21. Ibid., 7. 22. Mohammed Nasir Uddin and Paul Janecek, “Performance and Usability Testing of Multidimensional Taxonomy in Web Site Search and Navigation,” Performance Measurement and Met- rics 8, no. 1 (2007): 18–33. 23. Ibid., 25. 24. Robert Capra et al., “Effects of Structure and Interaction Style on Distinct Search Tasks,” Proceedings of the 7th ACM-IEEE-CS Joint Conference on Digital Libraries (New York: ACM, 2007): 442–51. 25. Ibid., 446. 26. Ibid., 450. help minimize learning effects. 
To further address this, we would plan to have half our users start first with the traditional interface and half to start first with the faceted interface. This way we can test for differences resulting from learning. The above tasks would allow us to measure several pieces of evidence to support or reject our hypotheses. For tasks 1 and 3, we would measure the number of formats correctly identified by users compared with the number found by an expert searcher. For tasks 2 and 4, we would compare the number of items correctly identified with the total items found in each category by an expert searcher. We could also time the user to determine which interface helped them work more quickly. In addition to measuring the number of formats identified and the number of items identified in each format, we would be able to measure the time it takes users to identify the number of formats and the number of items in each format. To measure user satisfaction, we would ask participants to complete the System Usability Scale (SUS) after each interface and, at the very end of the study, complete a questionnaire com- paring the two interfaces. Even just selecting the format facet, we would have plenty to investigate. Other hypotheses and tasks could be developed for other facet types, such as time period or publication date, or facets related to the responsible par- ties, such as author or director: Hypothesis: Users can find more materials written in a certain time period using the faceted interface. Task: Find ten items of any type (books, journals, mov- ies) written in the 1950s that you think would have information about television advertising. Hypothesis: Users can find movies directed by a spe- cific person more quickly using the faceted interface. Task: In the next two minutes, find as many movies as you can that were directed by Orson Welles. For the first task above, an expert searcher could complete the same task, and their time could be used as a point of comparison. For the second, the total number of movies in the library catalog that were directed by Welles is an objective quantity. In both cases, one could compare the user’s performance on the two interfaces. ■■ Conclusion Reviewing user studies about faceted browsing revealed empirical evidence that faceted browsing improves user performance. Yet this evidence does not necessarily point directly to user success in faceted library catalogs, which have much more complex databases than those used in 66 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 53. Uddin and Janecek, “Performance and Usability Testing”; Zhang and Marchionini, Evaluation and Evolution; Hao Chen and Susan Dumais, Bringing Order to the Web: Automatically Categoriz- ing Search Results (New York: ACM, 2000): 145–52. 54. Uddin and Janecek, “Performance and Usability Testing.” 55. Ibid.; Pratt, Hearst, and Fagan, “A Knowledge-Based Approach”; Hsinchun Chen et al., “Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques,” Journal of the American Society for Information Science 49, no. 7 (1998): 582–603. 56. Vanda Broughton, “The Need for Faceted Classification as the Basis of All Methods of Information Retrieval,” Aslib Proceedings 58, no. 1/2 (2006): 49–71; Pratt, Hearst, and Fagan, “A Knowledge-Based Approach,” 80–85.; Chen et al., “Internet Browsing and Searching,” 582–603; Yee et al., “Faceted Metadata for Image Search and Browsing”; English et al., “Flexible Search and Navigation using Faceted Metadata.” 57. 
Uddin and Janecek, “Performance and Usability Testing”; Zhang and Marchionini, Evaluation and Evolution; Hideo Joho and Joemon M. Jose, Slicing and Dicing the Information Space Using Local Contexts (New York: ACM, 2006): 66–74.; Yee et al., “Faceted Metadata for Image Search and Browsing.” 58. Yee et al., “Faceted Metadata for Image Search and Brows- ing”; Chen and Dumais, Bringing Order to the Web. 59. Capra et al., “Effects of Structure and Interaction Style.” 60. Yee et al., “Faceted Metadata for Image Search and Brows- ing”; Capra et al., “Effects of Structure and Interaction Style”; Zhang and Marchionini, Evaluation and Evolution. 61. English et al., “Flexible Search and Navigation,” 7. 62. Yee et al., “Faceted Metadata for Image Search and Brows- ing,” 7. 63. La Barre, “Faceted Navigation and Browsing,” 85. 64. Zhang and Marchionini, Evaluation and Evolution, 183. 65. English et al., “Flexible Search and Navigation.” 66. Ibid., 6. 67. Zhang and Marchionini, Evaluation and Evolution. 68. Capra et al., “Effects of Structure and Interaction Style.” 69. Pratt, Hearst, and Fagan, “A Knowledge-Based Approach.” 70. English et al., “Flexible Search and Navigation.” 71. Bauer, “Yale University Library VuFind Test—Under- graduates.” 72. Jakob Nielsen, “Quantitative Studies: How Many Users to Test?” online posting, Alertbox, June 26, 2006 http://www.useit .com/alertbox/quantitative_testing.html (accessed Apr. 7, 2010). 73. Pratt, Hearst, and Fagan, “A Knowledge-Based Approach.” 74. Tod A. Olson used graduate students currently writing their dissertations. Olson, “Utility of a Faceted Catalog for Schol- arly Research,” Library Hi Tech 25, no. 4 (2007): 550–61. 75. La Barre, “Faceted Navigation and Browsing.” 76. Capra et al., “Effects of Structure and Interaction Style.” 27. Junliang Zhang and Gary Marchionini, Evaluation and Evolution of a Browse and Search Interface: Relation Browser++ (Atlanta, Ga.: Digital Government Society of North America, 2005): 179–88. 28. Ibid., 183. 29. Marti A. Hearst, “UIs for Faceted Navigation: Recent Advances and Remaining Open Problems,” 2008, http://people. ischool.berkeley.edu/~hearst/papers/hcir08.pdf (accessed Apr. 27, 2010). 30. Tamar Sadeh, “User Experience in the Library: A Case Study,” New Library World 109, no. 1/2 (Jan. 2008): 7–24. 31. Ibid., 22. 32. Jerilyn Veldof, e-mail from University of Minnesota Usability Services Lab, 2008. 33. Tod A. Olson, “Utility of a Faceted Catalog for Scholarly Research,” Library Hi Tech 25, no. 4 (2007): 550–61. 34. Ibid., 555. 35. Kathleen Bauer, “Yale University Library VuFind Test— Undergraduates,” May 20, 2008, http://www.library.yale.edu/ usability/studies/summary_undergraduate.doc (accessed Apr. 27, 2010); Kathleen Bauer and Alice Peterson-Hart, “Usability Test of VuFind as a Subject-Based Display of Ebooks,” Aug. 21, 2008, http://www.library.yale.edu/usability/studies/summary _medical.doc (accessed Apr. 27, 2010). 36. Bauer and Peterson-Hart, “Usability Test of VuFind as a Subject-Based Display of Ebooks,” 1. 37. Ibid., 2. 38. Ibid., 3. 39. Ibid. 40. Ibid., 4. 41. Ibid. 42. Ibid., 5. 43. Ibid., 8. 44. Kristin Antelman, Andrew K. Pace, and Emily Lynema, “Toward a Twenty-First Century Library Catalog,” Information Technology & Libraries 25, no. 3 (2006): 128–39. 45. Ibid., 139. 46. Ibid., 133. 47. Ibid., 135. 48. Ibid., 136. 49. Jennifer L. Ward, Steve Shadle, and Pam Mofield, “User Experience, Feedback, and Testing,” Library Technology Reports 44, no. 6 (Aug. 2008): 22. 50. 
English et al., “Flexible Search and Navigation.”
51. Peter Ingwersen and Irene Wormell, “Ranganathan in the Perspective of Advanced Information Retrieval,” Libri 42 (1992): 184–201; Winfried Godert, “Facet Classification in Online Retrieval,” International Classification 18, no. 2 (1991): 98–109; W. Godert, “Klassificationssysteme und Online-Katalog [Classification Systems and the Online Catalogue],” Zeitschrift für Bibliothekswesen und Bibliographie 34, no. 3 (1987): 185–95.
52. Yee et al., “Faceted Metadata for Image Search and Browsing”; English et al., “Flexible Search and Navigation.”

3145 ----
Reducing Psychological Resistance to Digital Repositories
Brian Quinn
Brian Quinn (brian.quinn@ttu.edu) is Social Sciences Librarian, Texas Tech University Libraries, Lubbock.

The potential value of digital repositories is dependent on the cooperation of scholars to deposit their work. Although many researchers have been resistant to submitting their work, the literature on digital repositories contains very little research on the psychology of resistance. This article looks at the psychological literature on resistance and explores what its implications might be for reducing the resistance of scholars to submitting their work to digital repositories. Psychologists have devised many potentially useful strategies for reducing resistance that might be used to address the problem; this article examines these strategies and how they might be applied.

Observing the development and growth of digital repositories in recent years has been a bit like riding an emotional roller coaster. Even the definition of what constitutes a repository may not be the subject of complete agreement, but for the purposes of this study, a repository is defined as an online database of digital or digitized scholarly works constructed for the purpose of preserving and disseminating scholarly research. The initial enthusiasm expressed by librarians and advocates of open access toward the potential of repositories to make significant amounts of scholarly research available to anyone with Internet access gradually gave way to a more somber appraisal of the prospects of getting faculty and researchers to deposit their work. In August 2007, Bailey posted an entry to his Digital Koans blog titled “Institutional Repositories: DOA?” in which he noted that building digital repository collections would be a long, arduous, and costly process.1 The success of repositories, in his view, will be a function not so much of technical considerations as of attitudinal ones. Faculty remain unconvinced that repositories are important, and there is a critical need for outreach programs that point to repositories as an important step in solving the crisis in scholarly communication.

Salo elaborated on Bailey’s post with “Yes, IRs Are Broken. Let’s Talk About It,” on her own blog, Caveat Lector. Salo points out that institutional repositories have not fulfilled their early promise of attracting a large number of faculty who are willing to submit their work. She criticizes repositories for monopolizing the time of library faculty and staff, and she states her belief that repositories will not work without deposit mandates, but that mandates are impractical.2

Subsequent events in the world of scholarly communication might suggest that mandates may be less impractical than Salo originally thought. Since her post, the National Institutes of Health mandate, the Harvard and MIT mandates, and other mandates such as the one instituted at Stanford’s School of Education, have come to pass, and the Registry of Open Access Repository Material Archiving Policies (ROARMAP) lists more than 120 mandates around the world that now exist.3 While it is too early to tell whether these developments will be successful in getting faculty to deposit their work in digital repositories, they at least establish a precedent that other institutions may follow. How many institutions follow and how effective the mandates will be once enacted remains to be seen. Will all colleges and universities, or even a majority, adopt mandates that require faculty to deposit their work in repositories? What of those that do not? Even if most institutions are successful in instituting mandates, will they be sufficient to obtain faculty cooperation? For those institutions that do not adopt mandates, how are they going to persuade faculty to participate in self-archiving, or even in some variation—such as having surrogates (librarians, staff, or graduate assistants) archive the work of faculty? Are mandates the only way to ensure faculty cooperation and compliance, or are mandates even necessarily the best way? To begin to adequately address the problem of user resistance to digital repositories, it might help to first gain some insight into the psychology of resistance.

The existing literature on user behavior with regard to digital repositories devotes scant attention to the psychology of resistance. In an article entitled “Institutional Repositories: Partnering with Faculty to Enhance Scholarly Communication,” Johnson discusses the inertia of the traditional publishing paradigm. He notes that this inertia is most evident in academic faculty. This would suggest that the problem of eliciting user cooperation is primarily motivational and that the problem is more one of indifference than active resistance.4 Heterick, in his article “Faculty Attitudes toward Electronic Resources,” suggests that one reason faculty may be resistant to digital repositories is that they do not fully trust them. In response to a survey he conducted, 48 percent of faculty felt that libraries should maintain paper archives.5 The implication is that digital repositories and archives may never completely replace hard copies in the minds of scholars. In “Understanding Faculty to Improve Content Recruitment for Institutional Repositories,” Foster and Gibbons point out that faculty complain of having too much work already. They resent any additional work that contributing to a digital repository might entail. Thus the authors echo Johnson in suggesting that faculty resistance

whether or not this was actually the case.11 This study also suggests that a combination of both cognitive and affective processes feed faculty resistance to digital repositories. It can be seen from the preceding review of the literature that several factors have been identified as being possible sources of user resistance to digital repositories. Yet the authors offer little in the way of strategies for addressing this resistance other than to suggest workaround solutions such as having nonscholars (e.g., librarians, graduate students, or clerical staff) serve as proxy for faculty and deposit their work for them, or to suggest that institutions mandate that faculty deposit their work.
Similarly, although numerous arguments have been made in favor of digital repositories and open access, they do not directly address the resistance issue.12 In contrast, psychologists have studied user resistance extensively and accumulated a body of research that may suggest ways to reduce resistance rather than try to circumvent it. It may be helpful to examine some of these studies to see what insights they might offer to help address the problem of user resistance. It should be pointed out that resistance as a topic has been addressed in the business and organizational literature, but has generally been approached from the standpoint of management and organizational change.13 This study has chosen to focus primarily on the psychol- ogy of resistance because many repositories are situated in a university setting. Unlike employees of a corporation, faculty members typically have a greater degree of auton- omy and latitude in deciding whether to accommodate new work processes and procedures into their existing routines, and the locus of change will therefore be more at an individual level. ■■ The psychology of user resistance Psychologists define resistance as a preexisting state or attitude in which the user is motivated to counter any attempts at persuasion. This motivation may occur on a cognitive, affective, or behavioral level. Psychologists thus distinguish between a state of not being persuaded and one in which there is actual motivation to not com- ply. The source of the motivation is usually an affective state, such as anxiety or ambivalence, which itself may result from cognitive problems, such as misunderstand- ing, ignorance, or confusion.14 It is interesting to note that psychologists have long viewed inertia as one form of resistance, suggesting paradoxically that a person can be motivated to inaction.15 Resistance may also manifest itself in more subtle forms that shade into indifference, suspicion of new work processes or technologies, and contentment with the status quo. may be attributed at least in part to motivation.6 In another article published a few months later, Foster and Gibbons suggest that the main reason faculty have been slow to deposit their work in digital repositories is a cog- nitive one: Faculty have not understood how they would benefit by doing so. The authors also mention that users may feel anxiety when executing the sequence of techni- cal steps needed to deposit their work, and that they may also worry about possible copyright infringement.7 The psychology of resistance may thus manifest itself in both cognitive and affective ways. Harley and her colleagues talk about faculty not perceiving any reward for depositing their work in their article “The Influence of Academic Values on Scholarly Publication and Communication Practices.” This percep- tion results in reduced drive to participate. Anxiety is another factor contributing to resistance: Faculty fear that their work may be vulnerable to plagiarism in an open- access environment.8 In “Towards User Responsive Institutional Repositories: a Case Study,” Devakos suggests that one source of user resistance is cognitive in origin. Scholars do not submit their work frequently enough to be able to navigate the interface from memory, so they must reinitiate the learning process each time they submit their work. 
The same is true for entering metadata for their work.9 Their sense of control may also be threatened by any limitations that may be imposed on substituting later iterations of their work for earlier versions. Davis and Connolly point to several sources of con- fusion, uncertainty, and anxiety among faculty in their article “Institutional Repositories: Evaluating the Reasons for Non-use of Cornell University’s Installation of DSpace.” Cognitive problems arise from having to learn new technology to deposit work and not knowing copy- right details well enough to know whether publishers would permit the deposit of research prior to publica- tion. Faculty wonder whether this might jeopardize their chances of acceptance by important journals whose edi- tors might view deposit as a form of prior publication that would disqualify them from consideration. There is also fear that the complex structure of a large repository may actually make a scholar’s work more difficult to find; fac- ulty may not understand that repositories are not isolated institutional entities but are usually searchable by major search engines like Google.10 Kim also identifies anxiety about plagiarism and confusion about copyright as being sources of faculty resistance in the article “Motivating and Impeding Factors Affecting Faculty Contribution to Institutional Repositories.” Kim found that plagiarism anxiety made some faculty only willing to deposit already-published work and that prepublication material was considered too risky. Faculty with no self-archiving experience also felt that many publishers do not allow self-archiving, reduciNG PsYcHoloGicAl resistANce to diGitAl rePositories | QuiNN 69 more open to information that challenges their beliefs and attitudes and are more open to suggestion.18 Thus before beginning a discussion of why users should deposit their research in repositories, it might help to first affirm the users’ self-concept. This could be done, for example, by reminding them of how unbiased they are in their work or how important it is in their work to be open to new ideas and new approaches, or how successful they have been in their work as scholars. The affirmation should be subtle and not directly related to the repository situation, but it should remind them that they are open- minded individuals who are not bound by tradition and that part of their success is attributable to their flexibility and adaptability. Once the users have been affirmed, librar- ians can then lead into a discussion of the importance of submitting scholarly research to repositories. Self-generated affirmations may be even more effec- tive. For example, another way to affirm the self would be to ask users to recall instances in which they successfully took a new approach or otherwise broke new ground or were innovative in some way. This could serve as a segue into a discussion of the repository as one more oppor- tunity to be innovative. Once the self-concept has been boosted, the threatening quality of the message will be perceived as less disturbing and will be more likely to receive consideration. A related strategy that psychologists employ to reduce resistance involves casting the user in the role of “expert.” This is especially easy to do with scholars because they are experts in their fields. 
Casting the user in the role of expert can deactivate resistance by putting that person in the persuasive role, which creates a form of role reversal.19 Rather than the librarian being seen as the persuader, the scholar is placed in that role. By saying to the scholar, “You are the expert in the area of communicating your research to an audience, so you would know better why the digital repository is an alternative that deserves con- sideration once you understand how it works and how it may benefit you,” you are empowering the user. Casting the user as an expert imparts a sense of control to the user. It helps to disable resistance by placing the user in a posi- tion of being predisposed to agree to the role he or she is being cast in, which also makes the user more prone to agree with the idea of using a digital repository. Priming and imaging One important discovery that psychologists have made that has some bearing on user resistance is that even subtle manipulations can have a significant effect on one’s judgments and actions. In an interesting experiment, psy- chologists told a group of students that they were to read an online newspaper, ostensibly to evaluate its design and assess how easy it was to read. Half of them read an editorial discussing a public opinion survey of youth ■■ Negative and positive strategies for reducing resistance Just as the definition of resistance can be paradoxical, so too may be some of the strategies that psychologists use to address it. Perhaps the most basic example is to coun- ter resistance by acknowledging it. When scholars are presented with a message that overtly states that digital repositories are beneficial and desirable, it may simultane- ously generate a covert reaction in the form of resistance. Rather than simply anticipating this and attempting to ignore it, digital repository advocates might be more persuasive if they acknowledge to scholars that there will likely be resistance, mention some possible reasons (e.g., plagiarism or copyright concerns), and immediately intro- duce some counterrationales to address those reasons.16 Psychologists have found that being up front and forthcoming can reduce resistance, particularly with regard to the downside of digital repositories. They have learned that it can be advantageous to preemptively reveal negative information about something so that it can be downplayed or discounted. Thus talking about the weak- nesses or shortcomings of digital repositories as early as possible in an interaction may have the effect of making these problems seem less important and weakening user resistance. Not only does revealing negative information impart a sense of honesty and credibility to the user, but psychologists have found that people feel closer to people who reveal personal information.17 A librarian could thus describe some of his or her own frustrations in using repositories as an effective way of establishing rapport with resistant users. The unexpected approach of bring- ing up the less desirable aspects of repositories—whether this refers to the technological steps that must be learned to submit one’s work or the fact that depositing one’s work in a repository is not a guarantee that it will be highly cited—can be disarming to the resistant user. This is particularly true of more resistant users who may have been expecting a strong hard-sell approach on the part of librarians. 
When suddenly faced with a more candid appeal the user may be thrown off balance psychologi- cally, leaving him or her more vulnerable to information that is the opposite of what was anticipated and to pos- sibly viewing that information in a more positive light. If one way to disarm a user is to begin by discuss- ing the negatives, a seemingly opposite approach that psychologists take is to reinforce the user’s sense of self. Psychologists believe that one source of resistance stems from when a user’s self-concept—which the user tries to protect from any source of undesired change—has been threatened in one way or another. A stable self-concept is necessary for the user to maintain a sense of order and predictability. Reinforcing the self-concept of the user should therefore make the user less likely to resist depos- iting work in a digital repository. Self-affirmed users are 70 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 or even possibly collaborating on research. Their imagina- tions could be further stimulated by asking them to think of what it would be like to have their work still actively preserved and available to their successors a century from now. Using the imagining strategy could potentially be significantly more effective in attenuating resistance than presenting arguments based on dry facts. identification and liking Conscious processes like imagining are not the only psy- chological means of reducing the resistance of users to digital repositories. Unconscious processes can also be helpful. One example of such a process is what psycholo- gists refer to as the “liking heuristic.” This refers to the tendency of users to employ a rule-of-thumb method to decide whether to comply with requests from persons. This tendency results from users constantly being inun- dated with requests. Consequently, they need to simplify and streamline the decision-making process that they use to decide whether to cooperate with a request. The liking heuristic holds that users are more likely to help some- one they might otherwise not help if they unconsciously identify with the person. At an unconscious level, the user may think that a person acts like them and dresses like them, and therefore the user identifies with that person and likes them enough to comply with their request. In one experiment that psychologists conducted to see if people are more likely to comply with requests from people that they identify with, female undergraduates were informed that they would be participating in a study of first impressions. The subjects were instructed that they and a person in another room would each learn a little about one another without meeting each other. Each sub- ject was then given a list of fifty adjectives and was asked to select the twenty that were most characteristic of them- selves. The experimenter then told the participants that they would get to see each other’s lists. The experimenter took the subject’s list and then returned a short time later with what supposedly was the other participant’s list, but was actually a list that the experimenter had filled out to indicate that either the subject had much in common with the other participant’s personality (seventeen of twenty matches), some shared attributes (ten of twenty matches), or relatively few characteristics in common (three of twenty matches). The subject was then asked to exam- ine the list and fill out a survey that probed their initial impressions of the other participant, including how much they liked them. 
At the end of the experiment, the two subjects were brought together and given credit for par- ticipating. The experimenter soon left the room and the confederate participant asked the other participant if she would read and critically evaluate an eight-page paper for an English class. The results of the experiment indi- cated that the more the participant thought she shared in consumer patterns that highlighted functional needs, and the other half read a similar editorial focusing on hedo- nistic needs. The students next viewed an ad for a new brand of shampoo that featured either a strong or a weak argument for the product. The results of the experiment indicated that students who read the functional editorial and were then subsequently exposed to the strong argu- ment for the shampoo (a functional product) had a much more favorable impression of the brand than students who had received the mismatched prime.20 While it may seem that the editorial and the shampoo were unrelated, psychologists found that the subjects engaged in a process of elaborating the editorial, which then predisposed them to favor the shampoo. The presence of elaboration, which is a precursor to the development of attitudes, suggests that librarians could reduce users’ resistance to digital repositories by first involving them in some form of priming activity immediately prior to any attempt to persuade them. For example, asking faculty to read a brief case study of a scholar who has benefited from involvement in open-access activity might serve as an effective prime. Another example might be to listen briefly to a speaker summarizing the individual, disciplinary, and societal benefits of sharing one’s research with colleagues. Interventions like these should help mitigate any predispo- sition toward resistance on the part of users. Imagining is a strategy related to priming that psy- chologists have found to be effective in reducing resistance. Taking their cue from insurance salesmen—who are trained to get clients to actively imagine what it would be like to lose their home or be in an accident—a group of psycholo- gists conducted an experiment in which they divided a sample of homeowners who were considering the purchase of cable TV into two groups. One group was presented with the benefits of cable in a straightforward, informative way that described various features. The other group was asked to imagine themselves enjoying the benefits and all the possible channels and shows that they might experi- ence and how entertaining it might be. The psychologists then administered a questionnaire. The results indicated that those participants who were asked to imagine the benefits of cable were much more likely to want cable TV and to subscribe to it than were those who were only given information about cable TV.21 In other words, imagining resulted in more positive attitudes and beliefs. This study suggests that librarians attempting to reduce resistance among users of digital repositories may need to do more than merely inform or describe to them the advan- tages of depositing their work. They may need to ask users to imagine in vivid detail what it would be like to receive periodic reports indicating that their work had been down- loaded dozens or even hundreds of times. 
Librarians could ask them to imagine receiving e-mail or calls from col- leagues indicating that they had accessed their work in the repository and were interested in learning more about it, reduciNG PsYcHoloGicAl resistANce to diGitAl rePositories | QuiNN 71 students typically overestimate the amount of drinking that their peers engage in at parties. These inaccurate nor- mative beliefs act as a negative influence, causing them to imbibe more because they believe that is what their peers are doing. By informing students that almost three- quarters of their peers have less than three drinks at social gatherings, psychologists have had some success in reduc- ing excessive drinking behavior by students.23 The power of normative messages is illustrated by a recent experiment conducted by a group of psycholo- gists who created a series of five cards to encourage hotel guests to reuse their towels during their stay. The psychologists hypothesized that by appealing to social norms, they could increase compliance rates. To test their hypothesis, the researchers used a different conceptual appeal for each of the five cards. One card appealed to environmental concerns (“Help Save the Environment”), another to environmental cooperation (“Partner with Us to Save the Environment”), a third card appealed to the advantage to the hotel (“Help the Hotel Save Energy”), a fourth card targeted future generations (“Help Save Resources for Future Generations”), and a final card appealed to guests by making reference to a descrip- tive norm of the situation (“Join Your Fellow Citizens in Helping to Save the Environment”). The results of the study indicated that the card that mentioned the benefit to the hotel was least effective in getting guests to reuse their towels, and the card that was most effective was the one that mentioned that descriptive norm.24 This research suggests that if users who are resistant to submitting their work to digital repositories were informed that a larger percentage of their peers were depositing work than they realized, resistance may be reduced. This might prove to be particularly true if they learned that prominent or influential scholars were engaged in popu- lating repositories with their work. This would create a social-norms effect that would help legitimize repositories to other faculty and help them to perceive the submission process as normal and desirable. The idea that accom- plished researchers are submitting materials and reaping the benefits might prove very attractive to less experienced and less well-regarded faculty. Psychologists have a considerable body of evidence in the area of social modeling that suggests that people will imitate the behavior of others in social situations because that behavior provides an implicit guideline of what to do in a similar situation. A related finding is that the more influential people are, the more likely it is for others to emulate their actions. This is even more probable for high- status individuals who are skilled and attractive and who are capable of communicating what needs to be done to potential followers.25 Social modeling addresses both the cognitive dimension of how resistant users should behave and also the affective dimension by offering models that serve as a source of motivation to resistant users to change common with the confederate, the more she liked her. 
The more she liked the confederate and experienced a percep- tion of consensus, the more likely she was to comply with her request to critique the paper.22 Thus, when trying to overcome the resistance of users to depositing their work in a digital repository, it might make sense to consider who it is that is making the request. Universities sometimes host scholarly communi- cation symposia that are not only aimed at getting faculty interested in open-access issues, but to urge them to sub- mit their work to the institution’s repositories. Frequently, speakers at these symposia consist of academic administra- tors, members of scholarly communication or open-access advocacy organizations, or individuals in the library field. The research conducted by psychologists, however, sug- gests that appeals to scholars and researchers would be more effective if they were made by other scholars and those who are actively engaged in research. Faculty are much more likely to identify with and cooperate with requests from their own tribe, as it were, and efforts need to be concentrated on getting faculty who are involved in and understand the value of repositories to articulate this to their colleagues. Researchers who can personally testify to the benefits of depositing their work are most likely to be effective at convincing other researchers of the value of doing likewise and will be more effective at reducing resis- tance. Librarians need to recognize who their potentially most effective spokespersons and advocates are, which the psychological research seems to suggest is faculty talking to other faculty. Perceived consensus and social modeling The processes of faculty identification with peers and perceived consensus mentioned above can be further enhanced by informing researchers that other scholars are submitting their work, rather than merely telling research- ers why they should submit their work. Information about the practices of others may help change beliefs because of the need to identify with other in-group members. This is particularly true of faculty, who are prone to making con- tinuous comparisons with their peers at other institutions and who are highly competitive by nature. Once they are informed of the career advantages of depositing their work (in terms of professional visibility, collaboration opportuni- ties, etc.), and they are informed that other researchers have these advantages, this then becomes an impetus for them to submit their work to keep up with their peers and stay competitive. A perception of consensus is thus fostered—a feeling that if one’s peers are already depositing their work, this is a practice that one can more easily agree to. Psychologists have leveraged the power of identifi- cation by using social-norms research to inform people about the reality of what constitutes normative behavior as opposed to people’s perceptions of it. For example, college 72 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 highly resistant users that may be unwilling to submit their work to a repository. Rather than trying to prepare a strong argument based on reason and logic, psychologists believe that using a narrative approach may be more effective. This means conveying the facts about open access and digital repositories in the form of a story. Stories are less rhetori- cal and tend not to be viewed by listeners as attempts at persuasion. 
The intent of the communicator and the coun- terresistant message are not as overt, and the intent of the message might not be obvious until it has already had a chance to influence the listener. A well-crafted narrative may be able to get under the radar of the listener before the listener has a chance to react defensively and revert to a mode of resistance. In a narrative, beliefs are rarely stated overtly but are implied, and implied beliefs are more diffi- cult to refute than overtly stated beliefs. Listening to a story and wondering how it will turn out tends to use up much of the cognitive attentional capacity that might otherwise be devoted to counterarguing, which is another reason why using a narrative approach may be particularly effec- tive with users who are strongly resistant. The longer and more subtle nature of narratives may also make them less a target of resistance than more direct arguments.28 Using a narrative approach, the case for submitting work to a repository might be presented not as a collection of dry facts or statistics, but rather as a story. The pro- tagonists are the researchers, and their struggle is to obtain recognition for their work and to advance scholarship by providing maximum access to the greatest audience of scholars and to obtain as much access as possible to the work of their peers so that they can build on it. The pro- tagonists are thwarted in their attempts to achieve their ends by avaricious publishers who obtain the work of researchers for free and then sell it back to them in the form of journal and database subscriptions and books for exor- bitant prices. These prices far exceed the rate of inflation or the budgets of universities to pay for them. The publishers engage in a series of mergers and acquisitions that swal- low up small publishing firms and result in the scholarly publishing enterprise being controlled by a few giant firms that offer unreasonable terms to users and make unreason- able demands when negotiating with them. Presented in this dramatic way, the significance of scholar participation in digital repositories becomes magnified to an extent that it becomes more difficult to resist what may almost seem like an epic struggle between good and evil. And while this may be a greatly oversimplified example, it nonetheless provides a sense of the potential power of using a narrative approach as a technique to reduce resistance. Introducing a time element into the attempt to per- suade users to deposit their work in digital repositories can play an important role in reducing resistance. Given that faculty are highly competitive, introducing the idea not only that other faculty are submitting their work but that they are already benefiting as a result makes the their behavior in the desired direction. redefinition, consistency, and depersonalization Another strategy that psychologists use to reduce resis- tance among users is to change the definition of the situation. Resistant users see the process of submitting their research to the repository as an imposition at best. In their view, the last thing that they need is another obliga- tion or responsibility to burden their already busy lives. Psychologists have learned that reframing a situation can reduce resistance by encouraging the user to look at the same phenomenon in a different way. 
In the current situ- ation, resistant users should be informed that depositing their work in a digital repository is not a burden but a way to raise their professional profile as researchers, to expose their work to a wider audience, and to heighten their visibility among not only their peers but a much larger potential audience that would be able to encounter their work on the Web. Seen in this way, the additional work of submission is less of a distraction and more of a career investment. Moreover, this approach leverages a related psycho- logical concept that can be useful in helping to dissolve resistance. Psychologists understand that inconsistency has a negative effect on self-esteem, so persuading users to believe that submitting their work to a digital repository is consistent with their past behavior can be motivating.26 The point needs to be emphasized with researchers that the act of submitting their work to a digital repository is not something strange and radical, but is consistent with prior actions intended to publicize and promote their work. A digital repository can be seen as analogous to a preprint, book, journal, or other tangible and familiar vehicles that faculty have used countless times to send their work out into the world. While the medium might have changed, the intention and the goal are the same. Reframing the act of depositing as “old wine in new bottles” may help to undermine resistance. In approaching highly resistant individuals, psycholo- gists have discovered that it is essential to depersonalize any appeal to change their behavior. Instead of saying, “You should reduce your caloric intake,” it is better to say, “It is important for people to reduce their caloric intake.” This helps to deflect and reduce the directive, judgmental, and prescriptive quality of the request, thus making it less likely to provoke resistance.27 Suggestion can be much less threatening than prescription among users who may be suspicious and mistrusting. Reverting to a third-per- son level of appeal may allow the message to get through without it being immediately rejected by the user. Narrative, timing, and anticipation Psychologists recommend another strategy to help defuse reduciNG PsYcHoloGicAl resistANce to diGitAl rePositories | QuiNN 73 technological platforms, and so on. This could be fol- lowed by a reminder to users that it is their choice—it is entirely up to them. This reminder that users have the freedom of choice may help to further counter any resis- tance generated as a result of instructions or inducements to anticipate regret. Indeed, psychologists have found that reinstating a choice that was previously threatened can result in greater compliance than if the threat had never been introduced.32 Offering users the freedom to choose between alterna- tives tends to make them more likely to comply. This is because having a choice enables users to both accept and resist the request rather than simply focus all their resis- tance on a single alternative. When presented with options, the user is able to satisfy the urge to resist by rejecting one option but is simultaneously motivated to accept another option; the user is aware that there are benefits to comply- ing and wants to take advantage of them but also wants to save face and not give in. 
By being offered several alterna- tives that nonetheless all commit to a similar outcome, the user is able to resist and accept at the same time.33 For example, one alternative option to self-archiving might be to present the faculty member with the option of an author- pays publishing model. The choice of alternatives allows the faculty member to be selective and discerning so that a sense of satisfaction is derived from the ability to resist by rejecting one alternative. At the same time, the librar- ian is able to gain compliance because one of the other alternatives that commits the faculty member to depositing research is accepted. options, comparisons, increments, and guarantees In addition to offering options, another way to erode user resistance to digital repositories is to use a comparative strategy. One technique is to first make a large request, such as “we would like you to submit all the articles that you have published in the last decade to the repository,” and then follow this with a more modest request, such as “we would appreciate it if you would please deposit all the articles you have published in the last year.” The origi- nal request becomes an “anchor” or point of reference in the mind of the user against which the subsequent request is then evaluated. Setting a high anchor lessens user resis- tance by changing the user’s point of comparison of the second request from nothing (not depositing any work in the repository) to a higher value (submitting a decade of work). In this way, a high reference anchor is established for the second request, which makes it seem more reason- able in the newly created context of the higher value.34 The user is thus more likely to comply with the second request when it is framed in this way. Using this comparative approach may also work because it creates a feeling of reciprocity in the user. When proposition much more salient. It not only suggests that submitting work is a process that results in a desirable outcome, but that the earlier one’s work is submitted, the more recognition will accrue and the more rapidly one’s career will advance.29 Faculty may feel compelled to submit their work in an effort to remain competitive with their colleagues. One resource that may be par- ticularly helpful for working with skeptical faculty who want substantiation about the effect of self-archiving on scholarly impact is a bibliography created by the Open Citation Project titled, “The Effect of Open Access and Downloads (Hits) on Citation Impact: A Bibliography of Studies.”30 It provides substantial documentation of the effect that open access has on scholarly visibility. An additional stimulus might be introduced in conjunction with the time element in the form of a download report. Showing faculty how downloads accumulate over time is analogous to arguments that investment counselors use showing how interest on investments accrues and compounds over time. This investment analogy creates a condition in which hesitating to submit their work results in faculty potentially losing recognition and compromis- ing their career advancement. An interesting related finding by psychologists sug- gests that an effective way to reduce user resistance is to have users think about the future consequences of complying or not complying. In particular, if users are asked to anticipate the amount of future regret they might experience for making a poor choice, this can significantly reduce the amount of resistance to complying with a request. 
Normally, users tend not to ruminate about the possibility of future disappointment in making a decision. If users are made to anticipate future regret, however, they will act in the present to try to minimize it. Studies conducted by psychologists show that when users are asked to anticipate the amount of future regret that they might experience for choosing to comply with a request and having it turn out adversely versus choosing to not comply and having it turn out adversely, they consis- tently indicate that they would feel more regret if they did not comply and experienced negative consequences as a result.31 In an effort to minimize this anticipated regret, they will then be more prone to comply. Based on this research, one strategy to reduce user resistance to digital repositories would be to get users to think about the future, specifically about future regret resulting from not cooperating with the request to sub- mit their work. If they feel that they might experience more regret in not cooperating than in cooperating, they might then be more inclined to cooperate. Getting users to think about the future could be done by asking users to imagine various scenarios involving the negative out- comes of not complying, such as lost opportunities for recognition, a lack of citation by peers, lost invitations to collaborate, an inability to migrate one’s work to future 74 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 submit their work. Mandates rely on authority rather than persuasion to accomplish this and, as such, may represent a less-than-optimal solution to reducing user resistance. Mandates represent a failure to arrive at a meeting of the minds of advocates of open access, such as librarians, and the rest of the intellectual community. Understanding the psychology of resistance is an important prerequisite to any effort to reduce it. Psychologists have assembled a significant body of research on resistance and how to address it. Some of the strategies that the research suggests may be effective, such as discussing resistance itself with users and talk- ing about the negative effects of repositories, may seem counterintuitive and have probably not been widely used by librarians. Yet when other more conventional tech- niques have been tried with little or no success, it may make sense to experiment with some of these approaches. Particularly in the academy, where reason is supposed to prevail over authority, incorporating resistance psychol- ogy into a program aimed at soliciting faculty research seems an appropriate step before resorting to mandates. Most strategies that librarians have used in trying to persuade faculty to submit their work have been con- ventional. They are primarily of a cognitive nature and are variations on informing and educating faculty about how repositories work and why they are important. Researchers have an important affective dimension that needs to be addressed by these appeals, and the psy- chological research on resistance suggests that a strictly rational approach may not be sufficient. By incorporating some of the seemingly paradoxical and counterintuitive techniques discussed earlier, librarians may be able to penetrate the resistance of researchers and reach them at a deeper, less rational level. Ideally, a mixture of rational and less-conventional approaches might be combined to maximize effectiveness. Such a program may not elimi- nate resistance but could go a long way toward reducing it. 
Future studies that test the effectiveness of such pro- grams will hopefully be conducted to provide us with a better sense of how they work in real-world settings. References 1. Charles W. Bailey Jr., “Institutional Repositories: DOA?,” online posting, Digital Koans, Aug. 22, 2007, http://digital -scholarship.org/digitalkoans/2007/08/21/institutional -repositories-doa/ (accessed Apr. 21, 2010). 2. Dorothea Salo, “Yes, IRs Are Broken. Let’s Talk About It,” online posting, Caveat Lector, Sept. 5, 2007, http://cavlec. yarinareth.net/2007/09/05/yes-irs-are-broken-lets-talk-about -it/ (accessed Apr. 21, 2010). 3. EPrints Services, ROARMAP (Registry of Open Access Repository Material Archiving Policies) http://www.eprints .org/openaccess/policysignup/ (accessed July 28, 2009). 4. Richard K. Johnson, “Institutional Repositories: Partnering the requester scales down the request from the large one to a smaller one, it creates a sense of obligation on the part of the user to also make a concession by agreeing to the more modest request. The cultural expectation of reciprocity places the user in a situation in which they will comply with the lesser request to avoid feelings of guilt.35 For the most resistant users, breaking the request down into the smallest possible increment may prove helpful. By making the request seem more manageable, the user is encouraged to comply. Psychologists conducted an experi- ment to test whether minimizing a request would result in greater cooperation. They went door-to-door, soliciting contributions to the American Cancer Society, and received donations from 29 percent of households. They then made additional solicitations, this time asking, “Would you contribute? Even a penny will help!” Using this approach, donations increased to 50 percent. Even though the solici- tors only asked for a penny, the amounts of the donations were equal to that of the original request. By asking for “even a penny,” the solicitors made the request appear to be more modest and less of a target of resistance.36 Librarians might approach faculty by saying “if you could even submit one paper we would be grateful,” with the idea that once faculty make an initial submission they will be more inclined to submit more papers in the future. One final strategy that psychological research sug- gests may be effective in reducing resistance to digital repositories is to make sure that users understand that the decision to deposit their work is not irrevocable. With any new product, users have fears about what might hap- pen if they try it and they are not satisfied with it. Not knowing the consequences of making a decision that they may later regret fuels reluctance to become involved with it. Faculty need to be reassured that they can opt out of participating at any time and that the repository sponsors will guarantee this. This guarantee needs to be repeated and emphasized as much as possible in the solicitation process so that faculty are frequently reminded that they are entering into a decision that they can reverse if they so decide. Having this reassurance should make research- ers much less resistant to submitting their work, and the few faculty who may decide that they want to opt out are worth the reduction in resistance.37 The digital repository is a new phenomenon that faculty are unfamiliar with, and it is therefore important to create an atmosphere of trust. The guarantee will help win that trust. 
■■ Conclusion
The scholarly literature on digital repositories has given little attention to the psychology of resistance. Yet the ultimate success of digital repositories depends on overcoming the resistance of scholars and researchers to submit their work.

20. Curtis P. Haugtvedt et al., “Consumer Psychology and Attitude Change,” in Knowles and Linn, Resistance and Persuasion, 283–96.
21. Larry W. Gregory, Robert B. Cialdini, and Kathleen M. Carpenter, “Self-Relevant Scenarios as Mediators of Likelihood Estimates and Compliance: Does Imagining Make It So?” Journal of Personality & Social Psychology 43, no. 1 (1982): 89–99.
22. Jerry M. Burger, “Fleeting Attraction and Compliance with Requests,” in The Science of Social Influence: Advances and Future Progress, ed. Anthony R. Pratkanis (New York: Psychology Pr., 2007): 155–66.
23. John D. Clapp and Anita Lyn McDonald, “The Relationship of Perceptions of Alcohol Promotion and Peer Drinking Norms to Alcohol Problems Reported by College Students,” Journal of College Student Development 41, no. 1 (2000): 19–26.
24. Noah J. Goldstein and Robert B. Cialdini, “Using Social Norms as a Lever of Social Influence,” in The Science of Social Influence: Advances and Future Progress, ed. Anthony R. Pratkanis (New York: Psychology Pr., 2007): 167–90.
25. Dale H. Schunk, “Social-Self Interaction and Achievement Behavior,” Educational Psychologist 34, no. 4 (1999): 219–27.
26. Rosanna E. Guadagno et al., “When Saying Yes Leads to Saying No: Preference for Consistency and the Reverse Foot-in-the-Door Effect,” Personality & Social Psychology Bulletin 27, no. 7 (2001): 859–67.
27. Mary Jiang Bresnahan et al., “Personal and Cultural Differences in Responding to Criticism in Three Countries,” Asian Journal of Social Psychology 5, no. 2 (2002): 93–105.
28. Melanie C. Green and Timothy C. Brock, “In the Mind’s Eye: Transportation-Imagery Model of Narrative Persuasion,” in Narrative Impact: Social and Cultural Foundations, ed. Melanie C. Green, Jeffrey J. Strange, and Timothy C. Brock (Mahwah, N.J.: Lawrence Erlbaum, 2004): 315–41.
29. Oswald Huber, “Time Pressure in Risky Decision Making: Effect on Risk Defusing,” Psychology Science 49, no. 4 (2007): 415–26.
30. The Open Citation Project, “The Effect of Open Access and Downloads (‘Hits’) on Citation Impact: A Bibliography of Studies,” July 17, 2009, http://opcit.eprints.org/oacitation-biblio.html (accessed July 29, 2009).
31. Matthew T. Crawford et al., “Reactance, Compliance, and Anticipated Regret,” Journal of Experimental Social Psychology 38, no. 1 (2002): 56–63.
32. Nicolas Gueguen and Alexandre Pascual, “Evocation of Freedom and Compliance: The ‘But You Are Free of . . .’ Technique,” Current Research in Social Psychology 5, no. 18 (2000): 264–70.
33. James P. Dillard, “The Current Status of Research on Sequential Request Compliance Techniques,” Personality & Social Psychology Bulletin 17, no. 3 (1991): 283–88.
34. Thomas Mussweiler, “The Malleability of Anchoring Effects,” Experimental Psychology 49, no. 1 (2002): 67–72.
35. Robert B. Cialdini and Noah J. Goldstein, “Social Influence: Compliance and Conformity,” Annual Review of Psychology 55 (2004): 591–621.
36. James M. Wyant and Stephen L. Smith, “Getting More by Asking for Less: The Effects of Request Size on Donations of Charity,” Journal of Applied Social Psychology 17, no. 4 (1987): 392–400.
37. Lydia J.
Price, “The Joint Effects of Brands and Warranties in Signaling New Product Quality,” Journal of Economic Psychol- ogy 23, no. 2 (2002): 165–90. with Faculty to Enhance Scholarly Communication,” D-Lib Mag- azine 8, no. 11 (2002), http://www.dlib.org/dlib/november02/ johnson/11johnson.html (accessed Apr. 2, 2008). 5. Bruce Heterick, “Faculty Attitudes Toward Electronic Resources,” Educause Review 37, no. 4 (2002): 10–11. 6. Nancy Fried Foster and Susan Gibbons, “Understanding Faculty to Improve Content Recruitment for Institutional Repos- itories,” D-Lib Magazine 11, no. 1 (2005), http://www.dlib.org/ dlib/january05/foster/01foster.html (accessed July 29, 2009). 7. Suzanne Bell, Nancy Fried Foster, and Susan Gibbons, “Reference Librarians and the Success of Institutional Reposito- ries,” Reference Services Review 33, no. 3 (2005): 283–90. 8. Diane Harley et al., “The Influence of Academic Values on Scholarly Publication and Communication Practices,” Center for Studies in Higher Education, Research & Occasional Paper Series: CSHE.13.06, Sept. 1, 2006, http://repositories.cdlib.org/ cshe/CSHE-13-06/ (accessed Apr. 17, 2008). 9. Rea Devakos, “Towards User Responsive Institutional Repositories: A Case Study,” Library High Tech 24, no. 2 (2006): 173–82. 10. Philip M. Davis and Matthew J. L. Connolly, “Institutional Repositories: Evaluating the Reasons for Non-Use of Cornell University’s Installation of DSpace,” D-Lib Magazine 13, no. 3/4 (2007), http://www.dlib.org/dlib/march07/davis/03davis .html (accessed July 29, 2009). 11. Jihyun Kim, “Motivating and Impeding Factors Affecting Faculty Contribution to Institutional Repositories,” Journal of Digital Information 8, no. 2 (2007), http://journals.tdl.org/jodi/ article/view/193/177 (accessed July 29, 2009). 12. Peter Suber, “Open Access Overview” online posting, Open Access News: News from the Open Access Environment, June 21, 2004, http://www.earlham.edu/~peters/fos/overview .htm (accessed 29 July 2009). 13. See, for example, Jeffrey D. Ford and Laurie W. Ford, “Decoding Resistance to Change,” Harvard Business Review 87, no. 4 (2009): 99–103.; John P. Kotter and Leonard A. Schlesinger, “Choosing Strategies for Change,” Harvard Business Review 86, no. 7/8 (2008): 130–39; and Paul R. Lawrence, “How to Deal with Resistance to Change,” Harvard Business Review 47, no. 1 (1969): 4–176. 14. Julia Zuwerink Jacks and Maureen E. O’Brien, “Decreas- ing Resistance by Affirming the Self,” in Resistance and Per- suasion, ed. Eric S. Knowles and Jay A. Linn (Mahwah, N.J.: Lawrence Erlbaum, 2004): 235–57. 15. Benjamin Margolis, “Notes on Narcissistic Resistance,” Modern Psychoanalysis 9, no. 2 (1984): 149–56. 16. Ralph Grabhorn et al., “The Therapeutic Relationship as Reflected in Linguistic Interaction: Work on Resistance,” Psycho- therapy Research 15, no. 4 (2005): 470–82. 17. Arthur Aron et al., “The Experimental Generation of Interpersonal Closeness: A Procedure and Some Preliminary Findings,” Personality & Social Psychology Bulletin 23, no. 4 (1997): 363–77. 18. Geoffrey L. Cohen, Joshua Aronson, and Claude M. Steele, “When Beliefs Yield to Evidence: Reducing Biased Evaluation by Affirming the Self,” Personality & Social Psychology Bulletin 26, no. 9 (2000): 1151–64. 19. Anthony R. Pratkanis, “Altercasting as an Influence Tac- tic,” in Attitudes, Behavior and Social Context: The Role of Norms and Group Membership, ed. Deborah J. Terry and Michael A.Hogg (Mahwah, N.J.: Lawrence Erlbaum, 2000): 201–26. 
President's Message: Join Us at the Forum!

Michelle Frisque

Michelle Frisque (mfrisque@northwestern.edu) is LITA President 2009–10 and Head, Information Systems, Northwestern University, Chicago.

The first LITA National Forum I attended was in Milwaukee, Wisconsin. It seems like it was only a couple of years ago, but in fact nine National Forums have since passed. I was a new librarian, and I went on a lark when a colleague invited me to attend and let me crash in her room for free. I am so glad I took her up on the offer because it was one of the best conferences I have ever attended. It was the first conference that I felt was made up of people like me, people who shared my interests in technology within the library. The programming was a good mix of practical know-how and mind-blowing possibilities. My understanding of what was possible was greatly expanded, and I came home excited and ready to try out the new things I had learned.

Almost eight years passed before I attended my next Forum in Cincinnati, Ohio. After half a day I wondered why I had waited so long. The program was diverse, covering a wide range of topics. I remember being depressed and outraged at the current state of Internet access in the United States as reported by the Office for Information Technology Policy. I felt that surge of recognition when I discovered that other universities were having a difficult time documenting and tracking the various systems they run and maintain. I was inspired by David Lankes's talk, "Obligations of Leadership." If you missed it you can still hear it online. It is linked from the LITA Blog (http://www.litablog.org).

While the next Forum may seem like a long way off to you, it is in the forefront of my mind. The National Forum 2010 Planning Committee is busy working to make sure this Forum lives up to the reputation of Forums past. This year's Forum takes place in Atlanta, Georgia, September 30–October 3. The theme is "The Cloud and the Crowd." Program proposals are due February 19, so I cannot give you specifics about the concurrent sessions, but we do hope to have presentations about projects, plans, or discoveries in areas of library-related technology involving emerging cloud technologies; software-as-service, as well as social technologies of various kinds; using virtualized or cloud resources for storage or computing in libraries; library-specific open-source software (OSS) and other OSS "in" libraries; technology on a budget; using crowdsourcing and user groups for supporting technology projects; and training via the crowd.

Each accepted program is scheduled to maximize the impact for each attendee. Programming ranges from five-minute lightning talks to full-day preconferences. In addition, on the basis of attendee comments from previous Forums, we have also decided to offer thirty- and seventy-five-minute concurrent sessions. These concurrent sessions will be a mix of traditional single- or multi-speaker formats, panel discussions, case studies, and demonstrations of projects. Finally, poster sessions will also be available.

While programs such as the keynote speakers, lightning talks, and concurrent sessions are an important part of the Forum experience, so is the opportunity to network with other attendees.
I know I have learned just as much talking with a group of people in the hall between sessions, during lunch, or at the networking dinners as I have sitting in the programs. Not only is it a great opportunity to catch up with old friends, you will also have the opportunity to make new ones. For instance, at the 2009 National Forum in Salt Lake City, Utah, approximately half of the people who attended were first-time attendees. The National Forum is an intimate event whose attendance ranges between 250 and 400 people, thus making it easy to forge personal connections. Attendees come from a variety of settings, including academic, public, and special libraries; library-related organizations; and vendors. If you want to meet the attendees in a more formal setting you can attend a networking dinner organized on-site by LITA members. This year the dinners were organized by the LITA president, LITA past president, LITA president-elect, and a LITA director-at-large.

If you have not attended a National Forum or it has been a while, I hope I have piqued your interest in coming to the next National Forum in Atlanta. Registration will open in May! The most up-to-date information about the 2010 Forum is available at the LITA website (http://www.lita.org). I know that even after my LITA presidency is a distant memory, I will still make time to attend the LITA National Forum. I hope to see you there!

Web Services and Widgets for Library Information Systems

Godmar Back and Annette Bailey

As more libraries integrate information from web services to enhance their online public displays, techniques that facilitate this integration are needed. This paper presents a technique for such integration that is based on HTML widgets. We discuss three example systems (Google Book Classes, Tictoclookup, and MAJAX) that implement this technique. These systems can be easily adapted without requiring programming experience or expensive hosting.

To improve the usefulness and quality of their online public access catalogs (OPACs), more and more librarians include information from additional sources into their public displays.1 Examples of such sources include Web services that provide additional bibliographic information, social bookmarking and tagging information, book reviews, alternative sources for bibliographic items, table-of-contents previews, and excerpts. As new Web services emerge, librarians quickly integrate them to enhance the quality of their OPAC displays. Conversely, librarians are interested in opening the bibliographic, holdings, and circulation information contained in their OPACs for inclusion into other Web offerings they or others maintain. For example, by turning their OPAC into a Web service, subject librarians can include up-to-the-minute circulation information in subject or resource guides. Similarly, university instructors can use an OPAC's metadata records to display citation information ready for import into citation management software on their course pages. The ability to easily create such "mash-up" pages is crucial for increasing the visibility and reach of the digital resources libraries provide.

Although the technology to use Web services to create mash-ups is well known, several practical requirements must be met to facilitate its widespread use. First, any environment providing for such integration should be easy to use, even for librarians with limited programming background. This ease of use must extend to environments that include proprietary systems, such as vendor-provided OPACs. Second, integration must be seamless and customizable, allowing for local display preferences and flexible styling. Third, the setup, hosting, and maintenance of any necessary infrastructure must be low-cost and should maximize the use of already available or freely accessible resources. Fourth, performance must be acceptable, both in terms of latency and scalability.2

Godmar Back (gback@cs.vt.edu) is Assistant Professor, Department of Computer Science, and Annette Bailey (afbailey@vt.edu) is Assistant Professor, University Libraries, Virginia Tech University, Blacksburg.
In this paper we discuss the design space of methods for integrating information from Web services into websites. We focus primarily on client-side mash-ups, in which code running in the user's browser contacts Web services directly without the assistance of an intermediary server or proxy. To create such mash-ups, we advocate the use of "widgets," which are easy-to-use, customizable HTML elements whose use does not require programming knowledge. Although the techniques we discuss apply to any Web-based information system, we specifically consider how an OPAC can become both the target of Web services integration and also a Web service that provides information to be integrated elsewhere. We describe three widget libraries we have developed, which provide access to four Web services. These libraries have been deployed by us and others. Our contributions are twofold: We give practitioners an insight into the trade-offs surrounding the appropriate choice of mash-up model, and we present the specific designs and use examples of three concrete widget libraries librarians can directly use or adapt. All software described in this paper is available under the LGPL Open Source License.

■■ Background

Web-based information systems use a client-server architecture in which the server sends HTML markup to the user's browser, which then renders this HTML and displays it to the user. Along with HTML markup, a server may send JavaScript code that executes in the user's browser. This JavaScript code can in turn contact the original server or additional servers and include information obtained from them into the rendered content while it is being displayed. This basic architecture allows for myriad possible design choices and combinations for mash-ups. Each design choice has implications for ease of use, customizability, programming requirements, hosting requirements, scalability, latency, and availability.
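To make the pattern just described concrete, the following is a minimal, hypothetical sketch of client-side inclusion: JavaScript delivered with the page requests additional data and inserts it into the rendered content. The service URL, element id, and response fields are illustrative assumptions and are not part of any system discussed in this paper.

// Illustration only: a same-domain service and element id invented for this sketch.
function addReviewCount(targetId) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/services/review-count?isbn=0596000278", true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      var data = JSON.parse(xhr.responseText);
      // Insert the retrieved value into the page while it is being displayed.
      document.getElementById(targetId).textContent = data.count + " reviews";
    }
  };
  xhr.send();
}
addReviewCount("review-count");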
Server-side mash-ups

In a server-side mash-up design, shown in figure 1, the mash-up server contacts the base server and each source when it receives a request from a client. It combines the information received from the base server and the sources and sends the combined HTML to the client. Server-side mash-up systems that combine base and mash-up servers are also referred to as data mash-up systems. Such data mash-up systems typically provide a Web-based configuration front-end that allows users to select data sources, specify the manner in which they are combined, and to create a layout for the entire mash-up. Examples of such systems include Dapper and Yahoo! Pipes.3 These systems require very little programming knowledge, but they limit mash-up creators to the functionality supported by a particular system and do not allow the user to leverage the layout and functionality of an existing base server, such as an existing OPAC.

Figure 1. Server-side mash-up construction

Integrating server-side mash-up systems with proprietary OPACs as the base server is difficult because the mash-up server must parse the OPAC's output before integrating any additional information. Moreover, users must now visit—or be redirected to—the URL of the mash-up server. Although some emerging extensible OPAC designs provide the ability to include information from external sources directly and easily, most currently deployed systems do not.4 In addition, those mash-up servers that do usually require server-side programming to retrieve and integrate the information coming from the mash-up sources into the page. The availability of software libraries and the use of special purpose markup languages may mitigate this requirement in the future.

From a performance scalability point of view, the mash-up server is a bottleneck in server-side mash-ups and therefore must be made large enough to handle the expected load of end-user requests. On the other hand, the caching of data retrieved from mash-up sources is simple to implement in this arrangement because only the mash-up server contacts these sources. Such caching reduces the frequency with which requests have to be sent to sources if their data is cacheable, that is, if real-time information is not required.

The latency in this design is the sum of the time required for the client to send a request to the mash-up server and receive a reply, plus the processing time required by the server, plus the time incurred by sending a request and receiving a reply from the last responding mash-up source. This model assumes that the mash-up server contacts all sources in parallel, or as soon as the server knows that information from a source should be included in a page.

The availability of the system depends on the availability of all mash-up sources. If a mash-up source does not respond, the end user must wait until such failure is apparent to the mash-up server via a timeout. Finally, because the mash-up server acts as a client to the base and source servers, no additional security considerations apply with respect to which sources may be contacted. There also are no restrictions on the data interchange format used by source servers as long as the mash-up server is able to parse the data returned.

Client-side mash-ups

In a client-side setup, shown in figure 2, the base server sends only a partial website to the client, along with JavaScript code that instructs the client which other sources of information to contact. When executed in the browser, this JavaScript code retrieves the information from the mash-up sources directly and completes the mash-up.

Figure 2. Client-side mash-up construction

The primary appeal of client-side mashing is that no mash-up server is required, and thus the URL that users visit does not change. Consequently, the mash-up server is no longer a bottleneck. Equally important, no maintenance is required for this server, which is particularly relevant when libraries use turnkey solutions that restrict administrative access to the machine housing their OPAC. On the other hand, without a mash-up server, results from mash-up sources can no longer be centrally cached. Thus the mash-up sources themselves must be sufficiently scalable to handle the expected number of requests. As a load-reducing strategy, mash-up sources can label their results with appropriate expiration times to influence the caching of results in the clients' browsers.

Availability is increased because the mash-up degrades gracefully if some of the mash-up sources fail, since the information from the remaining sources can still be displayed to the user. Assuming that requests are sent by the client in parallel or as soon as possible, and assuming that each mash-up source responds with similar latency to requests sent by the user's browser as to requests sent by a mash-up server, the latency for a client-side mash-up is similar to the server-side mash-up. However, unlike in the server-side approach, the page designer has the option to display partial results to the user while some requests are still in progress, or even to delay sending some requests until the user explicitly requests the data by clicking on a link or other element on the page.

Because client-side mash-ups rely on JavaScript code to contact Web services directly, they are subject to a number of restrictions that stem from the security model governing the execution of JavaScript code in current browsers. This security model is designed to protect the user from malicious websites that could exploit client-side code and abuse the user's credentials to retrieve HTML or XML data from other websites to which a user has access. Such malicious code could then relay this potentially sensitive data back to the malicious site. To prevent such attacks, the security model allows the retrieval of HTML text or XML data only from sites within the same domain as the origin site, a policy commonly known as same-origin policy. In figure 2, sources A and B come from the same domain as the page the user visits.

The restrictions of the same-origin policy can be avoided by using the JavaScript Object Notation (JSON) interchange format.5 Because client-side code may retrieve and execute JavaScript code served from any domain, Web services that are not co-located with the origin site can make their results available using JSON. Doing so facilitates their inclusion into any page, independent of the domain from which it is served (see source C in figure 2). Many existing Web services already provide an option to return data in JSON format, perhaps along with other formats such as XML. For Web services that do not, a proxy server may be required to translate the data coming from the service into JSON. If the implementation of a proxy server is not feasible, the Web service is usable only on pages within the same domain as the website using it.
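As an illustration of this JSON-with-callback approach (commonly called JSONP), the sketch below loads the remote service's response as a script rather than as XML or HTML, so the same-origin policy does not apply. It uses the Google Book Search request format shown in table 1; the element id "jacket" and the callback name are assumptions made for the example.

// Callback invoked by the script the remote service returns (see table 1).
function process(result) {
  var info = result["ISBN:0596000278"];
  if (info && info.thumbnail_url) {
    var img = document.createElement("img");
    img.src = info.thumbnail_url;
    document.getElementById("jacket").appendChild(img);
  }
}

// Request the data by injecting a <script> element; unlike XMLHttpRequest,
// this is permitted across domains under the same-origin policy.
var script = document.createElement("script");
script.src = "http://books.google.com/books?bibkeys=ISBN:0596000278" +
             "&jscmd=viewapi&callback=process";
document.getElementsByTagName("head")[0].appendChild(script);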
Client-side mash-ups lend themselves naturally to enhancing the functionality of existing, proprietary OPAC systems, particularly when a vendor provides only limited extensibility. Because they do not require server-side programming, the absence of a suitable vendor-provided server-side programming interface does not prevent their creation. Oftentimes, vendor-provided templates or variables can be suitably adapted to send the necessary HTML markup and JavaScript code to the client.

The amount of JavaScript code a librarian needs to write (or copy from a provided example) determines both the likelihood of adoption and the maintainability of a given mash-up creation. The less JavaScript code there is to write, the larger the group of librarians who feel comfortable trying and adopting a given implementation. The approach of using HTML widgets hides the use of JavaScript almost entirely from the mash-up creator. HTML widgets represent specially composed markup, which will be replaced with information coming from a mash-up source when the page is rendered. Because the necessary code is contained in a JavaScript library, adapters do not need to understand programming to use the information coming from the Web service. Finally, HTML widgets are also preferable for JavaScript-savvy users because they create a layer of abstraction over the complexity and browser dependencies inherent in JavaScript programming.
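The following is a simplified sketch, not the actual code of any of the libraries described below, of what such a widget library does when the page loads: it scans the document for annotated placeholder elements, reads the identifier from the <title> attribute, and fills each element once the Web service responds. The lookupMetadata function stands in for whatever mechanism (such as the script injection shown earlier) retrieves the JSON data.

// Simplified widget processing: find placeholders, look up data, fill them in.
function processThumbnailWidgets(lookupMetadata) {
  var spans = document.getElementsByTagName("span");
  for (var i = 0; i < spans.length; i++) {
    var span = spans[i];
    // Only spans carrying the gbs-thumbnail class (described in the next
    // section) are touched in this sketch.
    if (!/\bgbs-thumbnail\b/.test(span.className)) continue;
    var id = span.title;                 // e.g., "ISBN:0596000278"
    lookupMetadata(id, (function (target) {
      return function (info) {           // called when the service responds
        if (info && info.thumbnail_url) {
          var img = document.createElement("img");
          img.src = info.thumbnail_url;
          target.appendChild(img);
        }
      };
    })(span));
  }
}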
■■ The Google Book Classes Widget Library

To illustrate our approach, we present a first example that allows the integration of data obtained from Google Book Search into any website, including OPAC pages. Google Book Search provides access to Google's database of book metadata and contents. Because of the company's book scanning activities as well as through agreements with publishers, Google hosts scanned images of many book jackets as well as partial or even full previews for some books. Many libraries are interested in either using the book jackets when displaying OPAC records or alerting their users if Google can provide a partial or full view of an item a user selected in their catalog, or both.6 This service can help users decide whether to borrow the book from the library.

The Google Book Search Dynamic Link API

The Google Book Search Dynamic Link API is a JSON-based Web service through which Google provides certain metadata for items it has indexed. It can be queried using bibliographic identifiers such as ISBN, OCLC number, or Library of Congress Control Number (LCCN). It returns a small set of data that includes the URL of a book jacket thumbnail image, the URL of a page with bibliographic information, the URL of a preview page (if available), as well as information about the extent of any preview and whether the preview viewer can be embedded directly into other pages. Table 1 shows the JSON result returned for an example ISBN.

Table 1. Sample request and response for the Google Book Search Dynamic Link API

Request:
http://books.google.com/books?bibkeys=ISBN:0596000278&jscmd=viewapi&callback=process

JSON Response:
process({
  "ISBN:0596000278": {
    "bib_key": "ISBN:0596000278",
    "info_url": "http://books.google.com/books?id=ezqe1hh91q4C\x26source=gbs_ViewAPI",
    "preview_url": "http://books.google.com/books?id=ezqe1hh91q4C\x26printsec=frontcover\x26source=gbs_ViewAPI",
    "thumbnail_url": "http://bks4.books.google.com/books?id=ezqe1hh91q4C\x26printsec=frontcover\x26img=1\x26zoom=5\x26sig=ACfU3U2d1UsnXw9BAQd94U2nc3quwhJn2A",
    "preview": "partial",
    "embeddable": true
  }
});

Widgetization

To facilitate the easy integration of this service into websites without JavaScript programming, we developed a widget library. From the adapter's perspective, the use of these widgets is extremely simple. The adapter places HTML <span> or <div> tags into the page where they want data from Google Book Search to display. These tags contain an HTML <title> attribute that acts as an identifier to describe the bibliographic item for which information should be retrieved. It may contain its ISBN, OCLC number, or LCCN. In addition, the tags also contain one or more HTML <class> attributes to describe which processing should be done with the information retrieved from Google to integrate it into the page. These classes can be combined with a list of traditional CSS classes in the <class> attribute to apply further style and formatting control.
Examples

As an example, consider the following HTML an adapter may use in a page:

<span title="ISBN:0596000278" class="gbs-thumbnail gbs-link-to-preview"></span>

When processed by the Google Book Classes widget library, the class "gbs-thumbnail" instructs the widget to embed a thumbnail image of the book jacket for ISBN 0596000278, and "gbs-link-to-preview" provides instructions to wrap the <span> tag in a hyperlink pointing to Google's preview page. The result is as if the server had contacted Google's Web service and constructed the HTML shown in example 1 in table 2, but the mash-up creator does not need to be concerned with the mechanics of contacting Google's service and making the necessary manipulations to the document.

Example 2 in table 2 demonstrates a second possible use of the widget. In this example, the creator's intent is to display an image that links to Google's information page if and only if Google provides at least a partial preview for the book in question. This goal is accomplished by placing the image inside the span and using style="display:none" to make the span initially invisible. The span is made visible only if a preview is available at Google, displaying the hyperlinked image. The full list of features supported by the Google Book Classes widget library can be found in table 3.

Table 2. Example of client-side processing by the Google Book Classes widget library

Example 1: HTML Written by Adapter
<span title="ISBN:0596000278" class="gbs-thumbnail gbs-link-to-preview"></span>

Example 1: Resultant HTML after Client-Side Processing
<a href="http://books.google.com/books?id=ezqe1hh91q4C&printsec=frontcover&source=gbs_ViewAPI">
  <span title="" class="gbs-thumbnail gbs-link-to-preview">
    <img src="http://bks3.books.google.com/books?id=ezqe1hh91q4C&printsec=frontcover&img=1&zoom=5&sig=ACfU3U2d1UsnXw9BAQd94U2nc3quwhJn2A" />
  </span>
</a>

Example 2: HTML Written by Adapter
<span style="display: none" title="ISBN:0596000278" class="gbs-link-to-info gbs-if-partial-or-full">
  <img src="http://www.google.com/intl/en/googlebooks/images/gbs_preview_button1.gif" />
</span>

Example 2: Resultant HTML after Client-Side Processing
<a href="http://books.google.com/books?id=ezqe1hh91q4C&source=gbs_ViewAPI">
  <span title="" class="gbs-link-to-info gbs-if-partial-or-full">
    <img src="http://www.google.com/intl/en/googlebooks/images/gbs_preview_button1.gif" />
  </span>
</a>

Table 3. Supported Google Book classes

gbs-thumbnail: Include an <img...> embedding the thumbnail image
gbs-link-to-preview: Wrap span/div in link to preview at Google Book Search (GBS)
gbs-link-to-info: Wrap span/div in link to info page at GBS
gbs-link-to-thumbnail: Wrap span/div in link to thumbnail at GBS
gbs-embed-viewer: Directly embed a viewer for book's content into the page, if possible
gbs-if-noview: Keep this span/div only if GBS reports that book's viewability is "noview"
gbs-if-partial-or-full: Keep this span/div only if GBS reports that book's viewability is at least "partial"
gbs-if-partial: Keep this span/div only if GBS reports that book's viewability is "partial"
gbs-if-full: Keep this span/div only if GBS reports that book's viewability is "full"
gbs-remove-on-failure: Remove this span/div if GBS doesn't return book information for this item

Integration with legacy OPACs

The approach described thus far assumes that the mash-up creator has sufficient control over the HTML markup that is sent to the user. This assumption does not always hold if the HTML is produced by a vendor-provided system, since such systems automatically generate most of the HTML used to display OPAC search results or individual bibliographic records. If the OPAC provides an extension system, such as a facility to embed customized links to external resources, it may be used to generate the necessary HTML by utilizing variables (e.g., "@#ISBN@" for ISBN numbers) set by the OPAC software.

If no extension facility exists, accommodations by the widget library are needed to maintain the goal of not requiring any programming on the part of the adapter. We implemented such accommodations to facilitate the use of Google Book Classes within a III Millennium OPAC.7 We used magic strings such as "ISBN:millennium.record" in a <title> attribute to instruct the widget library to harvest the ISBN from the current page via screen scraping. Figure 3 provides an example of how a Google Book Classes widget can be integrated into an OPAC search results page.
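As a rough illustration of the screen-scraping fallback just described, the sketch below finds an ISBN-like string in the record display and rewrites the magic-string placeholder so that normal widget processing can proceed. It is not the actual Google Book Classes code; the regular expression and the rewriting step are assumptions made for this example.

// Illustration only: locate something that looks like an ISBN in the page,
// then rewrite the magic-string placeholder so normal widget processing
// can take over.
function harvestIsbn() {
  var text = document.body.innerHTML;
  var match = text.match(/\b(?:97[89])?\d{9}[\dX]\b/); // crude ISBN-10/13 pattern
  return match ? match[0] : null;
}

var isbn = harvestIsbn();
if (isbn) {
  var spans = document.getElementsByTagName("span");
  for (var i = 0; i < spans.length; i++) {
    if (spans[i].title === "ISBN:millennium.record") {
      spans[i].title = "ISBN:" + isbn;
    }
  }
}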
■■ The Tictoclookup Widget Library

The ticTOCs Journal Table of Contents Service is a free online service that allows academic researchers and other users to keep up with newly published research by giving them access to thousands of journal tables of contents from multiple publishers.8 The ticTOCs consortium compiles and maintains a dataset that maps ISSNs and journal titles to RSS-feed URLs for the journals' tables of contents.
the tictoclookup web service We used the ticTOCs dataset to create a simple JSON Web service called “Tictoclookup” that returns RSS-feed URLs when queried by ISSN and, optionally, by journal title. Table 4 shows an example query and response. To accommodate different hosting scenarios, we created two implementations of this Tictoclookup: a standalone and a cloud-based implementation. The standalone version is implemented as a Python Web application conformant to the Web Services Gateway Interface (WSGI) specification. Hosting this version requires access to a Web server that supports a WSGI- compatible environment, such as Apache’s mod_wsgi. The Python application reads the ticTOCs dataset and responds to lookup requests for specific ISSNs. A cron job downloads the most up-to-date version of the dataset periodically. The cloud version of the Tictoclookup service is implemented as a Google App Engine (GAE) applica- tion. It uses the highly scalable and highly available GAE Datastore to store ticTOCs data records. GAE applications run on servers located in Google’s regional data centers so that requests are handled by a data center geographically close to the requesting client. As of June 2009, Google hosting of GAE applications is free, which includes a free allotment of several computational resources. For each application, GAE allows quotas of up to 1.3 MB requests and the use of up to 10 GB of bandwidth per twenty-four- hour period. Although this capacity is sufficient for the purposes of many small- and medium-size institutions, additional capacity can be purchased at a small cost. widgetization To facilitate the easy integration of this service into websites without JavaScript programming, we developed a widget library. Like Google Book Classes, this widget library is controlled via HTML attributes associated with HTML <span> or <div> tags that are placed into the page where the user decides to display data from the Tictoclookup service. The HTML <title> attribute identifies the journal by its ISSN or its ISSN and title. As with Google Book Classes, Figure 3. Sample use of Google Book Classes in an OPAC results page Table 4. Sample request and response for ticTOCs lookup Web service Request: http://tictoclookup.appspot.com/0028-0836?title=Nature&jsoncallback=process JSON Response: process({ “lastmod”: “Wed Apr 29 05:42:36 2009”, “records”: [{ “title”: “Nature”, “rssfeed”: http://www.nature.com/nature/current_issue/rss }], “issn”: “00280836” }); 82 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 the HTML <class> attribute describes the desired process- ing, which may contain traditional CSS classes. example Consider the following HTML an adapter may use in a page: <span style=“display:none” class=“tictoc-link tictoc-preview tictoc-alternate-link” title=“ISSN:00280836: Nature”> Click to subscribe to Table of Contents for this journal </span> When processed by the Tictoclookup widget library, the class “tictoc-link” instructs the widget to wrap the span in a link to the RSS feed at which the table of con- tent is published, allowing users to subscribe to it. The class “tictoc-preview” associates a tooltip element with the span, which displays the first entries of the feed when the user hovers over the link. We use the Google Feeds API, another JSON-based Web service, to retrieve a cached copy of the feed. The “tictoc-alternate-link” class places an alternate link into the current document, which in some browsers triggers the display of the RSS feed icon Figure 4. 
Sample use of tictoclookup classes in the status bar. The <span> element, which is initially invisible, is made visible if and only if the Tictoclookup service returns information for the given pair of ISSN and title. Figure 4 provides a screenshot of the display if the user hovers over the link. As with Google Book Classes, the mash-up creator does not need to be concerned with the mechanics of contacting the Tictoclookup Web service and making the necessary manipulations to the document. Table 5 provides a com- plete overview of the classes Tictoclookup supports. integration with legacy oPAcs Similar to the Google Book Classes widget library, we implemented provisions that allow the use of Tictoclookup classes on pages over which the mash-up creator has limited control. For instance, specifying a title attribute of “ISSN:millennium.issnandtitle” harvests the ISSN and journal title from the III Millennium’s record display page. ■■ MAJAX Whereas the widget libraries discussed thus far integrate external Web services into an OPAC display, MAJAX is a widget library that integrates information coming from an OPAC into other pages, such as resource guides or course displays. MAJAX is designed for use with a III Millennium Integrated Library System (ILS) whose vendor does not provide a Web-services interface. The tech- niques we used, however, extend to other OPACs as well. Like many Table 5. Supported Tictoclookup classes Tictoclookup Class Meaning tictoc-link tictoc-preview tictoc-embed-n tictoc-alternate-link tictoc-append-title Wrap span/div in link to table of contents Display tooltip with preview of current entries Embed preview of first n entries Insert <link rel=“alternate”> into document Append the title of the journal to the span/div weB services ANd widGets For liBrArY iNFormAtioN sYstems | BAck ANd BAileY 83 legacy OPACs, Millennium does not only lack a Web-services interface, but lacks any programming interface to the records contained in the system and does not provide access to the database or file system of the machine housing the OPAC. Providing oPAc data as a web service We implemented two methods to access records from the Millennium OPAC using bibliographic identifi- ers such as ISBN, OCLC number, bibliographic record number, and item title. Both methods provide access to complete MARC records and holdings information, along with locations and real-time availability for each held item. MAJAX extracts this information via screen- scraping from the MARC record display page. As with all screen-scraping approaches, the code performing the scraping must be updated if the output format provided by the OPAC changes. In our experience, such changes occur at a frequency of less than once per year. The first method, MAJAX 1, implements screen scrap- ing using JavaScript code that is contained in a document placed in a directory on the server (/screens), which is normally used for supplementary resources, such as images. This document is included in the target page as a hidden HTML <iframe> element (see frame B in figure 2). Consequently, the same-domain restriction applies to the code residing in it. MAJAX 1 can thus be used only on pages within the same domain—for instance, if the OPAC is housed at opac.library.university.edu, MAJAX 1 may be used on all pages within *.university.edu (not merely *.library.university.edu). The key advantage of MAJAX 1 is that no additional server is required. 
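To make the screen-scraping step concrete, the sketch below shows one way holdings and availability information could be pulled out of a fetched record display. It is an illustration under assumptions about the vendor's markup (holdings presented as table rows with location, call number, and status cells), not the actual MAJAX code.

// Illustrative scraping of holdings rows from an OPAC record display.
// The assumption that holdings appear as table rows with three cells is
// specific to this sketch and would need adjusting to the real markup.
function scrapeHoldings(recordHtml) {
  var container = document.createElement("div");
  container.innerHTML = recordHtml;
  var holdings = [];
  var rows = container.getElementsByTagName("tr");
  for (var i = 0; i < rows.length; i++) {
    var cells = rows[i].getElementsByTagName("td");
    if (cells.length >= 3) {
      holdings.push({
        location: cells[0].textContent || cells[0].innerText,
        callNumber: cells[1].textContent || cells[1].innerText,
        status: cells[2].textContent || cells[2].innerText
      });
    }
  }
  return holdings;
}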
The second method, MAJAX 2, uses an intermediary server that retrieves the data from the OPAC, translates it to JSON, and returns it to the client. This method, shown in figure 5, returns JSON data and therefore does not suffer from the same-domain restriction. However, it requires hosting the MAJAX 2 Web service. Like the Tictoclookup Web service, we implemented the MAJAX 2 Web service using Python conformant to WSGI. A single installation can support multiple OPACs. widgetization The MAJAX widget library allows the integration of both MAJAX 1 and MAJAX 2 data into websites without JavaScript programming. The <span> tags function as placeholders, and <title> and <class> attributes describe the desired processing. MAJAX provides a number of “MAJAX classes,” multiple of which can be specified. These classes allow a mash-up creator to insert a large variety of bibliographic information, such as the val- ues of MARC fields. Classes are also provided to insert fully formatted, ready-to-copy bibliographic references in Harvard style, live circulation information, links to the catalog record, links to online versions of the item (if applicable), a ready-to-import RIS description of the item, and even images of the book cover. A list of classes MAJAX supports is provided in table 6. examples Figure 6 provides an example use of MAJAX widgets. Four <span> tags expand into the book cover, a complete Harvard-style reference, the valid of a specific MARC field (020), and a display of the current availability of the item, wrapped in a link to the catalog record. Texts such as “copy is available” shown in figure 6 are localizable. Even though there are multiple MAJAX <span> tags that refer to the same ISBN, the MAJAX widget library will contact the MAJAX 1 or MAJAX 2 Web service only once per identifier, independent of how often it is used in a page. To manage the load, the MAJAX client site library can be configured to not exceed a maximum number of requests per second, per client. All software described in this paper is available under the LGPL Open Source License. The MAJAX libraries have been used by us and others for about two years. For instance, the “New Books” list in our library uses MAJAX 1 to provide circulation information. Faculty members at our institution are using MAJAX to enrich their course websites. A number of libraries have adopted MAJAX 1, which is particularly easy to host because no additional server is required. ■■ Related work Most ILSs in use today do not provide suitable Web-services interfaces to access either bibliographic information Figure 5. Architecture of the MAJAX 2 Web service 84 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 or availability data.9 This shortcoming is addressed by multiple initiatives. The ILS Discovery Interface task force (ILS-DI) created a set of rec- ommendations that facilitate the integration of discovery interfaces with legacy ILSs, but does not define a concrete API.10 Related, the ISO 20775 Holdings standard describes an XML schema to describe the availability of items across sys- tems, but does not describe an API for accessing them.11 Many ILSs provide a Z39.50 interface in addition to their HTML- based Web OPACs, but Z39.50 does not provide standardized holdings and availability.12 Nevertheless, there is hope within the community that ILS vendors will react to their customers’ needs and provide Web-services interfaces that implement these recommenda- tions. 
The Jangle project provides an API and an implementation of the ILS-DI recommendations through a Representations State Transfer (REST)–based interface that uses the Atom Publishing Protocol (APP).13 Jangle can be linked to legacy ILSs via connec- tors. The use of the XML-based APP prevents direct access from client-side JavaScript code, how- ever. In the future, adoption and widespread implementation of the W3C working draft on cross- origin resource sharing may relax the same-origin restriction in a controlled fashion, and thus allow access to APP feeds from JavaScript across domains.14 Screen-scraping is a common technique used to over- come the lack of Web-services interfaces. For instance, OCLC’s WorldCat Local product obtains access to avail- ability information from legacy ILSs in a similar fashion as our MAJAX 2 service.15 Whereas the Web services used or created in our work exclusively use a REST-based model and return data in JSON format, interfaces based on SOAP (formerly Simple Object Access Protocol) whose semantics are described by a WSDL specification provide an alternative if access from within client-side JavaScript code is not required.16 HTML Written by Adapter <table width=“340”><tr><td> <span class=“majax-syndetics-vtech” title=“i1843341662”></span> </td><td> <span class=“majax-harvard-reference” title=“i1843341662”></span> <br /> ISBN: <span class=“majax-marc-020” title=“i1843341662”></span> <br /> <span class=“majax-linktocatalogmajax-showholdings” title=“i1843341662”></span> </td></tr></table> Display in Browser after Processing Dahl, Mark., Banerjee, Kyle., Spalti, Michael., 2006, Digital libraries : integrating content and systems / Oxford, Chandos Publishing, xviii, 203 p. ISBN: 1843341662 (hbk.) 1 copy is available Figure 6. Example use of MAJAX widgets OCLC Grid Services provides REST-based Web-services interfaces to several databases, including the WorldCat Search API and identifier services such as xISBN, xISSN, and xOCLCnum for FRBR-related metadata.17 These ser- vices support XML and JSON and could benefit from widgetization for easier inclusion into client pages. The use of HTML markup to encode processing instructions is common in JavaScript frameworks, such as YUI or Dojo, which use <div> elements with custom- defined attributes (so-called expando attributes) for this purpose.18 Google Gadgets uses a similar technique as well.19 The widely used Context Objects in Spans (COinS) specification exploits <span> tags to encode OpenURL Table 6. Selected MAJAX classes MAJAX Class Replacement majax-marc-FFF-s majax-marc-FFF majax-syndetics-* majax-showholdings majax-showholdings-brief majax-endnote majax-ebook majax-linktocatalog majax-harvard-reference majax-newline majax-space MARC field FFF, subfields concatenation of all subfields in field FFF book cover image current holdings and availability information …in brief format RIS version of record link to online version, if any link to record in catalog reference in Harvard style newline space weB services ANd widGets For liBrArY iNFormAtioN sYstems | BAck ANd BAileY 85 techniques for the seamless inclusion of information from Web services into websites. We considered the cases where an OPAC is either the target of such integra- tion or the source of the information being integrated. We focused on client-side techniques in which each user’s browser contacts Web services directly because this approach lends itself to the creation of HTML widgets. 
These widgets allow the integration and customization of Web services without requiring programming. Therefore nonprogrammers can become mash-up creators. We described in detail the functionality and use of several widget libraries and Web services we built. Table 7 provides a summary of the functionality and hosting requirements for each system discussed. Although the specific requirements for each system differ because of their respective nature, all systems are designed to be deployable with minimum effort and resource require- ments. This low entry cost, combined with the provision of a high-level, nonprogramming interface, constitute two crucial preconditions for the broad adoption of mash-up techniques in libraries, which in turn has the potential to context objects in pages for processing by client-side extension.20 LibraryThing uses client-side mash-up tech- niques to incorporate a social tagging service into OPAC pages.21 Although their technique uses a <div> ele- ment as a placeholder, it does not allow customization via classes—the changes to the content are encoded in custom-generated JavaScript code for each library that subscribes to the service. The Juice Project shares our goal of simplifying the enrichment of OPAC pages with content from other sources.22 It provides a set of reusable components that is directed at JavaScript programmers, not librarians. In the computer-science community, multiple emerg- ing projects investigate how to simplify the creation of server-side data mash-ups by end user programmers.23 ■■ Conclusion This paper explored the design space of mash-up Table 7. Summary of features and requirements for the widget libraries presented in this paper Majax 1 Majax 2 Google Book Classes Tictoclookup Classes Web Service Screen Scraping III Record Display JSON Proxy for III Record Display Google Book Search Dynamic Link API books.google.com ticTOC Cloud Application tictoclookup .appspot.com Hosted By Existing Millennium Installation /screens WSGI/Python Script on libx.lib.vt.edu Google, Inc. Google, Inc. via Google App Engine Data Provenance Your OPAC Your OPAC Google JISC (www.tictocs .ac.uk) Additional Cost N/A Can use libx.lib.vt.edu for testing, must run WSGI-enabled web server in production Free, but subject to Google Terms of Service Generous free quota, pay per use beyond that Same Domain Restriction Yes No No No Widgetization majax.js: class-based: majax- classes gbsclasses.js:class- based: gbs- tictoc.js:class-based: tictoc- Requires JavaScript programming No No No No Requires Additional Server No Yes (Apache+mod_wsgi) No No (if using GAE), else need Apache+mod_wsgi III Bibrecord Display N/A N/A Yes Yes III WebBridge Integration Yes Yes Yes Yes 86 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 vastly increase the reach and visibility of their electronic resources in the wider community. References 1. Nicole Engard, ed., Library Mashups—Exploring New Ways to Deliver Library Data (Medford, N.J.: Information Today, 2009); Andrew Darby and Ron Gilmour, “Adding Delicious Data to Your Library Website,” Information Technology & Libraries 28, no. 2 (2009): 100–103. 2. Monica Brown-Sica, “Playing Tag in the Dark: Diagnosing Slowness in Library Response Time,” Information Technologies & Libraries 27, no. 4 (2008): 29–32. 3. Dapper, “Dapper Dynamic Ads,” http://www.dapper .net/ (accessed June 19, 2009); Yahoo!, “Pipes,” http://pipes .yahoo.com/pipes/ (accessed June 19, 2009). 4. 
Jennifer Bowen, “Metadata to Support Next-Genera- tion Library Resource Discovery: Lessons from the Extensible Catalog, Phase 1,” Information Technology & Libraries 27, no. 2 (2008): 6–19; John Blyberg, “ILS Customer Bill-of-Rights,” online posting, Blyberg.net, Nov. 20, 2005, http://www.blyberg .net/2005/11/20/ils-customer-bill-of-rights/ (accessed June 18, 2009). 5. Douglas Crockford, “The Application/JSON Media Type for JavaScript Object Notation (JSON),” memo, The Inter- net Society, July 2006, http://www.ietf.org/rfc/rfc4627.txt (accessed Mar. 30, 2010). 6. Google, “Who’s Using the Book Search APIs?” http:// code.google.com/apis/books/casestudies/ (accessed June 16, 2009). 7. Innovative Interfaces, “Millennium ILS,” http://www.iii .com/products/millennium_ils.shtml (accessed June 19, 2009). 8. Joint Information Systems Committee, “TicTOCs Jour- nal Tables of Contents Service,” http://www.tictocs.ac.uk/ (accessed June 18, 2009). 9. Mark Dahl, Kyle Banarjee, and Michael Spalti, Digital Libraries: Integrating Content and Systems (Oxford, United King- dom: Chandos, 2006). 10. John Ockerbloom et al., “DLF ILS Discovery Interface Task Group (ILS-DI) Technical Recommendation,” (Dec. 8, 2008), http://diglib.org/architectures/ilsdi/DLF_ILS_ Discovery_1.1.pdf (accessed June 18, 2009). 11. International Organization for Standardization, “Information and Documentation—Schema for Holdings Information,” http://www.iso.org/iso/catalogue_detail .htm?csnumber=39735 (accessed June 18, 2009) 12. National Information Standards Organization, “ANSI/ NISO Z39.50—Information Retrieval: Application Service Defi- nition and Protocol Specification,” (Bethesda, Md.: NISO Pr., 2003), http://www.loc.gov/z3950/agency/Z39-50-2003.pdf (accessed May 31, 2010). 13. Ross Singer and James Farrugia, “Unveiling Jangle: Untangling Library Resources and Exposing Them through the Atom Publishing Protocol,” The Code4Lib Journal no. 4 (Sept. 22, 2008), http://journal.code4lib.org/articles/109 (accessed Apr. 21, 2010); Roy Fielding, “Architectural Styles and the Design of Network-Based Software Architectures” (PhD diss., University of California, Irvine, 2000); J. C. Gregorio, ed., “The Atom Pub- lishing Protocol,” memo, The Internet Engineering Task Force, Oct. 2007, http://bitworking.org/projects/atom/rfc5023.html (accessed June 18, 2009). 14. World Wide Web Consortium, “Cross-Origin Resource Sharing: W3C Working Draft 17 March 2009,” http://www .w3.org/TR/access-control/ (accessed June 18, 2009). 15. OCLC Online Computer Library Center, “Worldcat and Cataloging Documentation,” http://www.oclc.org/support/ documentation/worldcat/default.htm (accessed June 18, 2009). 16. F. Curbera et al., “Unraveling the Web Services Web: An Introduction to SOAP, WSDL, and UDDI,” IEEE Internet Comput- ing 6, no. 2 (2002): 86–93. 17. OCLC Online Computer Library Center, “OCLC Web Services,” http://www.worldcat.org/devnet/wiki/Services (accessed June 18, 2009); International Federation of Library Asso- ciations and Institutions Study Group on the Functional Require- ments for Bibliographic Records, “Functional Requirements for Bibliographic Records : Final Report,” http://www.ifla.org/files/ cataloguing/frbr/frbr_2008.pdf (accessed Mar. 31, 2010). 18. Yahoo!, “The Yahoo! User Interface Library (YUI),” http://developer.yahoo.com/yui/ (accessed June 18, 2009); Dojo Foundation, “Dojo—The JavaScript Toolkit,” http://www .dojotoolkit.org/ (accessed June 18, 2009). 19. Google, “Gadgets.* API Developer’s Guide,” http://code. 
google.com/apis/gadgets/docs/dev_guide.html (accessed June 18, 2009). 20. Daniel Chudnov, “COinS for the Link Trail,” Library Jour- nal 131 (2006): 8–10. 21. LibraryThing, “LibraryThing,” http://www.librarything .com/widget.php (accessed June 19, 2009). 22. Robert Wallis, “Juice—JavaScript User Interface Compo- nentised Extensions,” http://code.google.com/p/juice-project/ (accessed June 18, 2009). 23. Jeffrey Wong and Jason Hong, “Making Mashups with Marmite: Towards End-User Programming for the Web” Confer- ence on Human Factors in Computing Systems, San Jose, California, April 28–May 3, 2007: Conference Proceedings, Volume 2 (New York: Association for Computing Machinery, 2007): 1435–44; Guiling Wang, Shaohua Yang, and Yanbo Han, “Mashroom: End-User Mashup Programming Using Nested Tables” (paper presented at the International World Wide Web Conference, Madrid, Spain, 2009): 861–70; Nan Zang, “Mashups for the Web-Active User” (paper presented at the IEEE Symposium on Visual Languages and Human-Centric Computing, Herrshing am Ammersee, Germany, 2008): 276–77. 3147 ---- weB services ANd widGets For liBrArY iNFormAtioN sYstems | HAN 87oN tHe clouds: A New wAY oF comPutiNG | HAN 87 shape cloud computing. For exam- ple, Sun’s well-known slogan “the network is the computer” was estab- lished in late 1980s. Salesforce.com has been providing on-demand Software as a Service (SaaS) for cus- tomers since 1999. IBM and Microsoft started to deliver Web services in the early 2000s. Microsoft’s Azure service provides an operating sys- tem and a set of developer tools and services. Google’s popular Google Docs software provides Web-based word-processing, spreadsheet, and presentation applications. Google App Engine allows system devel- opers to run their Python/Java applications on Google’s infrastruc- ture. Sun provides $1 per CPU hour. Amazon is well-known for provid- ing Web services such as EC2 and S3. Yahoo! announced that it would use the Apache Hadoop frame- work to allow users to work with thousands of nodes and petabytes (1 million gigabytes) of data. These examples demonstrate that cloud computing providers are offer- ing services on every level, from hardware (e.g., Amazon and Sun), to operating systems (e.g., Google and Microsoft), to software and ser- vice (e.g., Google, Microsoft, and Yahoo!). Cloud-computing provid- ers target a variety of end users, from software developers to the general public. For additional infor- mation regarding cloud computing models, the University of California (UC) Berkeley’s report provides a good comparison of these models by Amazon, Microsoft, and Google.4 As cloud computing providers lower prices and IT advancements remove technology barriers—such as virtualization and network band- width—cloud computing has moved into the mainstream.5 Gartner stated, “Organizations are switching from factors related to cloud computing: infinite computing resources avail- able on demand, removing the need to plan ahead; the removal of an up-front costly investment, allowing companies to start small and increase resources when needed; and a system that is pay-for-use on a short-term basis and releases customers when needed (e.g., CPU by hour, storage by day).2 National Institute of Standards and Technology (NIST) currently defines cloud computing as “a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. 
network, servers, storage, appli- cations, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”3 As there are several definitions for “utility computing” and “cloud computing,” the author does not intend to suggest a better definition, but rather to list the characteristics of cloud computing. The term “cloud computing” means that ■■ customers do not own network resources, such as hardware, software, systems, or services; ■■ network resources are provided through remote data centers on a subscription basis; and ■■ network resources are delivered as services over the Web. This article discusses using cloud computing on an IT-infrastructure level, including building virtual server nodes and running a library’s essen- tial computer systems in remote data centers by paying a fee instead of run- ning them on-site. The article reviews current cloud computing services, presents the author’s experience, and discusses advantages and disadvan- tages of using the new approach. All kinds of clouds Major IT companies have spent bil- lions of dollars since the 1990s to On the Clouds: A New Way of Computing This article introduces cloud computing and discusses the author’s experience “on the clouds.” The author reviews cloud computing services and providers, then presents his experience of running mul- tiple systems (e.g., integrated library sys- tems, content management systems, and repository software). He evaluates costs, discusses advantages, and addresses some issues about cloud computing. Cloud com- puting fundamentally changes the ways institutions and companies manage their computing needs. Libraries can take advan- tage of cloud computing to start an IT project with low cost, to manage computing resources cost-effectively, and to explore new computing possibilities. S cholarly communication and new ways of teaching provide an opportunity for academic institutions to collaborate on pro- viding access to scholarly materials and research data. There is a grow- ing need to handle large amounts of data using computer algorithms that presents challenges to libraries with limited experience in handling nontextual materials. Because of the current economic crisis, aca- demic institutions need to find ways to acquire and manage computing resources in a cost-effective manner. One of the hottest topics in IT is cloud computing. Cloud computing is not new to many of us because we have been using some of its services, such as Google Docs, for years. In his latest book, The Big Switch: Rewiring the World, from Edison to Google, Carr argues that computing will go the way of electricity: purchase when needed, which he calls “utility computing.” His examples include Amazon’s EC2 (Elastic Computing Cloud), and S3 (Simple Storage) services.1 Amazon’s chief technol- ogy officer proposed the following Yan HanTutorial Yan Han (hany@u.library.arizona.edu) is Associate Librarian, University of Arizona Libraries, Tucson. 88 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 201088 iNFormAtioN tecHNoloGY ANd liBrAries | JuNe 2010 company-owner hardware and software to per-use service-based models.”6 For example, the U.S. gov- ernment website (http://www.usa .gov/) will soon begin using cloud computing.7 The New York Times used Amazon’s EC2 and S3 services as well as a Hadoop application to pro- vide open access to public domain articles from 1851 to 1922. 
The Times loaded 4 TB of raw TIFF images and their derivative 11 million PDFs into Amazon’s S3 in twenty-four hours at very reasonable cost.8 This project is very similar to digital library proj- ects run by academic libraries. OCLC announced its movement of library management services to the Web.9 It is clear that OCLC is going to deliver a Web-based integrated library sys- tem (ILS) to provide a new way of running an ILS. DuraSpace, a joint organization by Fedora Commons and DSpace Foundation, announced that they would be taking advan- tage of cloud storage and cloud computing.10 On the clouds Computing needs in academic librar- ies can be placed into two categories: user computing needs and library goals. User computing needs Academic libraries usually run hun- dreds of PCs for students and staff to fulfill their individual needs (e.g., Microsoft Office, browsers, and image-, audio-, and video-processing applications). Library goals A variety of library systems are used to achieve libraries’ goals to sup- port research, learning, and teaching. These systems include the following: ■■ Library website: The website may be built on simple HTML web- pages or a content management system such as Drupal, Joomla, or any home-grown PHP, Perl, ASP, or JSP system. ■■ ILS: This system provides tra- ditional core library work such as cataloging, acquisition, reporting, accounting, and user management. Typical systems include Innovative Interfaces, SirsiDynix, Voyager, and open- source software such as Koha. ■■ Repository system: This sys- tem provides submission and access to the institution’s digi- tal collections and scholarship. Typical systems include DSpace, Fedora, EPrints, ContentDm, and Greenstone. ■■ Other systems: for example, fed- erated search systems, learning object management systems, interlibrary loan (ILL) systems, and reference tracking systems. ■■ Public and private storage: staff file-sharing, digitization, and backup. Due to differences in end users and functionality, most systems do not use computing resources equally. For example, the ILS is input and output intensive and database query intensive, while repository systems require storage ranging from a few gigabytes to dozens of terabytes and substantial network bandwidth. Cloud computing brings a funda- mental shift in computing. It changes the way organizations acquire, configure, manage, and maintain computing resources to achieve their business goals. The availability of cloud computing providers allows organizations to focus on their busi- ness and leave general computing maintenance to the major IT compa- nies. In the fall of 2008, the author started to research cloud computing providers and how he could imple- ment cloud computing for some library systems to save staff and equipment costs. In January 2009, the author started his plan to build library systems “on the clouds.” The University of Arizona Libraries (UAL) has been a key player in the process of rebuilding higher education in Afghanistan since 2001. UAL Librarian Atifa Rawan and the author have received multiple grant contracts to build technical infra- structures for Afghanistan’s academic libraries. The technical infrastructure includes the following: ■■ Afghanistan ILS: a bilingual ILS based on the open-source system Koha.11 ■■ Afghanistan Digital Libraries website (http://www.afghan digitallibraries.org/): originally built on simple HTML pages, later rebuilt in 2008 using the con- tent management system Joomla. ■■ A digitization management sys- tem. 
The author has also developed a Japanese ILL system (http://gifproject.libraryfinder.org) for the North American Coordinating Council on Japanese Library Resources. These systems had been running on UAL's internal technical infrastructure. They run in a complex computing environment, require different modules, and do not use computing resources equally. For example, the Afghan ILS runs on Linux, Apache, MySQL, and Perl, and its OPAC and staff interface run on two different ports. The Afghanistan Digital Libraries website requires Linux, Apache, MySQL, and PHP. The Japanese ILL system was written in Java and runs on Tomcat.

There are several reasons why the author moved these systems to the new cloud computing infrastructure:

■ These systems need to be accessed in a system mode by people who are not UAL employees.
■ System rebooting time can be substantial in this infrastructure because of server setup and IT policy.
■ The current on-site server has reached its life expectancy and requires a replacement.

By analyzing the complex needs of the different systems and considering how to use resources more effectively, the author decided to run all the systems through one cloud computing provider. After comparing features and costs, Linode (http://www.linode.com/) was chosen because it provides full SSH and root access using virtualization, four data centers in geographically diverse areas, high availability and clustering support, and an option for month-to-month contracts. In addition, other customers have provided positive reviews. In January 2009, the author purchased one node located in Fremont, California, for $19.95 per month. An implementation plan (see appendix) was drafted to complete the project in phases. The author owns a virtual server and has access to everything that a physical server provides. In addition, the provider and the user community provided timely help and technical support.

The migration of systems was straightforward: a Linux kernel (Debian 4.0) was installed within an hour; domain registration was complete and the domains went active in twenty-four hours; the Afghanistan Digital Libraries website (based on Joomla) migration was complete within a week; and all supporting tools and libraries (e.g., MySQL, Tomcat, and Java SDK) were installed and configured within a few days. A month later, the Afghanistan ILS (based on Koha) migration was completed. The ILL system was also migrated without problem. Tests have been performed on all these systems to verify their usability. In summary, the migration of systems was very successful and did not encounter any barriers. It addressed the issues facing us: after the migration, SSH log-ins for users who are not university employees were set up quickly; systems maintenance is managed by the author's team, and rebooting now takes only about one minute; and there is no need to buy a new server and put it in a temperature- and security-controlled environment. The hardware is maintained by the provider. The administrative GUI for the Linux nodes is shown in figure 1. Since migration, no downtime because of hardware or other failures caused by the provider has been observed. After migrating all the systems successfully and running them in a reliable mode for a few months, the second phase was implemented (see appendix).
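The article does not include any scripts, but a short sketch may help make this verification step concrete. The following minimal Python example is an illustration only, not the author's actual test procedure: the hostnames and ports are hypothetical stand-ins for the services described above (the Joomla site, the Koha OPAC and staff interface on their separate ports, and the Tomcat-based ILL application). It simply confirms that each migrated web application answers an HTTP request, which is also the kind of check that the Nagios monitoring described below performs on a continuing basis.

```python
#!/usr/bin/env python3
"""Illustrative post-migration availability check (not from the original article).

The hostnames, ports, and paths below are hypothetical placeholders for the
kinds of services described in the text: a Joomla website, a Koha OPAC and
staff interface running on separate ports, and a Tomcat-based ILL application.
"""
import urllib.error
import urllib.request

# Hypothetical endpoints; substitute the real hostnames and ports in use.
SERVICES = {
    "Joomla website": "http://www.example-digitallibraries.org/",
    "Koha OPAC": "http://ils.example.org/",
    "Koha staff interface": "http://ils.example.org:8080/",
    "ILL application (Tomcat)": "http://ill.example.org:8180/",
}


def check(name, url, timeout=10):
    """Return True if the service answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except (urllib.error.URLError, OSError) as exc:
        print("FAIL  {}: {}".format(name, exc))
        return False
    ok = status < 400
    print("{}  {}: HTTP {}".format("OK  " if ok else "WARN", name, status))
    return ok


if __name__ == "__main__":
    results = [check(name, url) for name, url in SERVICES.items()]
    print("{} of {} services reachable".format(sum(results), len(results)))
```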
Another Linux node (located in Atlanta, Georgia) was purchased for backup and monitoring (see figure 2). Nagios, an open-source monitoring system, was tested and configured to identify and report problems for the above library systems. Nagios provides the following functions: (1) monitoring of critical computing components, such as the network, systems, services, and servers; (2) timely alerts delivered via e-mail or cell phone; and (3) reporting and logging of outages, events, and alerts. A backup script is also run as a prescheduled job to back up the systems on a regular basis.

Figure 1. Linux node administration Web interface

Figure 2. Two Linux nodes located in two remote data centers: Node 1 (64.62.xxx.xxx, Fremont, CA) hosts the Afghan Digital Libraries website, the Afghan ILS, the interlibrary loan system, and DSpace; Node 2 (74.207.xxx.xxx, Atlanta, GA) runs Nagios and backup.

Findings and discussions

Since January 2009, all the systems have been migrated and have been running without any issues caused by the provider. The author is very satisfied with the outcomes and cost. The annual cost of running two nodes is $480 per year, compared to at least $4,000 if the hardware had been run in the library.12 From the author's experience, cloud computing provides the following advantages over the traditional way of computing in academic institutions:

■ Cost-effectiveness: From the above example and the literature review, it is obvious that using cloud computing to run applications, systems, and IT infrastructure saves staff and financial resources. UC Berkeley's report and Zawodny's blog provide a detailed analysis of costs for CPU hours and disk storage.13
■ Flexibility: Cloud computing allows organizations to start a project quickly without worrying about up-front costs. Computing resources such as disk storage, CPU, and RAM can be added when needed. In this case, the author started on a small scale by purchasing one node and added additional resources later.
■ Data safety: Organizations are able to purchase storage in data centers located thousands of miles away, increasing data safety in case of natural disasters or other factors. This strategy is very difficult to achieve with traditional off-site backup.
■ High availability: Cloud computing providers such as Microsoft, Google, and Amazon have better resources to provide more up-time than almost any other organization or company.
■ The ability to handle large amounts of data: Cloud computing has a pay-for-use business model that allows academic institutions to analyze terabytes of data using distributed computing over hundreds of computers at short-term cost.

On-demand data storage, high availability, and data safety are critical features for academic libraries.14 However, readers should be aware of some technical and business issues:

■ Availability of a service: In several widely reported cases, Amazon's S3 and Google Gmail were inaccessible for several hours in 2008. The author believes that the commercial providers have better technical and financial resources to maintain more up-time than most academic institutions. For those wanting no single point of failure (e.g., a provider goes out of business), the author suggests storing duplicate data with a different provider or locally.
■ Data confidentiality: Most academic libraries have open-access data.
This issue can be solved by encrypting data before moving it to the clouds. In addition, licensing terms can be negotiated with providers regarding data safety and confidentiality.
■ Data transfer bottlenecks: Accessing digital collections requires considerable network bandwidth, and digital collections are usually optimized for customer access. Moving huge amounts of data (e.g., preservation digital images, audio, video, and data sets) to data centers can be scheduled during off hours (e.g., 1–5 a.m.), or data can be shipped on hard disks to the data centers.
■ Legal jurisdiction: Legal jurisdiction creates complex issues for both providers and end users. For example, Canadian privacy laws regulate data privacy in the public and private sectors. In 2008, the Office of the Privacy Commissioner of Canada released a finding that "outsourcing of canada.com email services to U.S.-based firm raises questions for subscribers," and expressed concerns about public sector privacy protection.15 This brings concerns to both providers and end users, and it has been suggested that privacy issues will be very challenging.16

Summary

The author introduces cloud computing services and providers and presents his experience of running multiple systems, such as an ILS, content management systems, repository software, and other systems, "on the clouds" since January 2009. Using cloud computing brings significant cost savings and flexibility. However, readers should be aware of technical and business issues. The author is very satisfied with his experience of moving library systems to cloud computing. His experience demonstrates a new way of managing critical computing resources in an academic library setting. The next steps include using cloud computing to meet digital collections' storage needs.

Cloud computing brings fundamental changes to how organizations manage their computing needs. As major organizations in the library field, such as OCLC, start to take advantage of cloud computing, the author believes that cloud computing will play an important role in library IT.

Acknowledgments

The author thanks USAID and Washington State University for providing financial support. The author also thanks Matthew Cleveland for his excellent work "on the clouds."

References

1. Nicholas Carr, The Big Switch: Rewiring the World, from Edison to Google (London: Norton, 2008).
2. Werner Vogels, "A Head in the Clouds—The Power of Infrastructure as a Service" (paper presented at the Cloud Computing and Its Applications conference (CCA '08), Chicago, Oct. 22–23, 2008).
3. Peter Mell and Tim Grance, "Draft NIST Working Definition of Cloud Computing," National Institute of Standards and Technology (May 11, 2009), http://csrc.nist.gov/groups/SNS/cloud-computing/index.html (accessed July 22, 2009).
4. Michael Armbrust et al., "Above the Clouds: A Berkeley View of Cloud Computing," technical report, University of California, Berkeley, EECS Department, Feb. 10, 2009, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html (accessed July 1, 2009).
5. Eric Hand, "Head in the Clouds: 'Cloud Computing' Is Being Pitched as a New Nirvana for Scientists Drowning in Data. But Can It Deliver?" Nature 449, no. 7165 (2007): 963; Geoffrey Fowler and Ben Worthen, "The Internet Industry Is On a Cloud—Whatever That May Mean," Wall Street Journal, Mar.
26, 2009, http://online.wsj.com/article/SB123802623665542725.html (accessed July 14, 2009); Stephen Baker, "Google and the Wisdom of the Clouds," Business Week (Dec. 14, 2007), http://www.msnbc.msn.com/id/22261846/ (accessed July 8, 2009).
6. Gartner, "Gartner Says Worldwide IT Spending on Pace to Surpass $3.4 Trillion in 2008," press release, Aug. 18, 2008, http://www.gartner.com/it/page.jsp?id=742913 (accessed July 7, 2009).
7. Wyatt Kash, "USA.gov, GobiernoUSA.gov Move into the Internet Cloud," Government Computer News, Feb. 23, 2009, http://gcn.com/articles/2009/02/23/gsa-sites-to-move-to-the-cloud.aspx?s=gcndaily_240209 (accessed July 14, 2009).
8. Derek Gottfrid, "Self-Service, Prorated Super Computing Fun!" online posting, New York Times Open, Nov. 1, 2007, http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/?scp=1&sq=self%20service%20prorated&st=cse (accessed July 8, 2009).
9. OCLC Online Computer Library Center, "OCLC Announces Strategy to Move Library Management Services to Web Scale," press release, Apr. 23, 2009, http://www.oclc.org/us/en/news/releases/200927.htm (accessed July 5, 2009).
10. DuraSpace, "Fedora Commons and DSpace Foundation Join Together to Create DuraSpace Organization," press release, May 12, 2009, http://duraspace.org/documents/pressrelease.pdf (accessed July 8, 2009).
11. Yan Han and Atifa Rawan, "Afghanistan Digital Library Initiative: Revitalizing an Integrated Library System," Information Technology & Libraries 26, no. 4 (2007): 44–46.
12. Fowler and Worthen, "The Internet Industry Is On a Cloud."
13. Jeremy Zawodny, "Replacing My Home Backup Server with Amazon's S3," online posting, Jeremy Zawodny's Blog, Oct. 3, 2006, http://jeremy.zawodny.com/blog/archives/007624.html (accessed June 19, 2009).
14. Yan Han, "An Integrated High Availability Computing Platform," The Electronic Library 23, no. 6 (2005): 632–40.
15. Office of the Privacy Commissioner of Canada, "Tabling of Privacy Commissioner of Canada's 2005–06 Annual Report on the Privacy Act: Commissioner Expresses Concerns about Public Sector Privacy Protection," press release, June 20, 2006, http://www.priv.gc.ca/media/nr-c/2006/nr-c_060620_e.cfm (accessed July 14, 2009); Office of the Privacy Commissioner of Canada, "Findings under the Personal Information Protection and Electronic Documents Act (PIPEDA)" (Sept. 19, 2008), http://www.priv.gc.ca/cf-dc/2008/394_20080807_e.cfm (accessed July 14, 2009).
16. Stephen Baker, "Google and the Wisdom of the Clouds," Business Week (Dec. 14, 2007), http://www.msnbc.msn.com/id/22261846/ (accessed July 8, 2009).

Appendix. Project Plan: Building an HA Linux Platform Using Cloud Computing

Project Manager:
Project Members:
Objective Statement: To build a High Availability (HA) Linux platform to support multiple systems using cloud computing in six months.
Scope: The project members should identify cloud computing providers, evaluate the costs, and build a Linux platform for computer systems, including the Afghan ILS, the Afghanistan Digital Libraries website, a repository system, the Japanese interlibrary loan website, and a digitization management system.
Resources:
Project Deliverable: January 1, 2009—July 1, 2009

Phase I
■ To build a stable and reliable Linux platform to support multiple Web applications.
The platform needs to consider reliability and high availability in a cost-effective manner.
■ To install needed libraries for the environment
■ To migrate the ILS (Koha) to this Linux platform
■ To migrate the Afghan Digital Libraries website (Joomla) to this platform
■ To migrate the Japanese interlibrary loan website
■ To migrate the digitization management system

Phase II
■ To research and implement a monitoring tool to monitor all Web applications as well as OS-level tools (e.g., Tomcat, MySQL)
■ To configure a cron job to run routine tasks (e.g., backup)
■ To research and implement storage (TB) for digitization and access

Phase III
■ To research and build Linux clustering

Steps:
1. OS installation: Debian 4
2. Platform environment: register DNS
3. Install Java 6, Tomcat 6, MySQL 5, etc.
4. Install source control environment (Git)
5. Install statistics analysis tool (Google Analytics)
6. Install monitoring tool: Ganglia or Nagios
7. Web applications
8. Joomla
9. Koha
10. Monitoring tool
11. Digitization management system
12. Repository system: DSpace, Fedora, etc.
13. HA tools/applications

Note. The cost calculation is based on the following:
■ Leasing two nodes at $20/month: $20 x 2 nodes x 12 months = $480/year
■ A medium-priced server with backup and a life expectancy of 5 years ($5,000): $1,000/year
■ 5 percent of a system administrator's time for managing the server ($60,000 annual salary): $3,000/year
■ Telecommunication, utility, and space costs are ignored.
■ The software developer's time is ignored because it is equal for both options.

3148 ----

From Our Readers: The New User Environment: The End of Technical Services?
Bradford Lee Eden

Editor's Note: "From Our Readers" is an occasional feature highlighting ITAL readers' letters and commentaries on timely issues.

Technical Services: an obsolete term used to describe the largest component of most library staffs in the twentieth century. That component of the staff was entirely devoted to arcane and mysterious processes involved in selecting, acquiring, cataloging, processing, and otherwise making available to library users physical material containing information content pieces (incops). The processes were complicated, expensive, and time-consuming, and generally served to severely limit direct service to users both by producing records that were difficult to understand and interpret, even by other library staff, and by consuming from 75–80 percent of the library's financial and personnel resources. In the twenty-first century, the advent of new forms of publication and new techniques for providing universal records and universal access to information content made the organizational structure obsolete. That change in organizational structure, more than any other single factor, is generally credited as being responsible for the dramatic improvement in the quality of library service that has occurred in the first decade of the twenty-first century.

There are many who would say that I was the one who wrote this quotation. I didn't, and it is, in fact, more than twenty-five years old!1 While I was beginning to research and prepare for this article, I began as most users today start their search for information: I started with Google. Granted, I rarely go beyond the first page of results (as most user surveys indicate), but the paucity of links made me click to the next screen. There, at number 16, was a scanned article. Jackpot!
I thought as I started perusing the contents of this resource online, thinking to myself how the future had changed so dramatically since 1984, with the emergence of the Internet and the laptop, all of the new information formats, and the digitization of information. Ahh, the power of full text! After reading through the table of contents, introduction, and the first chapter, I noticed that some of the pages were missing. Mmmm, obviously some very shoddy scanning on the part of Google. But no, I finally realized that only part of this special issue was available on Google. Obviously, I missed the statement at the bottom of the front scan of the book: "This is a preview. The total pages displayed will be limited. Learn more." And thus the issues regarding copyright reared their ugly head.

When discussing the new user environment, there are many demands facing libraries today. In a report citing the principle of least effort, first attributed to philologist George Zipf, and quoted in the Calhoun report to the Library of Congress, Marcia Bates states:

People do not just use information that is easy to find; they even use information that they know to be of poor quality and less reliable—so long as it requires little effort to find—rather than using information they know to be of high quality and reliable, though harder to find . . . despite heroic efforts on the part of librarians, students seldom have sufficiently sustained exposure to and practice with library skills to reach the point where they feel real ease with and mastery of library information systems.2

According to the final report of the Bibliographic Services Task Force of the University of California Libraries, users expect the following:

■ one system or search to cover a wide information universe (e.g., Google or Amazon)
■ enriched metadata (e.g., ONIX, tables of contents, and cover art)
■ full-text availability
■ to move easily and seamlessly from a citation about an item to the item itself—discovery alone is not enough
■ systems to provide a lot of intelligent assistance
  ❏ correction of obvious spelling errors
  ❏ results sorting in order of relevance to their queries
  ❏ help in navigating large retrievals through logical subsetting or topical maps or hierarchies
  ❏ help in selecting the best source through relevance ranking or added commentary from peers and experts or "others who used this also used that" tools
  ❏ customization and personalization services
■ authenticated single sign-on
■ security and privacy
■ communication and collaboration
■ multiple formats available: e-books, MPEG, JPEG, RSS and other push technologies, along with traditional, tangible formats
■ direct links to e-mail, instant messaging, and sharing
■ access to online virtual communities
■ access to what the library has to offer without actually having to visit the library3

Bradford Lee Eden (eden@library.ucsb.edu) is Associate University Librarian for Technical Services & Scholarly Communication, University of California, Santa Barbara.

What is there in this new user environment for those who work in technical services? As indicated in the opening quote, would a dramatic improvement in library services occur if technical services were removed from the organizational structure?
Even in 1983, the huge financial investment that libraries made in the organization and description of information, inventory, workflows, and personnel was recognized; today, that investment comes under intense scrutiny as libraries realize that we no longer have a monopoly on information access, and that to survive we need to move forward more aggressively into the digital environment than ever before. As Marcum stated in her now-famous article,

■ If the commonly available books and journals are accessible online, should we consider the search engines the primary means of access to them?
■ Massive digitization radically changes the nature of local libraries. Does it make sense to devote local efforts to the cataloging of unique materials only rather than the regular books and journals?
■ We have introduced our cataloging rules and the MARC format to libraries all over the world. How do we make massive changes without creating chaos?
■ And finally, a more specific question: Should we proceed with AACR3 in light of a much-changed environment?4

There are larger internal issues to consider here as well. The budget situation in libraries requires the application of business models to workflows that have normally not been questioned or challenged. Karen Calhoun discusses this topic in a number of her contributions to the literature:

When catalog librarians identify what they contribute to their communities with their methods (the cataloging rules, etc.) and with the product they provide (the catalog), they face the danger of "marketing myopia." Marketing myopia is a term used in the business literature to describe a nearsighted view that focuses on the products and services that a firm provides, rather than the needs those products and services are intended to address.5

For understanding the implementation issues associated with the leadership strategy, it is important to be clear about what is meant by the "excess capacity" of catalogs. Most catalogers would deny there is excess capacity in today's cataloging departments, and they are correct. Library materials continue to flood into acquisitions and cataloging departments and staff can barely keep up. Yet the key problem of today's online catalog is the effect of declining demand. In healthy businesses, the demand for a product and the capacity to produce it are in balance. Research libraries invest huge sums in the infrastructure that produces their local catalogs, but search engines are students' and scholars' favorite place to begin a search. More users bypass catalogs for search engines, but research libraries' investment in catalogs—and in the collections they describe—does not reflect the shift in user demand.6

I have discussed this exact problem in recent articles and technical reports as well.7 There have to be better, more efficient ways for libraries to organize and describe information, not based on the status quo of redundant "localizing" of bibliographic records. A good analogy would be the current price of gas and the looming transportation crisis. For many years, Americans have had the luxury of being able to purchase just about any type of car, truck, SUV, Hummer, etc., that they wanted on the basis of their own preferences, personalities, and incomes, not on the size of the gas tank or on the mileage per gallon. Why not buy a Mercedes over a Kia?
But with gas prices now well above the average person's ability to consistently fill their gas tank without mortgaging their future, the market demands that people find alternative solutions in order to survive. This has meant moving away from the status quo of personal choice and selection toward a more economic and sustainable model of informed, fuel-efficient transportation, so much so that public transportation is now inundated with more users than it can handle, and consumers have all but abandoned the truck and SUV markets. Libraries have long worked in the Mercedes arena, providing features such as authority control, subject classification, and redundant localizing of bibliographic records that were essential when libraries held the monopoly on information access but are no longer cost-efficient—nor even sane—strategies in the current information marketplace. Users are not accessing the OPAC anymore; well-known studies indicate that more than 80 percent of information seekers begin their search on a Web search engine. Libraries are investing huge resources in staffing and priorities fiddling with MARC bibliographic records at a time when they are struggling to survive and adapt from a monopoly environment to being just one of many players in the new information marketplace. Budgets are stagnant, staffing is at an all-time low, new information formats continue to appear and require attention, and users are no longer patient nor comfortable working with our clunky OPACs.8 Why do libraries continue to support an infrastructure of buying and offering the same books, CDs, DVDs, journals, etc., at every library, when the new information environment offers libraries the opportunity to showcase and present their unique information resources and one-of-a-kind collections to the world? Special collections materials held by every major research and public library in the world can now be digitized, and sparse library resources need to be adjusted to compete and offer these unique collections and their services to our users and the world.

The October 2007 issue of Computers in Libraries is devoted solely to articles related to the enhancement, usability, appropriateness, and demise of the library OPAC. Interesting articles include "Fac-Back-OPAC: An Open Source Solution Interface to Your Library System," "Dreaming of a Better ILS," "Plug Your Users into Library Resources with OpenSearch Plug-Ins," "Delivering What People Need, When and Where They Need It," "The Birth of a New Generation of Library Interfaces," and "Will the ILS Soon Be as Obsolete as the Card Catalog?" An especially interesting quote is given by Cervone, then assistant university librarian for information technology at Northwestern University:

What I'd like to see is for the catalog to go away. To a great degree, it is an anachronism. What we need from the ILS is a solid, business-process back end that would facilitate the functions of the library that are truly unique such as circulation, acquiring materials, and "cataloging" at the item level for what amounts to inventory-control purposes. Most of the other traditional ILS functions could be rolled over into a centralized system, like OCLC, that would be cooperatively shared. The catalog itself should be treated as just another database in the world of resources we have access to.
A single interface to those resources that would combine our local print holdings, electronic text (both journal and ebook), as well as multimedia material is what we should be demanding from our vendors.9

One book that needs to be required reading for all librarians, especially catalogers, is Weinberger's Everything Is Miscellaneous.10 He describes the three orders of order (self-organization, metadata, and digital); provides an extensive history of how Western civilization has ordered information, specifically the links to nineteenth-century Victorianism; and discusses the concepts of lumping and splitting. In the end, Weinberger argues that the digital environment allows users to manipulate information into their own organization system, disregarding all previous organizational attempts by supposed experts using outdated and outmoded systems. In the digital disorder of information, an object (leaf) can now be placed on many shelves (branches), figuratively speaking, and this new shape of knowledge brings out four strategic principles:

1. Filter on the way out, not on the way in.
2. Put each leaf on as many branches as possible.
3. Everything is metadata and everything can be a label.
4. Give up control.

It is this last principle that libraries have challenges with. Whether we agree with this principle or not, it has already happened. Arguing about it, ignoring it, or just continuing to do business as usual isn't going to change the fact that information is user-controlled and user-initiated in the digital environment. So, where do we go from here?

The future of technical services (and its staff)

Far be it from me to try to predict the future of libraries as viable, and more importantly marketable, information organizations in this new environment. One has only to examine the quotations from the first issues of Technical Services Quarterly to see what happens to predictions and opinions. Titles of some of the contributions (from 1983, mind you) are worthy of mention: "Library Automation in the Year 2000," "Musings on the Future of the Catalog," and "Libraries on the Line." There are developments, however, that require reexamination and strategic brainstorming regarding the future of library bibliographic organization and description.

The appearance of WorldCat Local will have a tremendous impact on the disappearance of proprietary vendor OPACs. There will no longer be a need for an integrated library system (ILS); with WorldCat Local, the majority of the world's MARC bibliographic records are available in a Library 2.0 format. The only things missing are some type of inventory and acquisitions module that can be formatted locally and a circulation module. If OCLC could focus its programming efforts on these two services and integrate them into WorldCat Local, library administrators and systems staff would no longer have to deal with proprietary and clunky OPACs (and their huge budgetary lines), but could use the power of Web 2.0 (and hopefully 3.0) tools and services to better position themselves in the new information marketplace.

Another major development is the Google digitization project (and other associated ventures). While there are some concerns about quality and copyright,11 as well as issues related to the disappearance of print and the time involved to digitize all print,12 no one can deny the gradual and inevitable effect that mass digitization of print resources will have in the new information marketplace.
Just the fact that my research explorations for this article brought up digitized portions of the 1983 Technical Services Quarterly articles is an example. More and more, published print information will be available in full-text online. What effect will this have on the physical collection that all libraries maintain, not only in terms of circulation, but also in terms of use of space, preservation, and collection development? No one knows for sure, but if the search strategies and information discovery patterns of our users are any indication, then we need to be strategically preparing and developing directions and options.

Automatic metadata generation has been a topic of discussion for a number of years, and Jane Greenberg's work at the University of North Carolina–Chapel Hill is one of the leading examples of research in this area.13 While there are still viable concerns about metadata generation without any type of human intervention, semiautomatic and even nonlibrary-facilitated metadata generation has been successful in a number of venues. As libraries grapple with decreased budgets, multiplying formats, fewer staff to do the work, and more retraining and professional development of existing staff, library administrators have to examine all options to maximize personnel as well as budgetary resources. Incorporating new technologies and tools for generating metadata without human intervention into library workflows should be viewed as a viable option. User tagging would be included in this area. Even Intner, a long-time proponent of traditional technical services, has written that generating cataloging data automatically would be of great benefit to the profession, and that more tools and more programming ought to be focused toward this goal.14

So, with print workflows being replaced by digital and electronic workflows, how can administrators assist their technical services staff to remain viable in this new information environment? How can technical services staff not only help themselves but also help their supervisors and administrators to incorporate their unique talents, expertise, education, and experience toward the type of future scenarios indicated above?

Competencies and challenges for technical services staff

There are some good opinions available for assisting technical services staff with moving into the new environment. Names have power, whether we like to admit it or not, and changing the name from "Technical Services" to something more understandable to our users, let alone our colleagues within the library, is one way to start. Names such as "Collections and Data Management Services" or "Reference Data Services" have been mentioned.15 An interesting quote sums up the dilemma:

It's pretty clear that technical services departments have long been the ugly ducklings in the library pond, trumped by a quintet of swans: reference departments (the ones with answers for a grateful public); IT departments (the magicians who keep the computers humming); children's and youth departments (the warm and fuzzy nurturers); other specialty departments (the experts in good reads, music, art, law, business, medicine, government documents, AV, rare books and manuscripts, you-name-it); and administrative groups (the big bosses). Part of the trouble is that the rest of our colleagues don't really know what technical services librarians do.
They only know that we do it behind closed doors and talk about it in language no one else understands. If it can't be seen, can't be understood, and can't be discussed, maybe it's all smoke and mirrors, lacking real substance. It's easy to ignore.16

Ruschoff mentions competencies for technical services librarians in the new information environment: being comfortable working in both the print and digital worlds; specialized skills such as foreign languages and subject-area expertise; comfort with digital and Web-based technologies (suggesting more computing and technology skills); expertise in digital asset management; and problem-solving and analytical skills.17 In a recent blog posting summarizing a presentation at the 2008 ALA Annual Conference on this topic, comparisons between catalogers going extinct or retooling are provided. The following is a summary of that post:

Converging trends
■ More catalogers work at the support-staff level than as professional librarians.
■ More cataloging records are selected by machines.
■ More catalog records are being captured from publisher data or other sources.
■ More updating of catalog records is done via batch processes.
■ Libraries continue to deemphasize processing of secondary research products in favor of unique primary materials.

What are our choices?
■ Behind door number one—the extinction model.
■ Behind door number two—the retooling model.

How it's done
■ Extinction
  ❏ Keep cranking about how nobody appreciates us.
  ❏ Assert over and over that we're already doing everything right—why should we change?
  ❏ Adopt a "chicken little" approach to envisioning the future.
■ Retooling
  ❏ Consider what catalogers already do.
  ❏ Look for support.
  ❏ Find a new job.

What catalogers do
■ Operate within the boundaries of detailed standards.
■ Describe items one at a time.
■ Treat items as if they are intended to fit carefully within a specific application—the catalog.
■ Ignore the rest of the world of information.

What metadata librarians do
■ Think about descriptive data without preconceptions around descriptive level, granularity, or descriptive vocabularies.
■ Consider the entirety of the discovery and access issues around a set or collection of materials.
■ Consider users and uses beyond an individual service when making design decisions—not necessarily predetermined.
■ Leap tall buildings in a single bound.

What new metadata librarians do
■ Be aware of changing user needs.
■ Understand the evolving information environment.
■ Work collaboratively with technical staff.
■ Be familiar with all metadata formats and with encoding metadata.
■ Seek out tall buildings—otherwise jumping skills will atrophy.

The cataloger skill set
■ AACR2, LC, etc.

The metadata librarian skill set
■ Views data as collections, sets, streams.
■ Approaches the task as designing data to "play well with others."

Characteristics of our new world
■ No more ILS.
■ Bibliographic utilities are unlikely to be the central node for all data.
■ Creation of metadata will become more decentralized.
■ Nobody knows how this will all shake out, but metadata librarians will be critical in forging solutions.18

While the above summary focuses on catalogers and their future, many of the directions also apply to any librarian or support staff member currently working in technical services.
In a recent EDUCAUSE Review article, Brantley lists a number of mantras that all libraries need to repeat and keep in mind in this new information environment:

■ Libraries must be available everywhere.
■ Libraries must be designed to get better through use.
■ Libraries must be portable.
■ Libraries must know where they are.
■ Libraries must tell stories.
■ Libraries must help people learn.
■ Libraries must be tools of change.
■ Libraries must offer paths for exploration.
■ Libraries must help forge memory.
■ Libraries must speak for people.
■ Libraries must study the art of war.19

You will have to read the article to find out about that last point. The above mantras illustrate that each of these issues must also be aligned with the work done by technical services departments in support of the rest of the library's services. And there definitely isn't one right way to move forward; each library, with its unique blend of services and staff, has to define, initiate, and engender dialogue on change and strategic direction, and then actively make decisions with integrity and vigor toward both its users and its staff. As Calhoun indicates, there are a number of challenges to feasibility for next steps in this area, some technically oriented but many based on our own organizational structures and strictures:

■ Difficulty achieving consensus on standardized, simplified, more automated workflows.
■ Unwillingness or inability to dispense with highly customized acquisitions and cataloging operations.
■ Overcoming the "not invented here" mindset preventing ready acceptance of cataloging copy from other libraries or external sources.
■ Resistance to simplifying cataloging.
■ Inability to find and successfully collaborate with necessary partners (e.g., ILS vendors).
■ Difficulty achieving basic levels of system interoperability.
■ Slow development and implementation of necessary standards.
■ Library-centric decision making; inability to base priorities on how users behave and what they want.
■ Limited availability of data to support management decisions.
■ Inadequate skill set among library staff; unwillingness or inability to retrain.
■ Resistance to change from faculty members, deans, or administrators.20

Moving forward in the new information world

In a recent discussion on the Autocat electronic discussion list regarding the client-business paradigm now being impressed on library staff, an especially interesting quote puts the entire debate into perspective:

The irony of this discussion is that our patrons/users/clients [et al.] expect to be treated as well as business customers. They pay tuition or taxes to most of our institutions and expect to have a return in value. And a very large percentage of them care about the differences between the government services vs. business arguments we present. What they know is that when they want something, they want it. More library powers-that-be now come from the world of business rather than libraries because of the pressure on the bottom line. Business administrators are viewed, even by those in public administration, as being more fiscally able than librarians.
I would recommend that we fuss less about titles and semantics and develop ways to show the value of libraries to the public.21

Wheeler, in a recent Educause Review article, documents a number of "eras" that colleges and universities have gone through in recent history.22 First is the "Era of Publishing," followed by the "Era of Participation" with the appearance of the Internet and its social networking tools. The next era, the "Era of Certitude," is one in which users will want quick, timely answers to questions, along with some thought about the need and context of the question. Wheeler espouses five dimensions that tools of certitude must have: reach, response, results, resources, and rights. He explains these dimensions in regard to various tools and services that libraries can provide through human–human, human–machine, and machine–machine interaction.23 Wheeler sees extensive rethinking and reengineering by libraries, campuses, and information technology to assist users in meeting their information needs. Are there ways that technical services staff can assist in these efforts?

Although somewhat dated, Calhoun's extensive article on what is needed from catalogers and librarians in the twenty-first century expounds a number of salient points.24 In table 1, she illustrates some of the many challenges facing traditional library cataloging, providing her opinion on what the challenges are, why they exist, and some solutions for survivability and adaptability in the new marketplace.25 One quote in particular deserves attention:

At the very least, adapting successfully to current demands will require new competencies for librarians, and I have made the case elsewhere that librarians must move beyond basic computer literacy to "IT fluency"—that is, an understanding of the concepts of information technology, especially applying problem solving and critical thinking skills to using information technology. Raising the bar of IT fluency will be even more critical for metadata specialists, as they shift away from a focus on metadata production to approaches based on IT tools and techniques on the one hand, and on consulting and teamwork on the other. As a result of the increasing need for IT fluency among metadata specialists, they may become more closely allied with technical support groups in campus computing centers. The chief challenges for metadata specialists will be getting out of library back rooms, becoming familiar with the larger world of university knowledge communities, and developing primary contacts with the appropriate domain experts and IT specialists.26

Getting out of the back room and interacting with users seems to be one of the dominant themes of evolving technical services positions to fit the new information marketplace. Putting Web 2.0 tools and services into the library OPAC has also gained some momentum since the launch of the Endeca-based OPAC at North Carolina State University. As some people have stated, however, putting "lipstick on a pig" doesn't change the fundamental problems and poor usability of something that never worked well in the first place.27 In their recent article, Jia Mi and Cathy Weng tried to answer the following questions: Why is the current OPAC ineffective?
What can libraries and librarians do to deliver an OPAC that is as good as search engines to better serve our users?28 Of course, the authors are biased toward the OPAC and wish to make it better, given that the last sentence in their abstract is, "Revitalizing the OPAC is one of the pressing issues that has to be accomplished." Users' search patterns have already moved away from the OPAC as a discovery tool; why should personnel and resource investment continue to be allocated toward something that users have turned away from? In their recommendations, Mi and Weng indicate that system limitations, failure to fully exploit the functionality already made available by ILSs, and the unsuitability of MARC standards for online bibliographic display are the primary factors in the ineffectiveness of library OPACs. Exactly. Debate and discussion on Autocat after the publication of their article again shows the line drawn between conservative opinions (added value, noncommercialization, and the overall ideals of the library profession and professional cataloging workflows) and the newer push for open-source models, junking the OPAC, and learning and working with non-MARC metadata standards and tools.

Conclusion

From an administrative point of view, there are a number of viable options for making technical services as efficient as possible in its current emanation:

■ Conduct a process review of all current workflows, following each type of format from receipt at the loading dock to access by the user. Revise and redesign workflows for efficiency.
■ Eliminate all backlogs, incorporating and standardizing various types of bibliographic organization (from brief records to full records, using established criteria of importance and access).
■ As much as possible, contract with vendors to make
To end, I would like to quote from a few of the articles from that 1983 issue of Technical Services Quarterly I have alluded to throughout this chapter: Like all prognostications, predictions about cataloging in a fully automated library may bear little resem- blance to the ultimate reality. While the future cata- loging scenario discussed here may seem reasonable now, it could prove embarrassing to read 10–20 years hence. Still, I would be pleasantly surprised if, by the year 2000, TS operations are not fully integrated, TS staff has not been greatly reduced, there has not been a large-scale jump in TS productivity accompanied by a dramatic decline in TS costs, and if most of us are not cooperating through a national database.29 In conclusion, I will revert to my first subject, the uncertain nature of predictions. In addition to the fear- less predictions already recorded, I predict that some of these predictions will come true and perhaps even most of them. Some of them will come true, but not in the time anticipated, while others never will. Let us hope that the influences not guessed that will prevent the actualization of some of these predictions will be happy ones, not dire. However they turn out, I predict that in ten years no one will remember or really care what these predictions were.30 Technical services as we know them now may well not exist by the end of the century. The aims of technical services will exist for as long as there are libraries. The Technical Services Quarterly may well have changed its name and its coverage long before then, but its con- cerns will remain real and the work to which many of us devote our lives will remain worthwhile. There can be few things in life that are as worth doing as enabling libraries to fulfill their unique and uniquely important role in culture and civilization.31 Twenty-five years have come and gone; some of the predictions in this first issue of Technical Services Quarterly came true, many of them did not. There have been dra- matic changes in those twenty-five years, most of which were unforeseen, as they always are. What is a certainty is that libraries can no longer sustain or maintain the status quo in technical services. What also is a certainty is that technical services staff, with their unique skills, talents, abilities, and knowledge in relation to the organization and description of information, are desperately needed in the new information environment. It is the responsibil- ity of both library administrators and technical services staff to work together to evolve and redesign workflows, standards, procedures, and even themselves to survive and succeed into the future. References 1. Norman D. Stevens, “Selections from a Dictionary of Libinfosci Terms,” in “Beyond ‘1984’: The Future of Technical Services,” special issue, Technical Services Quarterly 1, no. 1–2 (Fall/Winter 1983): 260. 2. Marcia J. Bates, “Improving User Access to Library Catalog and Portal Information: Final Report,” (paper pre- sented at the Library of Congress Bicentennial Conference on Bibliographic Control for the New Millennium, June 1, 2003): 4, http://www.loc.gov/catdir/bibcontrol/2.3BatesReport6-03 .doc.pdf (accessed Apr. 7, 2009). See also Karen Calhoun, “The Changing Nature of the Catalog and Its Integration with Other Discovery Tools,” final report to the Library of Congress, Mar. 17, 2006, 25, http://www.loc.gov/catdir/calhoun-report-final .pdf (accessed Apr. 7, 2009). 3. 
3. University of California Libraries Bibliographic Services Task Force, "Rethinking How We Provide Bibliographic Services for the University of California," final report, Dec. 2005, 8, http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf (accessed Apr. 7, 2009).
4. Deanna B. Marcum, "The Future of Cataloging," Library Resources & Technical Services 50, no. 1 (Jan. 2006): 9, http://www.loc.gov/library/reports/CatalogingSpeech.pdf (accessed Apr. 7, 2009).
5. Karen Calhoun, "Being a Librarian: Metadata and Metadata Specialists in the Twenty-First Century," Library Hi Tech 25, no. 2 (2007), http://www.emeraldinsight.com/Insight/ViewContentServlet?Filename=Published/EmeraldFullTextArticle/Articles/2380250202.html (accessed Apr. 7, 2009).
6. Calhoun, "The Changing Nature of the Catalog," 15.
7. Bradford Lee Eden, "Ending the Status Quo," American Libraries 39, no. 3 (Mar. 2008): 38; Eden, introduction to "Information Organization Future for Libraries," Library Technology Reports 44, no. 8 (Nov./Dec. 2007): 5–7.
8. See Karen Schneider's "How OPACs Suck" series on the ALA TechSource blog, http://www.techsource.ala.org/blog/2006/03/how-opacs-suck-part-1-relevance-rank-or-the-lack-of-it.html, http://www.techsource.ala.org/blog/2006/04/how-opacs-suck-part-2-the-checklist-of-shame.html, and http://www.techsource.ala.org/blog/2006/05/how-opacs-suck-part-3-the-big-picture.html (accessed Apr. 7, 2009).
9. H. Frank Cervone, quoted in Ellen Bahr, "Dreaming of a Better ILS," Computers in Libraries 27, no. 9 (Oct. 2007): 14.
10. David Weinberger, Everything Is Miscellaneous: The Power of the New Digital Disorder (New York: Times, 2007).
11. For a list of these concerns, see Robert Darnton, "The Library in the New Age," The New York Review of Books 55, no. 10 (June 12, 2008), http://www.nybooks.com/articles/21514 (accessed Apr. 7, 2009).
12. See Calhoun, "The Changing Nature of the Catalog," 27.
13. See the Metadata Research Center, "Automatic Metadata Generation Applications (AMeGA)," http://ils.unc.edu/mrc/amega (accessed Apr. 7, 2009).
14. Sheila S. Intner, "Generating Cataloging Data Automatically," Technicalities 28, no. 2 (Mar./Apr. 2008): 1, 15–16.
15. Sheila S. Intner, "A Technical Services Makeover," Technicalities 27, no. 5 (Sept./Oct. 2007): 1, 14–15.
16. Ibid., 14 (emphasis added).
17. Carlen Ruschoff, "Competencies for 21st Century Technical Services," Technicalities 27, no. 6 (Nov./Dec. 2007): 1, 14–16.
18. Diane Hillmann, "A Has-Been Cataloger Looks at What Cataloging Will Be," online posting, Metadata Blog, July 1, 2008, http://blogs.ala.org/nrmig.php?title=creating_the_future_of_the_catalog_aamp_&more=1&c=1&tb=1&pb=1 (accessed Apr. 7, 2009).
19. Peter Brantley, "Architectures for Collaboration: Roles and Expectations for Digital Libraries," Educause Review 43, no. 2 (Mar./Apr. 2008): 31–38.
20. Calhoun, "The Changing Nature of the Catalog," 13.
21. Brian Briscoe, "That Business/Customer Stuff (Was: Letter to AL)," online posting, Autocat, May 30, 2008.
22. Brad Wheeler, "In Search of Certitude," Educause Review 43, no. 3 (May/June 2008): 15–34.
23. Ibid., 22.
24. Karen Calhoun, "Being a Librarian."
25. Ibid.
26. Ibid. (emphasis added).
27. Andrew Pace, quoted in Roy Tennant, "Digital Libraries: 'Lipstick on a Pig,'" Library Journal, Apr. 15, 2005, http://www.libraryjournal.com/article/CA516027.html (accessed Apr. 7, 2009).
28. Jia Mi and Cathy Weng, "Revitalizing the Library OPAC: Interface, Searching, and Display Challenges," Information Technology & Libraries 27, no. 1 (Mar. 2008): 5–22.
29. Gregor A. Preston, "How Will Automation Affect Cataloging Staff?" in "Beyond '1984': The Future of Technical Services," special issue, Technical Services Quarterly 1, no. 1–2 (Fall/Winter 1983): 134.
30. David C. Taylor, "The Library Future: Computers," in "Beyond '1984': The Future of Technical Services," special issue, Technical Services Quarterly 1, no. 1–2 (Fall/Winter 1983): 92–93.
31. Michael Gorman, "Technical Services, 1984–2001 (and before)," in "Beyond '1984': The Future of Technical Services," special issue, Technical Services Quarterly 1, no. 1–2 (Fall/Winter 1983): 71.

3150 ----

Editorial: And Now for Something (Completely) Different
Marc Truitt

Marc Truitt (marc.truitt@ualberta.ca) is Associate University Librarian, Bibliographic and Information Technology Services, University of Alberta Libraries, Edmonton, Alberta, Canada, and Editor of ITAL.

The issue of ITAL you hold in your hands—be that issue physical or virtual; we won't even go into the question of your hands!—represents something new for us. For a number of years, Ex Libris (and previously, Endeavor Information Systems) has generously sponsored the LITA/Ex Libris (née LITA/Endeavor) Student Writing Award competition. The competition seeks manuscript submissions from enrolled LIS students in the areas of ITAL's publishing interests; a LITA committee on which the editor of ITAL serves as an ex-officio member evaluates the entries and names a winner. Traditionally, the winning essay has appeared in the pages of ITAL. In recent years, perhaps mirroring the waning interest in publication in traditional peer-reviewed venues, the number of entrants in the competition has declined. In 2008, for instance, there were but nine submissions, and to get those, we had to extend the deadline six weeks from the end of February to mid-April. In previous years, as I understand it, there often were even fewer.

This year, without moving the goalposts, we had—hold onto your hats!—twenty-seven entries. Of these, the review committee identified six finalists for discussion. The turnout was so good, in fact, that with the agreement of the committee, we at ITAL proposed to publish not only the winning paper but the other finalist entries as well. We hope that you will find them as stimulating as have we. Even more importantly, we hope that by publishing such a large group of papers representing 2009's best in technology-focused LIS work, we will encourage similarly large numbers of quality submissions in the years to come.

I would like to offer sincere thanks to my University of Alberta colleague Sandra Shores, who as guest editor for this issue worked tirelessly over the past few months to shepherd quality student papers into substantial and interesting contributions to the literature. She and Managing Editor Judith Carter—who guest-edited our recent Discovery Issue—have both done fabulous jobs with their respective ITAL special issues. Bravo!

Ex Libris' sponsorship

In one of those ironic twists that one more customarily associates with movie plots than with real life, the LITA/Ex Libris Student Writing Award recently almost lost its sponsor.
At very nearly the same time that Sandra was completing the preparation of the manuscripts for submission to ALA Production Services (where they are copyedited and typeset), we learned that Ex Libris had notified LITA that it had “decided to cease sponsoring” the Student Writing Award. A brief round of e-mails among principals at LITA, Ex Libris, and ITAL ensued, with the outcome being that Carl Grant, president of Ex Libris North America, graciously agreed to continue sponsorship for another year and reevaluate underwriting the award for the future. We at ITAL, and I personally, are grateful. Carl’s message about the sponsorship raises some interesting issues on which I think we should reflect. His first point goes like this: It simply is not realistic for libraries to continue to believe that vendors have cash to fund these things at the same levels when libraries don’t have cash to buy things (or want to delay purchases or buy the product for greatly reduced amounts) from those same vendors. Please understand the two are tied together. Point taken and conceded. Money is tight. Carl’s argument, I think, speaks as well to a larger, implied question. Libraries and library vendors share highly synergistic and, in recent years, increasingly antagonistic relationships. Library vendors—and I think library system vendors in particular—come in for much vitriol and precious little appreciation from those of us on the customer side. We all think they charge too much (and by implication, must also make too much), that their support and service are frequently unresponsive to our needs, and that their systems are overly large, cumbersome, and usually don’t do things the way we want them done. At the same time, we forget that they are catering to the needs and whims of a small, highly specialized market that is characterized by numerous demands, a high degree of complexity, and whose members—“standards” notwithstanding—rarely perform the same task the same way across institutions. We expect very individualized service and support, but at the same time are penny-pinching misers in our ability and willingness to pay for these services. We are beggars, yet we insist on our right to be choosers. Finally, at least for those of us of a certain generation—and yep, I count myself among its members—we chose librarianship for very specific reasons, which often means we are more than a little uneasy with concepts of “profit” and “bottom line” as applied to our world. We fail to understand that the open-source dictum “free as in kittens and not as in beer” means that we will have to pay someone for these services—it’s only a question of whom we will pay. Carl continues, making another point: I do appreciate that you’re trying to provide us more recognition as part of this. Frankly, that was another consideration in our thought of dropping it—we just didn’t feel like we were getting much for it. I’ve said before and I’ll say again, I’ve never, in all my years in this business had a single librarian say to me that because we sponsored this or that, it was even a consideration in their decision to buy something from us. Not once, ever. Companies like ours live on sales and service income.
I want to encourage you to help make librarians aware that if they do appreciate when we do these things, it sure would be nice if they’d let us know in some real tangible ways that show that is true. . . . Good will does not pay bills or salaries unless that good will translates into purchases of products and services (and please note, I’m not just speaking for Ex Libris here, I’m saying this for all vendors). And here is where Carl’s and my views may begin to diverge. Let’s start by drawing a distinction between vendor tchotchkes and vendor sponsorship. In fairness, Carl didn’t say anything about tchotchkes, so why am I? I do so because I think that we need to bear in mind that there are multiple ways vendors seek to advertise themselves and their services to us, and geegaws are one such. Trinkets are nice—I have yet to find a better gel pen than the ones given out at IUG 14 (would that I could get more!)—but other than reminding me of a vendor’s name, they serve little useful purpose. The latter, vendor sponsorship, is something very different, very special, and not readily totaled on the bottom line. Carl is quite right that sponsorship of the Student Writing Award will not in and of itself cause me to buy Aleph, Primo, or SFX (Oh right, I have that last one already!). These are products whose purchase is the result of lengthy and complex reviews that include highly detailed and painstaking needs analysis, specifications, RFPs, site visits, demonstrations, and so on. Due diligence to our parent institutions and obligations to our users require that we search for a balance among best-of-breed solutions, top-notch support, and fair pricing. Those things aren’t related to sponsorship. What is related to sponsorship, though, is a sense of shared values and interests. Of “doing the right thing.” I may or may not buy Carl’s products because of the con- siderations above (and yes, Ex Libris fields very strong contenders in all areas of library automation); I definitely will, though, be more likely to think favorably of Ex Libris as a company that has similar—though not necessarily identical—values to mine, if it is obvious that it encour- ages and materially supports professional activities that I think are important. Support for professional growth and scholarly publication in our field are two such values. I’m sure we can all name examples of this sort of behavior: In addition to support of the Student Writing Award, Ex Libris’ long-standing prominence in the National Information Standards Organization (NISO) comes to mind. So too does the founding and ongoing support by Innovative Interfaces and the library consulting firm R2 for the Taiga Forum (http://www.taigaforum.org/), a group of academic associate university librarians. To the degree that I believe Ex Libris or another firm shares my values by supporting such activities—that it “does the right thing”—I will be just a bit more inclined to think positively of it when I’m casting about for solutions to a technology or other need faced by my institution. I will think of that firm as kin, if you will. With that, I will end this by again thanking Carl and Ex Libris—because we don’t say thank you often enough!—for their generous support of the LITA/Ex Libris Student Writing Award. I hope that it will continue for a long time to come. That support is something about which I do care deeply. 
If you feel similarly—be it about the Student Writing Award, NISO, Taiga, or whatever—I urge you to say so by sending an appropriate e-mail to your vendor’s representative or by simply saying thanks in person to the company’s head honcho on the ALA exhibit floor. And the next time you are neck-deep in seemingly identical vendor quotations and need a way to figure out how to decide between them, remember the importance of shared values. Dan Marmion Longtime LITA members and ITAL readers in particular will recognize the name of Dan Marmion, editor of this journal from 1999 through 2004. Many current and recent members of the ITAL editorial board—including Managing Editor Judith Carter, Webmaster Andy Boze, Board member Mark Dehmlow, and I—can trace our involvement with ITAL to Dan’s enthusiastic period of stewardship as editor. In addition to his leadership of ITAL, Dan has been a mentor, colleague, boss, and friend. His service philosophy is best summarized in the words of a simple epigram that for many years has graced the wall behind the desk in his office: “it’s all about access!!” Because of health issues, and in order to devote more time to his wife Diana, daughter Jennifer, and granddaughter Madelyn, Dan recently decided to retire from his position as Associate Director for Information Systems and Digital Access at the University of Notre Dame Hesburgh Libraries. He also will pursue his personal interests, which include organizing and listening to his extensive collection of jazz recordings, listening to books on CD, and following the exploits of his favorite sports teams, the football Irish of Notre Dame, the Indianapolis Colts, and the New York Yankees. We want to express our deep gratitude for all he has given to the profession, to LITA, to ITAL, and to each of us personally over many years. We wish him all the best as he embarks on this new phase of his life. 3151 ---- Ex Libris Column: A Partnership for Creating Successful Partnerships Carl Grant Carl Grant (carl.grant@exlibrisgroup.com) is President of Ex Libris North America, Des Plaines, Illinois. When Marc asked me to write this column I eagerly accepted because I feel strongly about libraries leveraging their role to their greater advantage in the rapidly changing information landscape. I see sponsorships and partnerships as an important tool for doing that. However, as noted in Marc’s column in this issue, we’d been having a discussion about the continuing involvement of Ex Libris in the LITA/Ex Libris Student Writing Award. Like many of you, we at Ex Libris are trying to keep our costs low in this challenging economic environment so that we can in turn keep your costs low. Thus we are closely evaluating all expenditures to ensure their cost is justified by the value they return to our organization. I won’t repeat the discussion already outlined by Marc above, but will just note with great pleasure his willingness not only to listen to my concerns, but to try to address them. His invitation to write this column was part of that response, a chance for me to share my thoughts and concerns with you about sponsorships and partnerships and where they need to go in the future. To do that, I’d like to expand on some of the concepts Marc and I were discussing and talk about how to make sponsorships and partnerships successful. I want to look at what successful ones consist of as well as what types are needed in our profession tomorrow.
The elements of successful sponsorships and partnerships For a sponsorship or partnership to be successful in today’s environment, it should offer at least the following components: 1. Clear and shared goals. Agreeing what is to be achieved via the sponsorship or partnership is essential. Furthermore, it should be readily apparent that the goals are achievable. This will happen through joint planning and execution of an agreed-upon project plan that results in that achievement. It is up to each partner to ensure that they have the resources to execute that project plan on schedule and on budget. As there will always be unplanned events and issues, there must also be ongoing, open communications throughout the life of the sponsorship or partnership. This way, surprises are avoided and issues can be dealt with before they become problems. 2. Risks and rewards must be real and shared. Members of a sponsorship or partnership should share risks and rewards in proportion to the role they hold. Furthermore, the rewards must be seen to be real rewards to all the members. Step into the other members’ shoes and look at what you’re offering. Does it clearly bring value to the other organizations in the arrangement? If so, how? If not, what can be done to address that disparity? Sponsorships and partnerships should not take advantage of any one sponsor or partner by allocating risks or rewards disproportionately to their contributions. Rewards realized by members of the sponsorship or partnership should be proportionally shared by all the members. 3. Defined time. A sponsorship or partnership is for a defined amount of time and should not be assumed to be ongoing. Regular reviews of how well the sponsorship or partnership is working for the partners must be conducted and decisions made on the basis of those results. It might be that the landscape is changing and the benefits are no longer as meaningful, or there are alternatives now available that provide better benefits for one of the members. Maintaining a sponsorship or partnership past its useful life will only result in the disintegration of the overall relationship. 4. Write it down. Organizations merge, are acquired and sold, people change jobs, and people change responsibilities. Any sponsorship or partnership should have a written agreement outlining the elements above. Once finalized, it should be signed by an appropriate person representing each member organization. That way, when things do change, there is a reference point and the arrangement is more likely to survive any of these precipitous events. The sponsorships and partnerships needed for tomorrow Successful sponsorships and partnerships are a necessary part of our landscape today. The world of information and knowledge has become too large, exists in too many silos, and is far too complex. “Competition, collaboration, and cooperation” defines the only path possible for navigating the landscape successfully. As the president of a company in the library automation marketplace, I continue to seek out opportunities that uniquely position our company to effectively maintain success in the marketplace and to provide value for our customers and thus our company. I believe libraries need to seek the same opportunities for their organizations.
Looking ahead, it seems clear that the pace of change in today’s environment will only continue to accelerate; thus the need for us to quickly form and dissolve key sponsorships and partnerships that will result in the successful fostering and implementation of new ideas, the currency of a vibrant profession. The next challenge is to realize that many of the key sponsorships and partnerships that need to be formed are not just with traditional organizations in this profession. Tomorrow’s sponsorships and partnerships will be with those organizations that will benefit from the expertise of libraries and their suppliers while in return helping to develop or provide the new funding opportunities and means and places for disseminating access to their expertise and resources. Likely organizations would be those in the fields of education, publishing, content creation and management, and social and community Web-based software. To summarize, we at Ex Libris believe in sponsorships and partnerships. We believe they’re important and should be used in advancing our profession and organizations. From long experience we also have learned there are right ways and wrong ways to implement these tools, and I’ve shared thoughts on how to make them work for all the parties involved. Again, I thank Marc for his receptiveness to this discussion, and even more for his efforts to address the issues. It serves as an excellent example of what I discussed above. 3152 ---- Editorial Board Thoughts: Issue Introduction to Student Essays Sandra Shores Sandra Shores (sandra.shores@ualberta.ca) is Guest Editor of this issue and Operations Manager, Information Technology Services, University of Alberta Libraries, Edmonton, Alberta, Canada. The papers in this special issue, although covering diverse topics, have in common their authorship by people currently or recently engaged in graduate library studies. It has been many years since I was a library science student—twenty-five in fact. I remember remarking to a future colleague at the time that I found the interview for my first professional job easy, not because the interviewers failed to ask challenging questions, but because I had just graduated. I was passionate about my chosen profession, and my mind was filled from my time at library school with big ideas and the latest theories, techniques, and knowledge of our discipline. While I could enthusiastically respond to anything the interviewers asked, my colleague remarked she had been in her job so long that she felt she had lost her sense of the big questions. The busyness of her daily work life drew her focus away from contemplation of our purpose, principles, and values as librarians.
I now feel at a similar point in my career as this colleague did twenty-five years ago, and for that reason I have been delighted to work with these student authors to help see their papers through to publication. The six papers represent the strongest work from a wide selection that students submitted to the LITA/ Ex Libris Student Writing Award competition. This year’s winner is Michael Silver, who looks for- ward to graduating in the spring from the MLIS program at the University of Alberta. Silver entered the program with a strong library technology foundation, having pro- vided IT services to a regional library system for about ten years. He notes that “the ‘accidental systems librarian’ position is probably the norm in many small and medium sized libraries. As a result, there are a number of practices that libraries should adopt from the IT world that many library staff have never been exposed to.”1 His paper, which details the implementation of an open-source mon- itoring system to ensure the availability of library systems and services, is a fine example of the blending of best practices from two professions. Indeed, many of us who work in IT in libraries have a library background and still have a great deal to learn from IT professionals. Silver is contemplating a PhD program or else a return to a library systems position when he graduates. Either way, the pro- fession will benefit from his thoughtful, well-researched, and useful contributions to our field. Todd Vandenbark’s paper on library Web design for persons with disabilities follows, providing a highly prac- tical but also very readable guide for webmasters and others. Vandenbark graduated last spring with a mas- ters degree from the School of Library and Information Science at Indiana University and is already working as a Web services librarian at the Eccles Health Sciences Library at the University of Utah. Like Mr. Silver, he entered the program with a number of years’ work experience in the IT field, and his paper reflects the depth of his technical knowledge. Vandenbark notes, however, that he has found “the enthusiasm and collegiality among library technology professionals to be a welcome change from other employment experiences,” a gratifying com- ment for readers of this journal. Ilana Tolkoff tackles the challenging concept of global interoperability in cataloguing. She was fascinated that a single database, OCLC, has holdings from libraries all over the world. This is also such a recent phenom- enon that our current cataloging standards still do not accommodate such global participation. I was inter- ested to see what librarians were doing to reconcile this variety of languages, scripts, cultures, and indepen- dently developed cataloging standards. Tolkoff also graduated this past spring and is hoping to find a position within a music library. Marijke Visser addresses the overwhelming question of how to organize and expose Internet resources, looking at tagging and the social Web as a solution. Coming from a teaching background, Visser has long been interested in literacy and life-long learning. She is concerned about “the amount of information found only online and what it means when people are unable . . . 
to find the best resources, the best article, the right website that answers a question or solves a critical problem.” She is excited by “the potential for creativity made possible by technology” and by the way librarians incorporate “collaborative tools and interactive applications into library service.” Visser looks forward to graduating in May. Mary Kurtz examines the use of the Dublin Core metadata schema within DSpace institutional repositories. As a volunteer, she used DSpace to archive historical photographs and was responsible for classifying them using Dublin Core. She enjoyed exploring how other institutions use the same tools and would love to delve further into digital archives, “how they’re used, how they’re organized, who uses them and why.” Kurtz graduated in the summer and is looking for the right job for her interests and talents in a location that suits herself and her family. Finally, Lauren Mandel wraps up the issue exploring the use of a geographic information system to understand how patrons use library spaces. Mandel has been an enthusiastic patron of libraries since she was a small child visiting her local county and city public libraries. She is currently a doctoral candidate at Florida State University and sees an academic future for herself. Mandel expresses infectious optimism about technology in libraries: People forget, but paper, the scroll, the codex, and later the book were all major technological leaps, not to mention the printing press and moveable type. . . . There is so much potential for using technology to equalize access to information, regardless of how much money you have, what language you speak, or where you live. Big ideas, enthusiasm, and hope for the profession, in addition to practical technology-focused information await the reader.
Enjoy the issue, and congratulations to the winner and all the finalists! Note 1. All quotations are taken with permission from private e-mail correspondence. 3153 ---- Monitoring Network and Service Availability with Open-Source Software T. Michael Silver Silver describes the implementation of a monitoring system using an open-source software package to improve the availability of services and reduce the response time when troubles occur. He provides a brief overview of the literature available on monitoring library systems, and then describes the implementation of Nagios, an open-source network monitoring system, to monitor a regional library system’s servers and wide area network. Particular attention is paid to using the plug-in architecture to monitor library services effectively. The author includes example displays and configuration files. Editor’s note: This article is the winner of the LITA/Ex Libris Writing Award, 2009. Library IT departments have an obligation to provide reliable services both during and after normal business hours. The IT industry has developed guidelines for the management of IT services, but the library community has been slow to adopt these practices. The delay may be attributed to a number of factors, including a dependence on vendors and consultants for technical expertise, a reliance on librarians who have little formal training in IT best practices, and a focus on automation systems instead of infrastructure. Larger systems that employ dedicated IT professionals to manage the organization’s technology resources likely implement best practices as a matter of course and see no need to discuss them within the library community. In The Practice of System and Network Administration, Thomas A. Limoncelli, Christine J. Hogan, and Strata R. Chalup present a comprehensive look at best practices in managing systems and networks. Early in the book they provide a short list of first steps toward improving IT services, one of which is the implementation of some form of monitoring. They point out that without monitoring, systems can be down for extended periods before administrators notice or users report the problem.1 They dedicate an entire chapter to monitoring services. In it, they discuss the two primary types of monitoring—real-time monitoring, which provides information on the current state of services, and historical monitoring, which provides long-term data on uptime, use, and performance.2 While the software discussed in this article provides both types of monitoring, I focus on real-time monitoring and the value of problem identification and notification. Service monitoring does not appear frequently in library literature, and what is written often relates to single-purpose custom monitoring. An article in the September 2008 issue of ITAL describes the development and deployment of a wireless network, including a Perl script written to monitor the wireless network and associated services.3 The script updates a webpage to display the results and sends an e-mail notifying staff of problems. An enterprise monitoring system could perform these tasks and present the results within the context of the complete infrastructure.
It would require using advanced features because of the segregation of networks discussed in their article, but it would require little more effort than it took to write the single-purpose script. Dave Pattern at the University of Huddersfield shared another Perl script that monitors OPAC functionality.4 Again, the script provided a single-purpose monitoring solution that could be integrated within a larger model. Below, I discuss how I modified his script to provide more meaningful monitoring of our OPAC than the stock webpage monitoring plug-in included with our open-source network monitoring system, Nagios. Service monitoring can consist of a variety of tests. In its simplest form, a ping test will verify that a host (server or device) is powered on and successfully connected to the network. Feher and Sondag used ping tests to monitor the availability of the routers and access points on their network, as do I for monitoring connectivity to remote locations.5 A slightly more meaningful check would test for the establishment of a connection on a port. Feher and Sondag used this method to check the daemons in their network.6 A step further would be to evaluate a service response, for example checking the status code returned by a Web server. Evaluating content forms the next level of meaning. Limoncelli, Hogan, and Chalup discuss end-to-end monitoring, where the monitoring system actually performs meaningful transactions and evaluates the results.7 Pattern’s script, mentioned above, tests OPAC functionality by submitting a known keyword search and evaluating the response.8 I implemented this after an incident where Nagios failed to alert me to a problem with the OPAC. The Web server returned a status code of 200 to the request for the search page. Users, however, want more from an OPAC, and attempts to search were unsuccessful because of problems with the index server. Modifying Pattern’s original script, I was able to put together a custom check command that verifies a greater level of functionality by evaluating the number of results for the known search. T. Michael Silver (michael.silver@ualberta.ca) is an MLIS student, School of Library and Information Studies, University of Alberta, Edmonton, Alberta, Canada. Software selection Limoncelli, Hogan, and Chalup do not address specific how-to issues and rarely mention specific products. Their book provides the foundational knowledge necessary to identify what must be done. In terms of monitoring, they leave the selection of an appropriate tool to the reader.9 Myriad monitoring tools exist, both commercial and open-source. Some focus on network analysis, and some even target specific brands or model lines. The selection of a specific software package should depend on the services being monitored and the goals for the monitoring. Wikipedia lists thirty-five different products, of which eighteen are commercial (some with free versions with reduced functionality or features); fourteen are open-source projects under a General Public License or similar license (some with commercial support available but without different feature sets or licenses); and three offer different versions under different licenses.10 Von Hagen and Jones suggest two of them: Nagios and Zabbix.11 I selected the Nagios open-source product (http://www.nagios.org).
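Before turning to the software itself, it may help to make those levels of checking concrete. The standard Nagios plug-ins discussed later in this article can be run by hand from the command line, and each invocation below probes one level deeper than the one before it. This is only a sketch: the host name opac.example.org is a placeholder, the thresholds are arbitrary, the query string is shortened from the one used in appendix K, and the final command only approximates what the check_hip_search script does by matching a string in the result page.

# Level 1: connectivity only. Is the host powered on and reachable?
./check_ping -H opac.example.org -w 1000.0,20% -c 2000.0,60%

# Level 2: does anything accept a connection on the Z39.50 port?
./check_tcp -H opac.example.org -p 210

# Level 3: does the web server return an acceptable status code for a page?
./check_http -H opac.example.org -u /ipac20/ipac.jsp

# Level 4: does a known search return recognizable content, not just a 200 status?
./check_http -H opac.example.org -u "/ipac20/ipac.jsp?menu=search&index=.GW&term=linux" -s "titles matched"

Each plug-in prints a one-line status and exits with a code that the monitoring system maps to OK, WARNING, CRITICAL, or UNKNOWN, which is all it needs to interpret the result.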
The software has an established his- tory of active development, a large and active user community, a significant number of included and user- contributed extensions, and multiple books published on its use. Commercial support is available from a company founded by the creator and lead developer as well as other authorized solution providers. Monitoring appliances based on Nagios are available, as are sensors designed to interoperate with Nagios. Because of the flexibility of a software design that uses a plug-in archi- tecture, service checks for library-specific applications can be implemented. If a check or action can be scripted using practically any protocol or programming language, Nagios can monitor it. Nagios also provides a variety of information displays, as shown in appendixes A–E. n Installation The Nagios system provides an extremely flexible solu- tion to monitor hosts and services. The object-orientation and use of plug-ins allows administrators to monitor any aspect of their infrastructure or services using standard plug-ins, user-contributed plug-ins, or custom scripts. Additionally, the open-source nature of the package allows independent development of extensions to add features or integrate the software with other tools. Community sites such as MonitoringExchange (formerly Nagios Exchange), Nagios Community, and Nagios Wiki provide repositories of documentation, plug-ins, extensions, and other tools designed to work with Nagios.12 But that flexibility comes at a cost—Nagios has a steep learning curve, and user- contributed plug-ins often require the installation of other software, most notably Perl modules. Nagios runs on a variety of Linux, Unix, and Berkeley Software Distribution (BSD) operating systems. For testing, I used a standard Linux server distribution installed on a virtual machine. Virtualization provides an easy way to test software, especially if an alternate operating system is needed. If given sufficient resources, a virtual machine is capable of running the production instance of Nagios. After installing and updating the operating system, I installed the following packages: n Apache Web server n Perl n GD development library, needed to produce graphs and status maps n libpng-devel and libjpeg-devel, both needed by the GD library n gcc and GNU make, which are needed to compile some plug-ins and Perl modules Most major Linux and BSD distributions include Nagios in their software repositories for easy instal- lation using the native package management system. Although the software in the repositories is often not the most recent version, using these repositories simplifies the installation process. If a reasonably recent version of the software is available from a repository, I will install from there. Some software packages are either outdated or not available, and I manually install these. Detailed installation instructions are available on the Nagios web- site, in several books, and on the previously mentioned websites.13 The documentation for version 3 includes a number of quick-start guides.14 Most package managers will take care of some of the setup, including modifying the Apache configuration file to create an alias available at http://server.name/nagios. I prepared the remainder of this article using the latest stable versions of Nagios (3.0.6) and the plug-ins (1.4.13) at the time of writing. n Configuration Nagios configuration relies on an object model, which allows a great deal of flexibility but can be complex. 
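A small, hypothetical fragment may help readers picture that object model before the individual files are discussed. Nothing below is specific to this installation: the host and service names are invented, the address comes from the documentation range, and the command definition simply follows the general shape of the samples distributed with Nagios. The point is that every piece of the configuration, from the check command to the people who are notified, is a define block that other blocks refer to by name.

define command{
        command_name    check_http
        command_line    $USER1$/check_http -I $HOSTADDRESS$ $ARG1$   ; $USER1$ is set in resource.cfg
        }

define host{
        host_name             webserver            ; invented name
        alias                 Illustrative web server
        address               192.0.2.10
        check_command         check-host-alive     ; another command object, defined the same way
        max_check_attempts    10
        check_period          24x7                 ; a timeperiod object
        contact_groups        admins               ; a contactgroup object
        notification_interval 120
        notification_period   24x7
        }

define service{
        host_name             webserver
        service_description   Web service
        check_command         check_http!-u/index.html   ; arguments follow the ! separator
        max_check_attempts    3
        normal_check_interval 10
        retry_check_interval  2
        check_period          24x7
        contact_groups        admins
        notification_interval 60
        notification_period   24x7
        }

The templates and inheritance described next exist precisely so that directives like these do not have to be repeated in every real definition.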
Planning your configuration beforehand is highly recommended. Nagios has two main configuration files, cgi.cfg and nagios.cfg. The former is primarily used by the Web interface to authenticate users and control access, and it defines whether authentication is used and which users can access what functions. The latter is the main configuration file and controls all other program operations. The cfg_file and cfg_dir directives allow the configuration to be split into manageable groups using additional resource files and the object definition files (see figure 1). The flexibility offered allows a variety of different structures. I group network devices into groups but create individual files for each server. Nagios uses an object-oriented design. The objects in Nagios are displayed in table 1. A complete review of Nagios configuration is beyond the scope of this article. The documentation installed with Nagios covers it in great detail. Special attention should be paid to the concepts of templates and object inheritance as they are vital to creating a manageable configuration. The discussion below provides a brief introduction, while appendixes F–J provide concrete examples of working configuration files. cgi.cfg The cgi.cfg file controls the Web interface and its associated CGI (Common Gateway Interface) programs. During testing, I often turn off authentication by setting use_authentication to 0 if the Web interface is not accessible from the Internet. There also are various configuration directives that provide greater control over which users can access which features. The users are defined in the /etc/nagios/htpasswd.users file. A summary of commands to control entries is presented in table 2. The Web interface includes other features, such as sounds, status map displays, and integration with other products. Discussion of these directives is beyond the scope of this article. The cgi.cfg file provided with the software is well commented, and the Nagios documentation provides additional information. A number of screenshots from the Web interface are provided in the appendixes, including status displays and reporting. nagios.cfg The nagios.cfg file controls the operation of everything except the Web interface. Although it is possible to have a single monolithic configuration file, organizing the configuration into manageable files works better. The two main directives of note are cfg_file, which defines a single file that should be included, and cfg_dir, which includes all files in the specified directory with a .cfg extension. A third type of file that gets included is resource.cfg, which defines various macros for use in commands. Organizing the object files takes some thought. I monitor more than one hundred services on roughly seventy hosts, so the method of organizing the files was of more than academic interest. I use the following configuration files: commands.cfg, containing command definitions; contacts.cfg, containing the list of contacts and associated information, such as e-mail address (see appendix H); groups.cfg, containing all groups—hostgroups, servicegroups, and contactgroups (see appendix G); templates.cfg, containing all object templates (see appendix F); and timeperiods.cfg, containing the time ranges for checks and notifications. All devices and servers that I monitor are placed in directories using the cfg_dir directive: Servers—Contains server configurations.
Each file includes the host and service configurations for a physical or virtual server. Devices—Contains device information. I create individual files for devices with service monitoring that goes beyond simple ping tests for connectivity. Devices monitored solely for connectivity are grouped logically into a single file. For example, we monitor connectivity with fifty remote locations, and all fifty of them are placed in a single file.
Table 1. Nagios objects
Object: Used for
hosts: servers or devices being monitored
hostgroups: groups of hosts
services: services being monitored
servicegroups: groups of services
timeperiods: scheduling of checks and notifications
commands: checking hosts and services; notifying contacts; processing performance data; event handling
contacts: individuals to alert
contactgroups: groups of contacts
Figure 1. Nagios configuration relationships. Copyright © 2009 Ethan Galstead, Nagios Enterprises. Used with permission.
The resource.cfg file uses two macros to define the path to plug-ins and event handlers. Thirty other macros are available. Because the CGI programs do not read the resource file, restrictive permissions can be applied to them, enabling some of the macros to be used for usernames and passwords needed in check commands. Placing sensitive information in service configurations exposes them to the Web server, creating a security issue. Configuration The appendixes include the object configuration files for a simple monitoring situation. A switch is monitored using a simple ping test (see appendix J), while an opac server on the other side of the switch is monitored for both Web and Z39.50 operations (see appendix I). Note that the opac configuration includes a parents directive that tells Nagios that a problem with the gateway-switch will affect connectivity with the opac server. I monitor fifty remote sites. If my router is down, a single notification regarding my router provides more information if it is not buried in a storm of notifications about the remote sites. The Web port, Web service, and opac search services demonstrate different levels of monitoring. The Web port simply attempts to establish a connection to port 80 without evaluating anything beyond a successful connection. The Web service check requests a specific page from the Web server and evaluates only the status code returned by the server. It displays a warning because I configured the check to download a file that does not exist. The Web server is running because it returns an error code, hence the warning status. The opac search uses a known search to evaluate the result content, specifically whether the correct number of results is returned for a known search. I used a number of templates in the creation of this configuration. Templates reduce the amount of repetitive typing by allowing the reuse of directives. Templates can be chained, as seen in the host templates. The opac definition uses the Linux-server template, which in turn uses the generic-host template. The host definition inherits the directives of the template it uses, overriding any elements in both and adding new elements. In practical terms, generic-host directives are read first. Linux-server directives are applied next. If there is a conflict, the Linux-server directive takes precedence. Finally, opac is read. Again, any conflicts are resolved in favor of the last configuration read, in this case opac.
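The chain described above is easier to see with the directives side by side. The fragment below is condensed from appendixes F and I; most directives have been omitted, so it is an illustration of the inheritance and the parents relationship rather than a usable file.

define host{
        name                    generic-host     ; base template, never a real host
        notifications_enabled   1
        notification_period     24x7
        register                0                ; register 0 marks it as a template
        }

define host{
        name                linux-server         ; second layer of the chain
        use                 generic-host
        check_command       check-host-alive
        max_check_attempts  10
        contact_groups      admins
        register            0
        }

define host{
        use         linux-server                 ; opac inherits everything above
        host_name   opac
        alias       OPAC server
        address     192.168.1.123
        parents     gateway-switch               ; the switch between Nagios and this host
        }

If the gateway-switch host is down, Nagios treats opac as unreachable rather than down, which keeps a single switch failure from generating a separate alert for every device behind it.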
Plug-ins and service checks The nagios plugins package provides numerous plug-ins, including the check-host-alive, check_ping, check_tcp, and check_http commands. Using the plug-ins is straightforward, as demonstrated in the appendixes. Most plug-ins will provide some information on use if executed with --help supplied as an argument to the command. By default, the plug-ins are installed in /usr/lib/nagios/plugins. Some distributions may install them in a different directory. The plugins folder contains a subfolder with user-contributed scripts that have proven useful. Most of these plug-ins are Perl scripts, many of which require additional Perl modules available from the Comprehensive Perl Archive Network (CPAN). The check_hip_search plug-in (appendix K) used in the examples requires additional modules. Installing Perl modules is best accomplished using the CPAN Perl module. Detailed instructions on module installation are available online.15 Some general tips: Gcc and make should be installed before trying to install Perl modules, regardless of whether you are installing manually or using CPAN. Most modules are provided as source code, which may require compiling before use. CPAN automates this process but requires the presence of these packages. Alternately, many Linux distributions provide Perl module packages. Using repositories to install usually works well assuming the repository has all the needed modules. In my experience, that is rarely the case.
Table 2. Sample commands for managing the htpasswd.users file
Create or modify an entry, with the password entered at a prompt: htpasswd /etc/nagios/htpasswd.users <username>
Create or modify an entry using a password given on the command line: htpasswd -b /etc/nagios/htpasswd.users <username> <password>
Delete an entry from the file: htpasswd -D /etc/nagios/htpasswd.users <username>
However, care should be taken to avoid disclosing sensitive information regarding the network or passwords, or allowing access to CGI programs that perform actions. n Nagios permits the establishment of dependency relationships. Host dependencies may be useful in some rare circumstances not covered by the parent–child relationships mentioned above, but service dependencies provide a method of connect- ing services in a meaningful manner. For example, certain OPAC functions are dependent on ILS ser- vices. Defining these relationships takes both time and thought, which may be worthwhile depending on any given situation. n Event handlers allow Nagios to initiate certain actions after a state change. If Nagios notices that a particular service is down, it can run a script or program to attempt to correct the problem. Care should be taken when creating these scripts as ser- vice restarts may delete or overwrite information critical to solving a problem, or worsen the actual situation if an attempt to restart a service or reboot a server fails. n Nagios provides notification escalations, permit- ting the automatic notification of problems that last longer than a certain time. For example, a service escalation could send the first three alerts to the admin group. If properly configured, the fourth alert would be sent to the managers group as well as the admin group. In addition to escalating issues to management, this feature can be used to establish a series of responders for multiple on-call personnel. n Nagios can work in tandem with remote machines. In addition to custom scripts using Secure Shell (SSH), the Nagios Remote Plug-in Executor (NRPE) add-on allows the execution of plug-ins on remote machines, while the Nagios Service Check Acceptor (NSCA) add-on allows a remote host to submit check results to the Nagios server for processing. Implementing Nagios on the Feher and Sondag wireless network mentioned earlier would require one of these options because the wireless network is not accessible from the external network. These add-ons also allow for distributed monitoring, sharing the load among a number of servers while still providing the administrators with a single interface to the entire monitored network. The Nagios Exchange (http://exchange.nagios .org/) contains similar user-contributed programs for Windows. n Nagios can be configured to provide redundant or failover monitoring. Limoncelli, Hogan, and Chalup call this metamonitoring and describe when it is needed and how it can be implemented, suggesting self-monitoring by the host or having a second monitoring system that only monitors the main system.16 Nagios permits more complex configurations, allowing for either two servers operating in parallel, only one of which sends notifications unless the main server fails, or two servers communicating to share the monitoring load. n Alternative means of notification increase access to information on the status of the network. 
I imple- mented another open-source software package, QuickPage, which allows Nagios text messages to be sent from a computer to a pager or cell phone.17 Appendix L shows a screenshot of a Firefox exten- sion that displays host and service problems in the status bar of my browser and provides optional audio alerts.18 The Nagios community has devel- oped a number of alternatives, including special- ized Web interfaces and RSS feed generators.19 MONitORiNG NetwORK AND seRvice AvAilABilitY witH OPeN-sOuRce sOFtwARe | silveR 13 n Appropriate use Monitoring uses bandwidth and adds to the load of machines being monitored. Accordingly, an IT depart- ment should only monitor its own servers and devices, or those for which it has permission to do so. Imagine what would happen if all the users of a service such as WorldCat started monitoring it! The additional load would be noticeable and could conceivably disrupt service. Aside from reasons connected with being a good “netizen,” monitoring appears similar to port-scanning, a technique used to discover network vulnerabilities. An organization that blithely monitors devices without the owner’s permission may find their traffic is throttled back or blocked entirely. If a library has a definite need to moni- tor another service, obtaining permission to do so is a vital first step. If permission is withheld, the service level agree- ment between the library and its service provider or ven- dor should be reevaluated to ensure that the provider has an appropriate system in place to respond to problems. n Benefits The system-administration books provide an accurate overview of the benefits of monitoring, but personally reaping those benefits provides a qualitative background to the experience. I was able to justify the time spent on setting up monitoring the first day of production. One of the available plug-ins monitors Sybase database servers. It was one of the first contributed plug-ins I implemented because of past experiences with our production database running out of free space, causing the system to become nonfunctional. This happened twice, approximately a year apart. Each time, the integrated library system was down while the vendor addressed the issue. When I enabled the Sybase service checks, Nagios immediately returned a warning for the free space. The advance warning allowed me to work with the vendor to extend the database volume with no downtime for our users. That single event con- vinced the library director of the value of the system. Since that time, Nagios has proven its worth in alert- ing IT staff to problem situations, providing information on outage patterns both for in-house troubleshooting and discussions with service providers. n Conclusion Monitoring systems and services provides IT staff with a vital tool in providing quality customer service and managing systems. Installing and configuring such a system involves a learning curve and takes both time and computing resources. My experiences with Nagios have convinced me that the return on investment more than justifies the costs. References 1. Thomas A. Limoncelli, Christina J. Hogan, and Strata R. Chalup, The Practice of System and Network Administration, 2nd ed. (Upper Saddle River, N.J.: Addison-Wesley, 2007): 36. 2. Ibid., 523–42. 3. James Feher and Tyler Sondag, “Administering an Open- Source Wireless Network,” Information Technology & Libraries 27, no. 3 (Sept. 2008): 44–54. 4. Dave Pattern, “Keeping an Eye on Your HIP,” online post- ing, Jan. 
23, 2007, Self-Plagiarism is Style, http://www.daveyp .com/blog/archives/164 (accessed Nov. 20, 2008). 5. Feher and Sondag, “Administering an Open-Source Wire- less Network,” 45–54. 6. Ibid., 48, 53–54. 7. Limoncelli, Hogan, and Chalup, The Practice of System and Network Administration, 539–40. 8. Pattern, “Keeping an Eye on Your HIP.” 9. Limoncelli, Hogan, and Chalup, The Practice of System and Network Administration, xxv. 10. “Comparison of Network Monitoring Systems,” Wikipe- dia, The Free Encyclopedia, Dec. 9, 2008, http://en.wikipedia .org/wiki/Comparison_of_network_monitoring_systems (accessed Dec. 10, 2008). 11. William Von Hagen and Brian K. Jones, Linux Server Hacks, Vol. 2 (Sebastopol, Calif.: O’Reilly, 2005): 371–74 (Zabbix), 382–87 (Nagios). 12. MonitoringExchange, http://www.monitoringexchange. org/ (accessed Dec. 23, 2009); Nagios Community, http:// community.nagios.org (accessed Dec. 23, 2009); Nagios Wiki, http://www.nagioswiki.org/ (accessed Dec. 23, 2009). 13. “Nagios Documentation,” Nagios, Mar. 4, 2008, http:// www.nagios.org/docs/ (accessed Dec. 8, 2008); David Joseph- sen, Building a Monitoring Infrastructure with Nagios (Upper Saddle River, N.J.: Prentice Hall, 2007); Wolfgang Barth, Nagios: System and Network Monitoring, U.S. ed. (San Francisco: Open Source Press; No Starch Press, 2006). 14. Ethan Galstead, “Nagios Quickstart Installation Guides,” Nagios 3.x Documentation, Nov. 30, 2008, http://nagios.source forge.net/docs/3_0/quickstart.html (accessed Dec. 3, 2008). 15. The Perl Directory, (http://www.perl.org/) contains com- plete information on Perl. Specific information on using CPAN is available in “How Do I Install a Module from CPAN?” perlfaq8, Nov. 7, 2007, http://perldoc.perl.org/perlfaq8.html (accessed Dec. 4, 2008). 16. Limoncelli, Hogan, and Chalup, The Practice of System and Network Administration, 539–40. 17. Thomas Dwyer III, QPage Solutions, http://www.qpage .org/ (accessed Dec. 9, 2008). 18. Petr Šimek, “Nagioschecker,” Google Code, Aug. 12, 2008, http://code.google.com/p/nagioschecker/ (accessed Dec. 8, 2008). 19. “Notifications,” MonitoringExchange, http://www .monitoringexchange.org/inventory/Utilities/AddOn-Proj- ects/Notifications (accessed Dec. 23, 2009). 14 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 Appendix A. Service detail display from test system Appendix B. Service details for OPAC (hip) and ILS (horizon) servers from production system Appendix C. Sybase freespace trends for a specified period Appendix D. Connectivity history for a specified period Appendix E. Availability report for host shown in Appendix D Appendix F. templates.cfg file ############################################################################ # TEMPLATES.CFG - SAMPLE OBJECT TEMPLATES ############################################################################ ############################################################################ # CONTACT TEMPLATES ############################################################################ MONitORiNG NetwORK AND seRvice AvAilABilitY witH OPeN-sOuRce sOFtwARe | silveR 15 # Generic contact definition template - This is NOT a real contact, just # a template! 
define contact{ name generic-contact service_notification_period 24x7 host_notification_period 24x7 service_notification_options w,u,c,r,f,s host_notification_options d,u,r,f,s service_notification_commands notify-service-by-email host_notification_commands notify-host-by-email register 0 } ############################################################################ # HOST TEMPLATES ############################################################################ # Generic host definition template - This is NOT a real host, just # a template! define host{ name generic-host notifications_enabled 1 event_handler_enabled 1 flap_detection_enabled 1 failure_prediction_enabled 1 process_perf_data 1 retain_status_information 1 retain_nonstatus_information 1 notification_period 24x7 register 0 } # Linux host definition template - This is NOT a real host, just a template! define host{ name linux-server use generic-host check_period 24x7 check_interval 5 retry_interval 1 max_check_attempts 10 check_command check-host-alive notification_period workhours notification_interval 120 notification_options d,u,r contact_groups admins register 0 } Appendix F. templates.cfg file (cont.) 16 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 # Define a template for switches that we can reuse define host{ name generic-switch use generic-host check_period 24x7 check_interval 5 retry_interval 1 max_check_attempts 10 check_command check-host-alive notification_period 24x7 notification_interval 30 notification_options d,r contact_groups admins register 0 } ############################################################################ # SERVICE TEMPLATES ############################################################################ # Generic service definition template - This is NOT a real service, # just a template! define service{ name generic-service active_checks_enabled 1 passive_checks_enabled 1 parallelize_check 1 obsess_over_service 1 check_freshness 0 notifications_enabled 1 event_handler_enabled 1 flap_detection_enabled 1 failure_prediction_enabled 1 process_perf_data 1 retain_status_information 1 retain_nonstatus_information 1 is_volatile 0 check_period 24x7 max_check_attempts 3 normal_check_interval 10 retry_check_interval 2 contact_groups admins notification_options w,u,c,r notification_interval 60 notification_period 24x7 register 0 } Appendix F. templates.cfg file (cont.) MONitORiNG NetwORK AND seRvice AvAilABilitY witH OPeN-sOuRce sOFtwARe | silveR 17 # Define a ping service. This is NOT a real service, just a template! define service{ use generic-service name ping-service notification_options n check_command check_ping!1000.0,20%!2000.0,60% register 0 } Appendix F. templates.cfg file (cont.) Appendix G. groups.cfg file ############################################################################ # CONTACT GROUP DEFINITIONS ############################################################################ # We only have one contact in this simple configuration file, so there is # no need to create more than one contact group. 
define contactgroup{
  contactgroup_name  admins
  alias              Nagios Administrators
  members            nagiosadmin
  }

############################################################################
# HOST GROUP DEFINITIONS
############################################################################

# Define an optional hostgroup for Linux machines
define hostgroup{
  hostgroup_name  linux-servers     ; The name of the hostgroup
  alias           Linux Servers     ; Long name of the group
  }

# Create a new hostgroup for ILS servers
define hostgroup{
  hostgroup_name  ils-servers       ; The name of the hostgroup
  alias           ILS servers       ; Long name of the group
  }

# Create a new hostgroup for switches
define hostgroup{
  hostgroup_name  switches          ; The name of the hostgroup
  alias           Network Switches  ; Long name of the group
  }

############################################################################
# SERVICE GROUP DEFINITIONS
############################################################################

# Define a service group for network connectivity
define servicegroup{
  servicegroup_name  network
  alias              Network infrastructure services
  }

# Define a servicegroup for ILS
define servicegroup{
  servicegroup_name  ils-services
  alias              ILS related services
  }

Appendix H. contacts.cfg

############################################################################
# CONTACTS.CFG - SAMPLE CONTACT/CONTACTGROUP DEFINITIONS
############################################################################

# Just one contact defined by default - the Nagios admin (that's you)
# This contact definition inherits a lot of default values from the
# 'generic-contact' template which is defined elsewhere.
define contact{
  contact_name  nagiosadmin
  use           generic-contact
  alias         Nagios Admin
  email         nagios@localhost
  }

Appendix I. opac.cfg

############################################################################
# OPAC SERVER
############################################################################

############################################################################
# HOST DEFINITION
############################################################################

# Define a host for the server we'll be monitoring
# Change the host_name, alias, and address to fit your situation
define host{
  use        linux-server
  host_name  opac
  parents    gateway-switch
  alias      OPAC server
  address    192.168.1.123
  }

############################################################################
# SERVICE DEFINITIONS
############################################################################

# Create a service for monitoring the HTTP port
define service{
  use                  generic-service
  host_name            opac
  service_description  web port
  check_command        check_tcp!80
  }

# Create a service for monitoring the web service
define service{
  use                  generic-service
  host_name            opac
  service_description  Web service
  check_command        check_http!-u/bogusfilethatdoesnotexist.html
  }

# Create a service for monitoring the opac search
define service{
  use                  generic-service
  host_name            opac
  service_description  OPAC search
  check_command        check_hip_search
  }

# Create a service for monitoring the Z39.50 port
define service{
  use                  generic-service
  host_name            opac
  service_description  z3950 port
  check_command        check_tcp!210
  }
Appendix J. switches.cfg

############################################################################
# SWITCH.CFG - SAMPLE CONFIG FILE FOR MONITORING SWITCHES
############################################################################

############################################################################
# HOST DEFINITIONS
############################################################################

# Define the switch that we'll be monitoring
define host{
  use         generic-switch
  host_name   gateway-switch
  alias       Gateway Switch
  address     192.168.0.1
  hostgroups  switches
  }

############################################################################
# SERVICE DEFINITIONS
############################################################################

# Create a service to PING to switches
# Note this entry will ping every host in the switches hostgroup
define service{
  use                    ping-service
  hostgroups             switches
  service_description    PING
  normal_check_interval  5
  retry_check_interval   1
  }

Appendix K. check_hip_search script

#!/usr/bin/perl -w
#########################
# Check Horizon Information Portal (HIP) status.
# HIP is the web-based interface for Dynix and Horizon
# ILS systems by SirsiDynix corporation.
#
# This plugin is based on a standalone Perl script written
# by Dave Pattern. Please see
# http://www.daveyp.com/blog/index.php/archives/164/
# for the original script.
#
# The original script and this derived work are covered by
# http://creativecommons.org/licenses/by-nc-sa/2.5/
#########################

use strict;
use LWP::UserAgent;   # Note the requirement for Perl module LWP::UserAgent!
use lib "/usr/lib/nagios/plugins";
use utils qw($TIMEOUT %ERRORS);

### Some configuration options
my $hipServerHome = "http://ipac.prl.ab.ca/ipac20/ipac.jsp?profile=alap";
my $hipServerSearch = "http://ipac.prl.ab.ca/ipac20/ipac.jsp?menu=search&aspect=subtab132&npp=10&ipp=20&spp=20&profile=alap&ri=&index=.GW&term=linux&x=18&y=13&aspect=subtab132&GetXML=true";
my $hipSearchType = "xml";
my $httpProxy = '';

### check home page is available...
{
  my $ua = LWP::UserAgent->new;
  $ua->timeout( 10 );
  if( $httpProxy ) { $ua->proxy( 'http', $httpProxy ) }

  my $response = $ua->get( $hipServerHome );
  my $status = $response->status_line;

  if( $response->is_success ) {
  } else {
    print "HIP_SEARCH CRITICAL: $status\n";
    exit $ERRORS{'CRITICAL'};
  }
}

### check search page is returning results...
{
  my $ua = LWP::UserAgent->new;
  $ua->timeout( 10 );
  if( $httpProxy ) { $ua->proxy( 'http', $httpProxy ) }

  my $response = $ua->get( $hipServerSearch );
  my $status = $response->status_line;

  if( $response->is_success ) {
    my $results = 0;
    my $content = $response->content;

    if( lc( $hipSearchType ) eq 'html' ) {
      if( $content =~ /\<b\>(\d+?)\<\/b\>\&nbsp\;titles matched/ ) {
        $results = $1;
      }
    }

    if( lc( $hipSearchType ) eq 'xml' ) {
      if( $content =~ /\<hits\>(\d+?)\<\/hits\>/ ) {
        $results = $1;
      }
    }

    ### Modified section - original script triggered another function to
    ### save results to a temp file and email an administrator.
    unless( $results ) {
      print "HIP_SEARCH CRITICAL: No results returned|results=0\n";
      exit $ERRORS{'CRITICAL'};
    }
    if ( $results ) {
      print "HIP_SEARCH OK: $results results returned|results=$results\n";
      exit $ERRORS{'OK'};
    }
  }
}

Appendix L.
Nagios Checker display 3156 ---- 34 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 Tagging: An Organization Scheme for the Internet Marijke A. Visser How should the information on the Internet be organized? This question and the possible solutions spark debates among people concerned with how we identify, classify, and retrieve Internet content. This paper discusses the benefits and the controversies of using a tagging system to organize Internet resources. Tagging refers to a clas- sification system where individual Internet users apply labels, or tags, to digital resources. Tagging increased in popularity with the advent of Web 2.0 applications that encourage interaction among users. As more information is available digitally, the challenge to find an organiza- tional system scalable to the Internet will continue to require forward thinking. Trained to ensure access to a range of informational resources, librarians need to be concerned with access to Internet content. Librarians can play a pivotal role by advocating for a system that sup- ports the user at the moment of need. Tagging may just be the necessary system. W ho will organize the information available on the Internet? How will it be organized? Does it need an organizational scheme at all? In 1998, Thomas and Griffin asked a similar question, “Who will create the metadata for the Internet?” in their article with the same name.1 Ten years later, this question has grown beyond simply supplying metadata to assuring that at the moment of need, someone can retrieve the information necessary to answer their query. Given new classification tools available on the Internet, the time is right to reas- sess traditional models, such as controlled vocabularies and taxonomies, and contrast them with folksonomies to understand which approach is best suited for the future. This paper gives particular attention to Delicious, a social networking tool for generating folksonomies. The amount of information available to anyone with an Internet connection has increased in part because of the Internet’s participatory nature. Users add content in a variety of formats and through a variety of applications to personalize their Web experience, thus making Internet content transitory in nature and challenging to lock into place. The continual influx of new information is caus- ing a rapid cultural shift, more rapid than many people are able to keep up with or anticipate. Conversations on a range of topics that take place using Web technologies happen in real time. Unless you are a participant in these conversations and debates using Web-based communica- tion tools, changes are passing you by. Internet users in general have barely grasped the concept of Web 2.0 and already the advanced “Internet cognoscenti” write about Web 3.0.2 Regarding the organization and availability of Internet content, librarians need to be ahead of the crowd as the voice who will assure content will be readily accessible to those that seek it. Internet users actively participat- ing in and shaping the online communities are, perhaps unintentionally, influencing how those who access infor- mation via the Internet expect to be able to receive and use digital resources. Librarians understand that the way information is organized is critical to its accessibility. They also understand the communities in which they operate. Today, librarians need to be able to work seam- lessly among the online communities, the resources they create, and the end user. 
As Internet use evolves, librar- ians as information stakeholders should stay abreast of Web 2.0 developments. By positioning themselves to lead the future of information organization, librarians will be able to select the best emerging Web-based tools and applications, become familiar with their strengths, and leverage their usefulness to guide users in organizing Internet content. Shirky argues that the Internet has allowed new com- munities to form. Primarily online, these communities of Internet users are capable of dramatically changing society both on- and offline. Shirky contends that because of the Internet, “group action just got easier.”3 According to Shirky, we are now at the critical point where Internet use, while dependent on technology, is actually no longer about the technology at all. The Web today (Web 2.0) is about participation. “This [the Internet] is a medium that is going to change society.”4 Lessig points out that content creators are “writing in the socially, culturally relevant sense for the 21st century and to be able to engage in this writing is a measure of your literacy in the 21st century.”5 It is significant that creating content is no longer reserved for the Internet cognoscenti. Internet users with a variety of technological skills are participating in Web 2.0 com- munities. Information architects, Web designers, librarians, busi- ness representatives, and any stakeholder dependent on accessing resources on the Internet have a vested interest in how Internet information is organized. Not only does the architecture of participation inherent in the Internet encourage completely new creative endeavors, it serves as a platform for individual voices as demonstrated in Marijke A. visser (marijkea@gmail.com) is a Library and infor- mation Science graduate student at indiana University, india- napolis, and will be graduating May 2010. She is currently work- ing for ALA’s office for information and Technology Policy as an information Technology Policy Analyst, where her area of focus includes telecommunications policy and how it affects access to information. tAGGiNG: AN ORGANizAtiON scHeMe FOR tHe iNteRNet | visseR 35 personal and organizationally sponsored blogs: Lessig 2.0, Boing Boing, Open Access News, and others. These Internet conversations contribute diverse viewpoints on a stage where, theoretically, anyone can access them. Web 2.0 technologies challenge our understanding of what con- stitutes information and push policy makers to negotiate equitable Internet-use policies for the public, the content creators, corporate interests, and the service providers. To maintain an open Internet that serves the needs of all the players, those involved must embrace the opportunity for cultural growth the social Web represents. For users who access, create, and distribute digital content, information is anything but static; nor is using it the solitary endeavor of reading a book. Its digital format makes it especially easy for people to manipulate it and shape it to create new works. People are sharing these new works via social technologies for others to then remix into yet more distinct creative work. Communication is fundamentally altered by the ability to share content on the Internet. Today’s Internet requires a reevaluation of how we define and organize information. The manner in which digital information is classified directly affects each user’s ability to access needed information to fully participate in twenty-first-century culture. 
New para- digms for talking about and classifying information that reflect the participatory Internet are essential. n Background The controversy over organizing Web-based information can be summed up comparing two perspectives repre- sented by Shirky and Peterson. Both authors address how information on the Web can be most effectively orga- nized. In her introduction, Peterson states, “Items that are different or strange can become a barrier to networking.”6 Shirky maintains, “As the Web has shown us, you can extract a surprising amount of value from big messy data sets.”7 Briefly, in this instance ontology refers to the idea of defining where digital information can and should be located (virtually). Folksonomy describes an organiza- tional system where individuals determine the placement and categorization of digital information. Both terms are discussed in detail below. Although any organizational system necessitates talking about the relationship(s) among the materials being organized, the relationships can be classified in multiple ways. To organize a given set of entities, it is necessary to establish in what general domain they belong and in what ways they are related. Applying an ontological, or hierar- chical, classification system to digital information raises several points to consider. First, there are no physical space restrictions on the Internet, so relationships among digital resources do not need to be strictly identified. Second, after recognizing that Internet resources do not need the same classification standards as print material, librarians can begin to isolate the strengths of current nondigital systems that could be adapted to a system for the Internet. Third, librarians must be ready to eliminate current systems entirely if they fail to serve the needs of Internet users. Traditional systems for organizing information were developed prior to the information explosion on the Internet. The Internet’s unique platform for creating, storing, and disseminating information challenges pre– digital-age models. Designing an organizational system for the Internet that supports creative innovation and succeeds in providing access to the innovative work is paramount to moving the twenty-first-century culture forward. n Assessing alternative models Controversy encourages scrutiny of alternative models. In understanding the options for organizing digital infor- mation, it is important to understand traditional classifi- cation models. Smith discusses controlled vocabularies, taxonomies, and facets as three traditional methods for applying metadata to a resource. According to Smith, a controlled vocabulary is an unambiguous system for managing the meanings of words. It links synonyms, allowing a search to retrieve information on the basis of the relationship between synonyms.8 Taxonomies are hierarchical, controlled vocabularies that establish par- ent–child relationships between terms. A faceted classifi- cation system categorizes information using the distinct properties of that information.9 In such a system, infor- mation can exist in more than one place at a time. A fac- eted classification system is a precursor to the bottom-up system represented by folksonomic tagging. 
Folksonomy, a term coined in 2004 by Thomas Vander Wal, refers to a “user-created categorical structure development with an emergent thesaurus.”10 Vander Wal further separates the definition into two types: a narrow and a broad folk- sonomy.11 In a broad folksonomy, many people tag the same object with numerous tags or a combination of their own and others’ tags. In a narrow folksonomy, one or few people tag an object with primarily singular terms. Internet searching represents a unique challenge to people wanting to organize its available information. Search engines like Yahoo! and Google approach the cha- otic mass of information using two different techniques. Yahoo! created a directory similar to the file folder system with a set of predetermined categories that were intended to be universally useful. In so doing, the Yahoo! devel- opers made assumptions about how the general public would categorize and access information. The categories 36 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 and subsequent subcategories were not necessarily logi- cally linked in the eyes of the general public. The Yahoo! directory expanded as Internet content grew, but the digi- tal folder system, like a taxonomy, required an expert to maintain. Shirky notes the Yahoo! model could not scale to the Internet. There are too many possible links to be able to successfully stay within the confines of a hierar- chical classification system. Additionally, on the Internet, the links are sufficient for access because if two items are linked at least once, the user has an entry point to retrieve either one or both items.12 A hierarchical system does not assure a successful Internet search and it requires a user to comprehend the links determined by the managing expert. In the Google approach, developers acknowl- edged that the user with the query best understood the unique reasoning behind her search. The user therefore could best evaluate the information retrieved. According to Shirky, the Google model let go of the hierarchical file system because developers recognized effective search- ing cannot predetermine what the user wants. Unlike Yahoo!, Google makes the links between the query and the resources after the user types in the search terms.13 Trusting in the link system led Google to understand and profit from letting the user filter the search results. To select the best organizational model for the Internet it is critical to understand its emergent nature. A model that does not address the effects of Web 2.0 on Internet use and fails to capture participant-created content and tagging will not be successful. One approach to orga- nizing digital resources has been for users to bookmark websites of personal interest. These bookmarks have been stored on the user’s computer, but newer models now combine the participatory Web with saving, or tagging, websites. Social bookmarking typifies the emergent Web and the attraction of online networking. Innovative and controversial, the folksonomy model brings to light numerous criteria necessary for a robust organizational system. A social bookmarking network, Delicious is a tool for generating folksonomies. It com- bines a large amount of self-interest with the potential for an equal, if not greater, amount of social value. Delicious users add metadata to resources on the Internet by apply- ing terms, or tags, to URLs. Users save these tagged web- sites to a personal library hosted on the Delicious website. 
The default settings on Delicious share a user’s library publicly, thus allowing other people—not limited to reg- istered Delicious account holders—to view any library. That the Delicious developers understood how Internet users would react to this type of interactive application is reflected in the popularity of Delicious. Delicious arrived on the scene in 2003, and in 2007 developers introduced a number of features to encourage further user collabora- tion. With a new look (going from the original del.icio.us to its current moniker, Delicious) as well as more ways for users to retrieve and share resources by 2007, Delicious had 3 million registered users and 100 million unique URLs.14 The reputation of Delicious has generated inter- est among people concerned with organizing the infor- mation available via the Internet. How does the folksonomy or Delicious model of open-ended tagging affect searching, information retriev- ing, and resource sharing? Delicious, whose platform is heavily influenced by its users, operates with no hier- archical control over the vocabulary used as tags. This underscores the organization controversy. Bottom-up tagging gives each person tagging an equal voice in the categorization scheme that develops through the user generated tags. At the same time, it creates a chaotic infor- mation-retrieval system when compared to traditional controlled vocabularies, taxonomies, and other methods of applying metadata.15 A folksonomy follows no hier- archical scheme. Every tag generated supplies personal meaning to the associated URL and is equally weighted. There will be overlap in some of the tags users select, and that will be the point of access for different users. For the unique tags, each Delicious user can choose to adopt or reject them for their personal tagging system. Either way, the additional tags add possible future access points for the rest of the user community. The social usefulness of the tags grows organically in relationship to their adop- tion by the group. Can the Internet support an organizational system controlled by user-generated tags? By the very nature of the participatory Web, whose applications often get bet- ter with user input, the answer is yes. Delicious and other social tagging systems are proving that their folksonomic approach is robust enough to satisfy the organizational needs of their users. Defined by Vander Wal, a broad folk- sonomy is a classification system scalable to the Internet.16 The problem with projecting already-existing search and classification strategies to the Internet is that the Internet is constantly evolving, and classic models are quickly overcome. Even in the nonprint world of the Internet, taxonomies and controlled vocabulary entail a commitment both from the entity wanting to organize the system and the users who will be accessing it. Developing a taxonomy involves an expert, which requires an outlay of capital and, as in the case with Yahoo!, a taxonomy is not necessarily what users are looking for. To be used effectively, taxonomies demand a certain amount of user finesse and complacency. The user must understand the general hierarchy and by default must suspend their own sense of category and subcategory if they do not mesh with the given system. The search model used by Google, where the user does the filtering, has been a significantly more successful search engine. Google recognizes natural language, making it user friendly; however, it remains merely a search engine. 
It is successful at making links, but it leaves the user stranded without a means to orga- nize search results beyond simple page rank. Traditional tAGGiNG: AN ORGANizAtiON scHeMe FOR tHe iNteRNet | visseR 37 hierarchical systems and search strategies like those of Yahoo! and Google neglect to take into account the tre- mendous popularity of the participatory Web. Successful Web applications today support user interaction; to disre- gard this is naive and short-sighted. In contrast to a simple page-rank results list or a hierarchical system, Delicious results provide the user with rich, multilayer results. Figure 1 shows four of the first ten results of a Delicious search for the term “folk- sonomy.” The articles by the four authors in the left col- umn were tagged according to the diagram. Two of the articles are peer-reviewed, and two are cited repeatedly by scholars researching tagging and the Internet. In this example, three unique terms are used to tag those articles, and the other terms provide additional entry points for retrieval. Further information available using Delicious shows that the Guy article was tagged by 1,323 users, the Mathes article by 2,787 users, the Shirky article by 4,383 users, and the Peterson article by 579 users.17 From the basic Delicious search, the user can combine terms to narrow the query as well as search what other users have tagged with those terms. Similar to the card catalog, where a library patron would often unintentionally find a book title by browsing cards before or after the actual title she originally wanted, a Delicious user can browse other users’ libraries, often finding additional pertinent resources. A user will return a greater number of relevant and automatically filtered results than with an advanced Google search. As an ancillary feature, once a Delicious user finds an attractive tag stream—a series of tags by a particular user—they can opt to follow the user who created the tag stream, thereby increasing their personal resources. Hence Delicious is effective personally and socially. It emulates what Internet users expect to be able to do with digital content: find interesting resources, per- sonalize them, in this case with tags, and put them back out for others to use if they so choose. Proponents of folksonomy recognize there are ben- efits to traditional taxonomies and controlled vocabulary systems. Shirky delineates two features of an organi- zational system and their characteristics, providing an example of when a hierarchical system can be successful (see table 1).18 These characteristics apply to situations using data- bases, journal articles, and dissertations as spelled out by Peterson, for example.19 Specific organizations with identifiable common terminology—for example, medical libraries—can also benefit from a traditional classification system. These domains are the antithesis of the domain represented by the Web. The success of controlled vocab- ularies, taxonomies, and their resulting systems depends on broad user adoption. That, in combination with the cost of creating and implementing a controlled system, raises questions as to their utility and long-term viability for use on the Web. Though meant for longevity, a taxonomy fulfills a need at one fixed moment in time. A folksonomy is never static. Taxonomies developed by experts have not yet been able to be extended adequately for the breadth and depth of Internet resources. 
Neither have traditional viewpoints been scaled to accept the challenges encountered in trying to organize the Internet. Folksonomy, like taxonomy, seeks to provide the information critical to the user at the moment of need. Folksonomy, however, relies on users to create the links that will retrieve the desired results. Doctorow puts forward three critiques of a hierarchical metadata system, emphasizing the inadequacies of applying traditional classification schemes to the digital stage:

1. There is not a "correct" way to categorize an idea.
2. Competing interests cannot come to a consensus on a hierarchical vocabulary.
3. There is more than one way to describe something.

Doctorow elaborates: "Requiring everyone to use the same vocabulary to describe their material denudes the cognitive landscape, enforces homogeneity in ideas."20 The Internet raises the level of participation to include innumerable voices. The astonishing thing is that it thrives on this participation.

Figure 1. Search results for "folksonomy" using Delicious.

Table 1. Domains and their participants

  Domain to be Organized     Participants in the Domain
  Small corpus               Expert catalogers
  Formal categories          Authoritative source of judgment
  Restricted entities        Coordinated users
  Clear edges                Expert users

Guy and Tonkin address the "folksonomic flaw" by saying user-generated tags are by definition imprecise: they can be ambiguous, overly personal, misspelled, or contrived compound words. Guy and Tonkin suggest the need to improve tagging by educating the users or by improving the systems to encourage more accurate tagging.21 This, however, does not acknowledge that successful Web 2.0 applications depend on the emergent wisdom of the user community. The systems permit organic evolution and continual improvement through user participation. A folksonomy evolves much the way a species does. Unique or single-use tags have minimal social import and do not gain recognition. Tags used by more than a few people reinforce their value and emerge as the more robust species.

Conclusion

The benefits of the Internet are accessible to a wide range of users. The rewards of participation are immediate, social, and exponential in scope. User-generated content and associated organization models support the Internet's unique ability to bring together unlikely social relationships that would not necessarily happen in another milieu. To paraphrase Shirky and Lessig, people are participating in a moment of social and technological evolution that is altering traditional ways of thinking about information, thereby creating a break from traditional systems. Folksonomic classification is part of that break. Its utility grows organically as users add tagged content to the system. It is adaptive, and its strengths can be leveraged according to the needs of the group. While there are "folksonomic flaws" inherent in a bottom-up classification system, there is tremendous value in weighting individual voices equally. Following the logic of Web 2.0 technology, folksonomy will improve according to the input of the users. It is an organizational system that reflects the basic tenets of the emergent Internet. It may be the only practical solution in a world of participatory content creation.
Shirky describes the Internet by saying, “There is no shelf in the digital world.”22 Classic organizational schemes like the Dewey Decimal System were created to organize resources prior to the advent of the Internet. A hierarchical system was necessary because there was a physical limita- tion on where a resource could be located; a book can only exist in one place at one time. In the digital world, the shelf is simply not there. Material can exist in many different places at once and can be retrieved through many avenues. A broad folksonomy supports a vibrant search strategy. It combines individual user input with that of the group. This relationship creates data sets inherently meaningful to the community of users seeking information on any given topic at any given moment. This is why a folksonomic approach to organizing information on the Internet is suc- cessful. Users are rewarded for their participation, and the system improves because of it. Folksonomy mirrors and supports the evolution of the Internet. Librarians, trained to be impartial and ethically bound to assure access to information, are the logical mediators among content creators, the architecture of the Web, corporate interests, and policy makers. Critical con- versations are no longer happening only in traditional publications of the print world. They are happening with communication platforms like YouTube, Twitter, Digg, and Delicious. Information organization is one issue on which librarians can be progressive. Dedicated to making information available, librarians are in a unique position to take on challenges raised by the Internet. As the profession experiments with the introduction of Web 3.0, librarians need to position themselves between what is known and what has yet to evolve. Librarians have always leveraged the interests and needs of their users to tailor their services to the individual entry point of every person who enters the library. Because more and more resources are accessed via the Internet, librarians will have to maintain a presence throughout the Web if they are to continue to speak for the informational needs of their users. Part of that presence necessitates an ability to adapt current models to the Internet. More importantly, it requires recognition of when to forgo con- ventional service methods in favor of more innovative approaches. Working in concert with the early adopters, corporate interests, and general Internet users, librarians can promote a successful system for organizing Internet resources. For the Internet, folksonomic tagging is one solution that will assure users can retrieve information necessary to answer their queries. References and notes 1. Charles F. Thomas and Linda S. Griffin, “Who Will Cre- ate the Metadata for the Internet?” First Monday 3, no. 12 (Dec. 1998). 2. Web 2.0 is a fairly recent term, although now ubiquitous among people working in and around Internet technologies. Attributed to a conference held in 2004 between MediaLive tAGGiNG: AN ORGANizAtiON scHeMe FOR tHe iNteRNet | visseR 39 International and O’Reilly Media, Web 2.0 refers to the Web as being a platform for harnessing the collective power of Internet users interested in creating and sharing ideas and information without mediation from corporate, government, or other hierar- chical policy influencers or regulators. Web 3.0 is a much more fluid concept as of this writing. 
There are individuals who use it to refer to a Semantic Web where information is analyzed or processed by software designed specifically for computers to carry out the currently human-mediated activity of assigning meaning to information on a webpage. There are librarians involved with exploring virtual-world librarianship who refer to the 3D environment as Web 3.0. The important point here is that what Internet users now know as Web 2.0 is in the process of being altered by individuals continually experimenting with and improving upon existing Web applications. Web 3.0 is the undefined future of the participatory Internet. 3. Clay Shirky, “Here Comes Everybody: The Power of Organizing Without Organizations” (presentation videocast, Berkman Center for Internet & Society, Harvard University, Cambridge, Mass., 2008), http://cyber.law.harvard.edu/inter active/events/2008/02/shirky (accessed Oct. 1, 2008). 4. Ibid. 5. Lawerence Lessig, “Early Creative Commons History, My Version,” videocast, Aug. 11, 2008, Lessig 2.0, http://lessig.org/ blog/2008/08/early_creative_commons_history.html (accessed Aug. 13, 2008). 6. Elaine Peterson, “Beneath the Metadata: Some Philosophi- cal Problems with Folksonomy,” D-Lib Magazine 12, no. 11 (2006), http://www.dlib.org/dlib/november06/peterson/11peterson .html (accessed Sept. 8, 2008). 7. Clay Shirky, “Ontology is Overrated: Categories, Links, and Tags” online posting, Spring 2005, Clay Shirky’s Writings about the Internet, http://www.shirky.com/writings/ontology_ overrated.html#mind_reading (accessed Sept. 8, 2008). 8. Gene Smith, Tagging: People-Powered Metadata for the Social Web (Berkeley, Calif.: New Riders, 2008): 68. 9. Ibid., 76. 10. Thomas Vander Wal, “Folksonomy,” online posting, Feb. 7, 2007, vanderwal.net, http://www.vanderwal.net/folksonomy .html (accessed Aug. 26, 2008). 11. Thomas Vander Wal, “Explaining and Showing Broad and Narrow Folksonomies,” online posting, Feb. 21, 2005, Personal InfoCloud, http://www.personalinfocloud.com/2005/02/ explaining_and_.html (accessed Aug. 29, 2008). 12. Shirky, “Ontology is Overrated.” 13. Ibid. 14. Michael Arrington, “Exclusive: Screen Shots and Feature Overview of Delicious 2.0 Preview,” online posting, June 16, 2005, TechCrunch, http://www.techcrunch.com/2007/09/06/ exclusive-screen-shots-and-feature-overview-of-delicious-20 -preview/(accessed Jan. 6, 2010). 15. Smith, Tagging, 67–93 . 16. Vander Wal, “Explaining and Showing Broad and Narrow Folksonomies.” 17. Adam Mathes, “Folksonomies—Cooperative Classifica- tion and Communication through Shared Metadata” (graduate paper, University of Illinois Urbana–Champaign, Dec. 2004); Peterson, “Beneath the Metadata”; Shirky, “Ontology is Over- rated”; Thomas and Griffin, “Who Will Create the Metadata for the Internet?” 18. Shirky, “Ontology is Overrated.” 19. Peterson, “Beneath the Metadata.” 20. Cory Doctorow, “Metacrap: Putting the Torch to Seven Straw-Men of the Meta-Utopia,” online posting, Aug. 26, 2001, The Well, http://www.well.com/~doctorow/metacrap.htm (accessed Sept. 15, 2008). 21. Marieke Guy and Emma Tonkin, “Folksonomies: Tidy- ing up Tags?” D-Lib Magazine 12, no. 1 (2006), http://www.dlib .org/dlib/january06/guy/01guy.html (accessed Sept. 8, 2008). 22. Shirky, “Ontology is Overrated.” Global Interoperability continued from page 33 9. Julie Renee Moore, “RDA: New Cataloging Rules, Com- ing Soon to a Library Near You!” Library Hi Tech News 23, no. 9, (2006): 12. 10. Rick Bennett, Brian F. Lavoie, and Edward T. 
O’Neill, “The Concept of a Work in WorldCat: An Application of FRBR,” Library Collections, Acquisitions, & Technical Services 27, no. 1, (2003): 56. 11. Park, “Cross-Lingual Name and Subject Access.” 12. Ibid. 13. Thomas B. Hickey, “Virtual International Authority File” (Microsoft PowerPoint presentation, ALA Annual Conference, New Orleans, June 2006), http://www.oclc.org/research/ projects/viaf/ala2006c.ppt (accessed Dec. 9, 2009). 14. LEAF, “LEAF Project Consortium,” http://www.crxnet .com/leaf/index.html (accessed Dec. 9, 2009). 15. Bennett, Lavoie, and O’Neill, “The Concept of a Work in WorldCat.” 16. Alan Danskin, “Mature Consideration: Developing Biblio- graphic Standards and Maintaining Values,” New Library World 105, no. 3/4, (2004): 114. 17. Ibid. 18. Bennett, Lavoie, and O’Neill, “The Concept of a Work in WorldCat.” 19. Moore, “RDA.” 20. Danskin, “Mature Consideration,” 116. 21. Ibid.; Park, “Cross-Lingual Name and Subject Access.” 3154 ---- teNDiNG A wilD GARDeN: liBRARY weB DesiGN FOR PeRsONs witH DisABilities | vANDeNBARK 23 R. Todd Vandenbark Tending a Wild Garden: Library Web Design for Persons with Disabilities Nearly one-fifth of Americans have some form of dis- ability, and accessibility guidelines and standards that apply to libraries are complicated, unclear, and difficult to achieve. Understanding how persons with disabilities access Web-based content is critical to accessible design. Recent research supports the use of a database-driven model for library Web development. Existing tech- nologies offer a variety of tools to meet disabled patrons’ needs, and resources exist to assist library professionals in obtaining and evaluating product accessibility infor- mation from vendors. Librarians in charge of technology can best serve these patrons by proactively updating and adapting services as assistive technologies improve. I n March 2007, eighty-two countries signed the United Nations’ Convention on the Rights of Persons with Disabilities, including Canada, the European Community, and the United States. The convention’s purpose was “to promote, protect and ensure the full and equal enjoyment of all human rights and fundamental freedoms by all persons with disabilities, and to promote respect for their inherent dignity.”1 Among the many proscriptions for assuring respect and equal treatment of people with disabilities (PWD) under the law, signatories agreed to take appropriate measures: (g) To promote access for persons with disabilities to new information and communications technolo- gies and systems, including the Internet; and (h) To promote the design, development, production and distribution of accessible information and communications technologies and systems at an early stage, so that these technologies and systems become accessible at minimum cost. In addition, the convention seeks to guarantee equal access to information by doing the following: (c) Urging private entities that provide services to the general public, including through the Internet, to provide information and services in accessible and usable formats for persons with disabilities; and (d) Encouraging the mass media, including providers of information through the Internet, to make their services accessible to persons with disabilities.2 Because the Internet and its design standards are evolv- ing at a dizzying rate, it is difficult to create websites that are both cutting-edge and standards-compliant. 
This paper evaluates the challenge of Web design as it relates to individuals with disabilities, exploring current standards, and offering recommendations for accessible development. Examining the provision of IT for this demographic is vital because according to the U.S. Census Bureau, the U.S. public includes about 51.2 mil- lion noninstitutionalized people living with disabilities, 32.5 million of which are severely disabled. This means that nearly one-fifth of the U.S. public faces some physi- cal, mental, sensory, or other functional impairment (18 percent in 2002).3 Because a library’s mandate is to make its resources accessible to everyone, it is important to attend to the special challenges faced by patrons with disabilities and to offer appropriate services with those special needs in mind. n Current U.S. regulations, standards, and guidelines In 1990 Congress enacted the Americans with Disabilities Act (ADA), the first comprehensive legislation mandating equal treatment under the law for PWD. The ADA pro- hibits discrimination against PWD in employment, public services, public accommodations, and in telecommunica- tions. Title II of the ADA mandates that all state govern- ments, local governments, and public agencies provide access for PWD to all of their activities, services, and programs. Since school, public, and academic libraries are under the purview of Title II, they must “furnish auxiliary aids and services when necessary to ensure effective com- munication.”4 Though predating widespread use of the Internet, the law’s intent points toward the adoption and adaptation of appropriate technologies to allow persons with a variety of disabilities to access electronic resources in a way that is most effective for them. Changes to Section 508 of the 1973 Rehabilitation Act enacted in 1998 and 2000 introduced the first standards for “accessible information technology recognized by the federal government.”5 Many state and local govern- ments have since passed laws applying the standards of Section 508 to government agencies and related services. According to the Access Board, the independent federal agency charged with assuring compliance with a variety of laws regarding services to PWD, information and com- munication technology (ICT) includes any equipment or interconnected system or subsystem of equipment, that is used in the creation, conversion, or duplication of data or information. The term electronic R. todd vandenbark (todd.vandenbark@utah.edu) is Web Ser- vices Librarian, Eccles health Sciences Library, University of Utah, Salt Lake City. 24 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 and information technology includes, but is not limited to, telecommunications products (such as telephones), information kiosks and transaction machines, World Wide Web sites, multimedia, and office equipment such as copiers and fax machines.6 The Access Board further specifies guidelines for “Web-based intranet and internet information and appli- cations,” which are directly relevant to the provision of such services in libraries.7 What follows is a detailed examination of these standards with examples to assist in understanding and implementation. (a) A text equivalent for every non-text element shall be provided. Assistive technology cannot yet describe what pictures and other images look like; they require meaningful text-based information asso- ciated with each picture. 
If an image directs the user to do something, the associated text must explain the purpose and meaning of the image. This way, someone who cannot see the screen can understand and navigate the page success- fully. This is generally accomplished by using the “alt” and “longdesc” attributes for images: <img src=“image.jpg” alt=“Short description of image.” longdesc=“explanation.txt” />. However, these aids also can clutter a page when not used properly. The current versions of the most popular screen-reader software do not limit the amount of “alt” text they can read. However, Freedom Scientific’s JAWS 6.x divides the “alt” attribute into distinct chunks of 125 characters each (excluding spaces) and reads them separately as if they were separate graphics.8 This can be confusing to the end user. Longer con- tent can be put into a separate text file and the file linked to using the “longdesc” attribute. When a page contains audio or video files, a text alternative needs to be provided. For audio files such as inter- views, lectures, and podcasts, a link to a transcript of the audio file must be immediately available. For video clips such as those on YouTube, captions must accompany the clip. (b) Equivalent alternatives for any multimedia presen- tation shall be synchronized with the presentation. This means that captions for video must be real-time and synchronized with the actions in the video, not contained solely in a separate transcript. (c) Web pages shall be designed so that all informa- tion conveyed with color is also available with- out color, for example from context or markup. While color can be used, it cannot be the sole source or indicator of information. Imagine an edu- cational website offering a story problem presented in black and green print, and the answer to the problem could be deciphered using only the green letters. This would be inaccessible to students who have certain forms of color-blindness as well as those who use screen-reader software. (d) Documents shall be organized so they are read- able without requiring an associated style sheet. The introduction of cascading style sheets (CSS) can improve accessibility because they allow the separation of presentation from content. However, not all browsers fully support CSS, so webpages need to be designed so any browser can read them accurately. The content needs to be organized so that it can be read and understood with CSS for- matting turned off. (e) Redundant text links shall be provided for each active region of a server-side image map, and (f) Client-side image maps shall be provided instead of server-side image maps except where the regions cannot be defined with an available geometric shape. An image map can be thought of as a geometri- cally defined and arranged group of links to other content on a site. A clickable map of the fifty U.S. states is an example of a functioning image map. A server-side image map would appear to a screen reader only as a set of coordinates, whereas client- side maps can include information about where the link leads through “alt” text. The best practice is to only use client-side image maps and make sure the “alt” text is descriptive and meaningful. (g) Row and column headers shall be identified for data tables, and (h) Markup shall be used to associate data cells and header cells for data tables that have two or more logical levels of row or column headers. Correct table coding is critical. 
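Returning to items (e) and (f), a client-side image map carries its link descriptions in the markup itself. The fragment below is a hedged illustration rather than a prescription; the image name, map name, coordinates, and link targets are invented placeholders.

  <img src="campus-libraries.png" alt="Map of campus library locations" usemap="#libmap" />
  <map name="libmap">
    <!-- Each clickable region is defined in the page itself and has its
         own alt text, so a screen reader can announce where it leads -->
    <area shape="rect" coords="10,10,120,80" href="main-library.html" alt="Main Library" />
    <area shape="circle" coords="200,60,40" href="health-sciences.html" alt="Health Sciences Library" />
    <area shape="poly" coords="260,20,320,40,300,90,240,80" href="law-library.html" alt="Law Library" />
  </map>

Because the regions and their descriptions travel with the page, no server-side coordinate lookup is needed and each hotspot remains meaningful to assistive technology.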
Each table should use the “table summary” attribute to provide a meaningful description of its content and arrange- ment: <table summary=“Concise explanation belongs here.”>. Headers should be coded using the table header (“th”) tag, and its “scope” attri- bute should specify whether the header applies to a row or a column: <th scope=“col”> or <th scope=“row”>. If the table’s content is complex, it may be necessary to provide an alternative presen- tation of the information. It is best to rely on CSS for page layout, taking into consideration the direc- tions in subparagraph (d) above. (i) Frames shall be titled with text that facili- tates frame identification and navigation. Frames are a deprecated feature of HTML, and their use should be avoided in favor of CSS layout. (j) Pages shall be designed to avoid caus- ing the screen to flicker with a frequency greater than 2 Hz and lower than 55 Hz. Lights with flicker rates in this range can trigger epileptic seizures. Blinking or flashing elements on teNDiNG A wilD GARDeN: liBRARY weB DesiGN FOR PeRsONs witH DisABilities | vANDeNBARK 25 a webpage should be avoided until browsers pro- vide the user with the ability to control flickering. (k) A text-only page, with equivalent information or functionality, shall be provided to make a Web site comply with the provisions of this part, when compliance cannot be accomplished any other way. The content of the text-only page shall be updated whenever the primary page changes. Complex content that is entirely visual in nature may require a separate text-only page, such as a page showing the English alphabet in American Sign Language. This requirement also serves as a stopgap measure for existing sites that require reworking for accessibility. Some consider this to be the Web’s version of separate-but-equal ser- vices, and should be avoided.9 Offering a text-only alternative site can increase the sense of exclusion that PWD already feel. Also, such versions of a website tend not to be equivalent to the parent site, leaving out promotions or advertisements. Finally, a text-only version increases the workload of Web development staff, making them more costly than creating a single, fully accessible site in the first place. (l) When pages utilize scripting languages to display content, or to create interface elements, the informa- tion provided by the script shall be identified with functional text that can be read by assistive technology. Scripting languages such as JavaScript allow for more interactive content on a page while reducing the number of times the computer screen needs to be refreshed. If functional text is not available, the screen reader attempts to read the script’s code, which outputs as a meaningless jumble of charac- ters. Using redundant text links avoids this result. (m) When a Web page requires that an applet, plug-in, or other application be present on the client system to interpret page content, the page must provide a link to a plug-in or applet that complies with [Subpart B: technical standards] §1194.22(a) through (i). Web developers need to ascertain whether a given plug-in or applet is accessible before requiring their webpage’s visitors to use it. When using applications such as QuickTime or RealAudio, it is important to provide an accessible link on the same page that will allow users to install the necessary plug-in. 
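Before moving to forms, the markup below sketches the image and table guidance in items (a), (g), and (h). It is illustrative only; the file names, the chart description, and the hours data are invented for the example.

  <!-- Image: short alt text plus a longdesc link to a fuller description -->
  <img src="gate-count-chart.png"
       alt="Bar chart of monthly gate counts for 2009"
       longdesc="gate-count-description.txt" />

  <!-- Data table: a summary, a caption, and scoped th headers let screen
       readers announce the row and column to which each cell belongs -->
  <table summary="Service desk hours by day of week">
    <caption>Service Desk Hours</caption>
    <tr>
      <th scope="col">Day</th>
      <th scope="col">Opens</th>
      <th scope="col">Closes</th>
    </tr>
    <tr>
      <th scope="row">Monday</th>
      <td>8:00 a.m.</td>
      <td>9:00 p.m.</td>
    </tr>
    <tr>
      <th scope="row">Saturday</th>
      <td>10:00 a.m.</td>
      <td>6:00 p.m.</td>
    </tr>
  </table>

If a table is used purely for page layout rather than for data, the summary and header markup should be omitted so that screen readers do not announce structure that is not there.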
(n) When electronic forms are designed to be completed on-line, the form shall allow people using assistive technology to access information, field elements, and functionality required for completion and submis- sion of the form, including all directions and cues. If scripts used in the completion of the form are inaccessible, an alternative method of completing the form must be made immediately available. Each element of a form needs to be labeled prop- erly using the <label> tag. (o) A method shall be provided that per- mits users to skip repetitive navigation links. Persons using screen reader software typically navigate through pages using the Tab key, listen- ing as the text is read aloud. Websites commonly place their logo at the top of each page and make this graphic a link to the site’s homepage. Many sites also use a line of graphic images just beneath this logo on every page to serve as a navigation bar. To avoid having to listen through this same list of links on every page just to get to the page’s content, a “skip to content” link as the first option at the top of each page provides a simple solution to this problem. (p) When a timed response is required, the user shall be alerted and given sufficient time to indicate more time is required. Some sites log a user off if they have not typed or otherwise interacted with the page after a certain time period. Users must be notified in advance that this is going to happen and given sufficient time to respond and request more time as needed. n Standards-setting groups and their work One organization that seeks to move Internet tech- nology beyond basic Section 508 compliance is the Web Accessibility Initiative (WAI) of the World Wide Web Consortium (W3C). The mission of the WAI is to develop n guidelines that are widely regarded as the interna- tional standard for Web accessibility; n support materials to help understand and imple- ment Web accessibility; and n resources through international collaboration.10 The W3C published its first Web Content Accessibility Guidelines (WCAG 1.0) in May of 1999 for making online content accessible to PWD. By following these guidelines, developers create Web content that is readily available to every user regardless of the way it’s accessed. The WAI provides ten quick tips for improving accessibility in website design: n Images and animations. Use the “alt” attribute to describe the function of each visual. n Image maps. Use the client-side map and text for hotspots. n Multimedia. Provide captioning and transcripts of audio, and descriptions of video. 26 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 n Hypertext links. Use text that makes sense when read out of context. For example, avoid “click here.” n Page organization. Use headings, lists, and consis- tent structure. Use CSS for layout and style where possible. n Graphs and charts. Summarize or use the “longdesc” attribute. n Scripts, applets, and plug-ins. Provide alternative content in case active features are inaccessible or unsupported. n Frames. Use the “noframes” element and meaning- ful titles. n Tables. Make line-by-line reading sensible. Summarize. n Check your work. Validate. Use tools, checklist, and guidelines at http://www.w3.org/TR/WCAG.11 Many libraries and other organizations have sought to follow WCAG 1.0 since it was published. 
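Items (n) and (o) can be combined in a single page fragment. In the sketch below, which is again illustrative and uses invented id values, link targets, and field names, a "skip to content" link is the first focusable element on the page, and each form control is explicitly associated with its prompt through a label element.

  <body>
    <!-- The first link on every page lets keyboard and screen-reader
         users jump past the repeated logo and navigation bar -->
    <a href="#content">Skip to content</a>

    <div id="navigation">
      <!-- site-wide navigation links repeated on every page -->
    </div>

    <div id="content">
      <form action="renew.cgi" method="post">
        <!-- An explicit label announces the purpose of each field -->
        <label for="barcode">Library card number:</label>
        <input type="text" id="barcode" name="barcode" />

        <label for="pin">PIN:</label>
        <input type="password" id="pin" name="pin" />

        <input type="submit" value="Renew items" />
      </form>
    </div>
  </body>

If scripted features are later attached to such a form, item (n)'s requirement for an immediately available accessible alternative still applies.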
Recently, the W3C updated their standards to WCAG 2.0, and the WAI website offers an overview of these guidelines along with a “customizable quick reference” designed to facilitate successful compliance. The principles behind 2.0 can be summarized by the acronym P.O.U.R. Perceivable n Provide text alternatives for non-text content. n Provide captions and alternatives for multimedia. n Make information adaptable and available to assistive technologies. n Use sufficient contrast to make things easy to see and hear. Operable n Make all functionality keyboard accessible. n Give users enough time to read and use content. n Do not use content known to cause seizures. n Help users navigate and find content. Understandable n Make text readable and understandable. n Make content appear and operate in predictable ways. n Help users avoid and correct mistakes. Robust n Maximize compatibility with current and future technologies.12 These guidelines offer assistance in creating acces- sible Web-based materials. Given their breadth, however, they raise concerns of overly wide interpretation and the strong possibility of falling short of Section 508 standards. Reading the details in WCAG 2.0 does not give any additional assistance to library Web developers on how to create a Section 508–compliant website. Clark points out that the three WCAG 2.0 documents are long (72–165 pages), confusing, and sometimes internally contradic- tory.13 The goal of a library webmaster is to provide an interface (website, OPAC, database, and so on) that is both cutting-edge and accessible, and to encourage its use by patrons of all ability levels. While they have outlined a helpful rationale, the W3C’s overlong guidelines do little to help library Web developers to achieve this goal. n Recommendations Libraries today typically offer three types of Web-based resources: (1) access to the Internet, (2) access to subscrip- tion databases, and (3) a library’s own webpage, all of which need to be accessible to PWD. Libraries trying to comply with Section 508 are required to “furnish auxil- iary aids and services when necessary to ensure effective communication.”14 There are a number of options avail- able to libraries on tight budgets. The first set involves the features built into each computer’s operating sys- tem and software. For some users with visual impair- ments, enlarging the font size of text and images on the screen will make electronic content more accessible. Both Macintosh and Windows system software have universal-access capabilities built in, including the ability to read aloud text that is on the screen using synthesized speech. The Mac read-aloud tool is called Voice Over; the Windows read-aloud tool is called Narrator. Both systems allow for screen magnification. Exploring and learning the capabilities of these systems to enhance accessibility is a free and easy first step for any library’s technology offerings, regardless of funding restrictions. Libraries with more substantial technology budgets have a wide variety of hardware and software options to choose from to meet the needs of PWD. For patrons with visual impairments, several software packages are available to read aloud the content of a website or other electronic document using synthesized speech. JAWS by Freedom Scientific and WindowEyes by GW Micro are two of the best-known software packages, and both include the ability to output to a refreshable Braille dis- play (which both companies also sell). 
Kurzweil 3000 is an education-oriented software package that not only reads on-screen text aloud but has a wealth of additional tools to assist students with learning difficulties such as attention deficit disorder or dyslexia. It is designed to integrate with any education package as well as to assist students whose primary language is not English. Persons with low vision needing screen magnification beyond the features Windows offers may look to Magic by Freedom Scientific or ZoomText by Ai Squared. Some of these teNDiNG A wilD GARDeN: liBRARY weB DesiGN FOR PeRsONs witH DisABilities | vANDeNBARK 27 software companies offer free trial versions, have online demonstrations, or both. Because prices for this software and related equipment can be high, it is prudent to first check with patrons with visual impairments and profes- sionals in the field prior to making your purchase. Humbert and Stores, members of Indiana University’s Web Accessibility Team, offer accessibility evaluations of websites and other services at the university. When asked to compare Windows and Macintosh systems as to their usefulness in assisting PWD with Web-based media, Humbert rated the Windows operating system superior, explaining that it has the proper “handles” coded into its software for screen readers and assistive technologies to grab onto. Assistive technology software is more stable in Windows Vista because its predecessor, Windows XP, “used hacked together drivers to display the informa- tion.”15 Humbert discourages the use of Vista and JAWS on an older machine because Vista is a memory hog and can crash JAWS along with the rest of the system. The Web browsers Internet Explorer and Firefox allow the user to enlarge text and images on a webpage, though Firefox is more effective. Text can be enlarged only if the webpage being viewed is designed using resizable fonts. Stores, who is profoundly visually impaired, uses JAWS screen-reader software to work and to surf the Web. She notes that both browsers work equally well with screen- reader software.16 An important Web-based resource that libraries pro- vide is subscription databases. However, as one study has shown, “most librarians lack the time, resources and/or skills to evaluate the degree to which their library subscription databases are accessible to their disability communities.”17 The question is do the vendors them- selves make an effort to produce an accessible product? A 2007 survey of twelve major database companies found that while most “have integrated accessibility standards/ guidelines into their search interfaces and/or plan to improve accessibility in future releases,” only five actu- ally conducted usability studies with people who use assistive technology. A number of studies have found that “while most databases are functionally accessible, com- panies need to do more to meet the needs of the disability community and assure librarians of the accessibility of their products.”18 Subscription databases can be inaccessible to PWD in the display of search results and accompanying infor- mation. The three most common forms of results deliv- ery are HTML full text, HTML full text with graphics, and PDF files. PDF files are notoriously inaccessible to persons using screen readers. While Adobe has made significant strides in rendering PDFs accessible, many databases contain numerous PDF documents created in versions of Adobe Acrobat prior to version 5.0 (released in 2001), which are not properly tagged for screen read- ers. 
Even newer PDF documents are only as accessible as their tagging allows. Journal articles received from publishers may or may not be properly tagged, so data- base companies cannot guarantee that their content is fully accessible. One vendor that is avoiding this trap is JSTOR. Using optical character recognition (OCR) soft- ware, JSTOR delivers image-based PDFs with embedded text to make their content available to screen readers.19 Librarians must insist that database packages be acces- sible and compatible with the forms of assistive technol- ogy most frequently used by their patrons, both in-house and online. One tool used to evaluate database (or other prod- uct) accessibility is the Voluntary Product Accessibility Template (VPAT). Created in partnership between the Information Technology Industry (ITI) Council and the U.S. General Services Administration (GSA) in 2001, it provides “a simple, Internet-based tool to assist Federal contracting and procurement officials in fulfilling the new market research requirements contained in the Section 508 implementing regulations.”20 VPAT is a voluntary disclosure form arranged in a series of tables listing the criteria of relevant subsections of Section 508 discussed previously. Blank cells are provided to allow company representatives to describe how their product’s support- ing features meet the criteria and to provide additional detailed information. Library personnel can request that vendors complete this form to document which sub- sections of Section 508 their products meet, and how. To be most useful, the form needs to be completed by company representatives with both a clear understand- ing of Section 508 and its technical details and thorough knowledge of their product. Knowledgeable library staff are encouraged to verify the quality and accuracy of the information provided before purchasing. Like databases, a library’s website needs to be acces- sible to patrons with a variety of needs. According to Muncaster, accessible sites are 35 percent easier for every- one to use and are more likely to be found by Internet search engines.21 Fully accessible websites are simpler to maintain and are on average 50 percent smaller than inaccessible ones, which means they download faster, making them easier to use.22 In creating a basic site, cur- rent best practice has been to render the content in HTML or XHTML and design the layout using CSS. This way, if it is discovered the site’s pages are not fully accessible, a simple change to the CSS updates all pages, saving the site manager time and effort. Finally, creating an acces- sible site from the beginning is substantially easier than retrofitting an old one. A complete rebuild of a library website is an opportu- nity to improve accessibility. Reynolds’ article on creating a user-centered website for the Johnson County (Kans.) Library offers an example of how libraries can apply basic information architecture design principles on a budget. Johnson County focused on simple, low-budget 28 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 usability studies involving patrons in the selection of site navigation categories, designing the layout, and testing the resulting user interface. By involving average users in this process, this library was able to achieve substantial improvements in the site’s usability. Prior to the redesign, usability testing determined that 42 percent of users were not successful in finding information on the library’s old site. 
After the redesign, “only 4% of patrons were unsuccessful in finding core-task information on the first attempt.”23 Even so, a quick test of the site with the online accessibility evaluation tool CynthiaSays indicates that it still does not fully meet the requirements of Section 508. Had the library’s staff included PWD in their process, the demonstrated degree of improvement might have allowed them to meet and possibly exceed this standard. An understanding of how a person with disabilities experiences the online environment can help point the way toward improved accessibility. A recent study in the United Kingdom tracked the eye movements of able- bodied computer users in an effort to answer these ques- tions. Researchers asked eighteen people with normal or corrected vision to search for answers on two versions of a BBC website—the standard graphical page and the text- only version. Subjects’ eyes tended to dart around the standard page “as they attempt to locate what appears visually to be the next most likely location”24 for the answer. But in searching the text-only page, subjects went line-by-line, making smaller jumps across each page. Researchers determined that the webpage and its layout serve as a form of external memory, providing visual cues to the structure of its content and how to navigate it. If the Internet is an information superhighway, then the layout of a standard webpage serves as the borders and directional signs for browsing. The visual cues and navigation aids inherent in cur- rent webpages’ layouts provide no auditory equivalent for presentation to people with visual impairments. Information seeking on the Web is a complex process requiring “the ability to switch and coordinate among multiple information-seeking strategies” such as brows- ing, scanning, query-based searching, and so on.25 If Web browsers could translate formatting and presentation into audio tailored to the needs of the visually impaired, the use of the Internet would be a far more satisfying experi- ence for those users. However, such Web programming would require years of additional research and develop- ment. In the meantime, Web librarians must strive to build sites that are clean, hierarchical, and usable by all persons by following to the standards and guidelines currently available. One way to enhance the accessibility of sites is to fol- low a database-driven Web development model. In addi- tion to using XHTML and CSS, Dunlap recommends that content be stored in a relational database such as MySQL and that a coding language such as PHP be used to create pages dynamically. This approach has two advantages. First, it allows for the creation of “a flexible website design style that lives in a single, easily modified file that controls the presentation of every Web page of the site.”26 Second, it requires far less time for site maintenance, freeing staff to devote time to assuring accessibility while accommodating changes in Web technology. Such a model can be used by database vendors to ensure that their services can seamlessly integrate with the library’s online content. Use of mobile phones and similar devices to browse the Web is at an all-time high, and content providers are eager to make their sites mobile-friendly. Many of these end users experience similar barriers to accessing this content as PWD do. For example, persons with some motor disabilities as well as mobile phones with only a numeric keypad cannot access sites with links requiring the use of a mouse. 
Sites that follow either the W3C's Mobile Web Best Practices (MWBP) or WCAG are well on their way to meeting both standards.27 By properly associating labels with their controls, Internet content can be made fully accessible to both groups of end users. Understanding the similarities between MWBP and WCAG can lead to website design that is truly perceivable, operable, understandable, and robust.

Summary

Librarians with responsibility for Web design and technology management operate in an evolving environment. Legal requirements make clear the expectation to serve the wide variety of needs of patrons with disabilities. Yet the guidelines and standards available to assist in this venture range from complex to vague and insufficient. Assistive technologies continue to improve, with many traditional vendors confident that their products are accessible. In actual use, however, substantial challenges and shortcomings remain. The challenge for technology librarians is to be proactive in keeping abreast of technological advances, to experiment and learn from their efforts, and to continually update and adapt to provide Web or hypermedia information and services to patrons of all kinds.

References

1. United Nations, Convention on the Rights of Persons with Disabilities, 2008, http://www.un.org/disabilities/default.asp?navid=12&pid=150 (accessed Aug. 10, 2009).
2. Ibid.
3. Erika Steinmetz, Americans with Disabilities (Washington, D.C.: U.S. Census Bureau, 2002).
4. U.S. Department of Justice, Civil Rights Division, Disability Rights Section, "Title II Highlights," Aug. 29, 2002, http://www.ada.gov/t2hlt95.htm (accessed July 26, 2008).
5. Marilyn Irwin, Resources and Services for People with Disabilities: Lesson 1b Transcript (Indianapolis: Indiana University at Indianapolis School of Library and Information Science, 2008): 10.
6. Ibid., 10.
7. 1998 Amendment to Section 508 of the Rehabilitation Act, Subpart B—Technical Standards, §1194.22, http://www.section508.gov/index.cfm?FuseAction=content&ID=12#Application (accessed Dec. 2, 2009).
8. Access IT, "How Long Can an 'Alt' Attribute Be?" University of Washington, 2008, http://www.washington.edu/accessit/articles?257 (accessed Dec. 12, 2008).
9. Matt May, "On 'Separate but Equal' Design," online posting, June 24, 2004, bestkungfu weblog, http://www.bestkungfu.com/archive/date/2004/06/on-separate-but-equal-design/ (accessed Dec. 18, 2008).
10. Web Accessibility Initiative, "WAI Mission and Organization," 2008, http://www.w3.org/WAI/about.html (accessed July 22, 2008).
11. Shawn Lawton Henry and Pasquale Popolizio, "WAI, Quick Tips to Make Accessible Web Sites," World Wide Web Consortium, Feb. 5, 2008, http://www.w3.org/WAI/quicktips/Overview.php (accessed Mar. 30, 2008).
12. Ben Caldwell et al., "Web Content Accessibility Guidelines (WCAG) 2.0," World Wide Web Consortium, Dec. 11, 2008, http://www.w3.org/TR/WCAG20/ (accessed July 27, 2008).
13. Joe Clark, "To Hell with WCAG 2," A List Apart no. 217 (May 26, 2006), http://www.alistapart.com/articles/tohellwithwcag2 (accessed July 25, 2008).
14. U.S. Department of Justice, "Title II Highlights."
15. Joseph A. Humbert and Mary Stores, Questions about New Software and Accessibility (Richmond, Ind., July 28, 2008).
16. Ibid.
17. S. L. Byerley, M. B. Chambers, and M. Thohira, "Accessibility of Web-Based Library Databases: The Vendors' Perspectives in 2007," Library Hi Tech 25, no. 4 (2007): 509–27.
18. Ibid.
19. P. Muncaster, "Poor Accessibility Has a Price," VNU Net, Feb. 9, 2006, http://www.vnunet.com/articles/send/2150099 (accessed July 27, 2008).
20. Information Technology Industry Council, "FAQ: Voluntary Product Accessibility Template (VPAT)," http://www.itic.org/archives/articles/20040506/faq_voluntary_product_accessibility_template_vpat.php (accessed July 29, 2008).
21. Muncaster, "Poor Accessibility Has a Price."
22. Isaac Hunter Dunlap, "How Database-Driven Web Sites Enhance Accessibility," Library Hi Tech 23, no. 8 (2008): 34–38.
23. Erica Reynolds, "The Secret to Patron-Centered Web Design: Cheap, Easy, and Powerful Usability Techniques," Computers in Libraries 28, no. 6 (2008): 6–47.
24. Caroline Jay et al., "How People Use Presentation to Search for a Link: Expanding the Understanding of Accessibility on the Web," Universal Access in the Information Society 6, no. 3 (2006): 307–20.
25. C. Kouroupetroglou, M. Salampasis, and A. Manitsaris, "Browsing Shortcuts as a Means to Improve Information Seeking of Blind People in the WWW," Universal Access in the Information Society 6, no. 3 (2007): 11.
26. Dunlap, "How Database-Driven Web Sites Enhance Accessibility."
27. Web Accessibility Initiative, "Mobile Web Best Practices 1.0," July 29, 2008, http://www.w3.org/TR/mobile-bp (accessed Aug. 10, 2009).

The Path toward Global Interoperability in Cataloging
Ilana Tolkoff

Ilana Tolkoff (ilana.tolkoff@gmail.com) holds a BA in music and Italian from Vassar College, an MA in musicology from Brandeis University, and an MLS from the University at Buffalo. She is currently seeking employment as a music librarian.

Libraries began in complete isolation with no uniformity of standards and have grown over time to be ever more interoperable. This paper examines the current steps toward the goal of universal interoperability. These projects aim to reconcile linguistic and organizational obstacles, with a particular focus on subject headings, name authorities, and titles.

In classical and medieval times, library catalogs were completely isolated from each other and idiosyncratic. Since then, there has been a trend to move toward greater interoperability. We have not yet attained this international standardization in cataloging, and there are currently many challenges that stand in the way of this goal. This paper will examine the teleological evolution of cataloging and analyze the obstacles that stand in the way of complete interoperability, how they may be overcome, and which may remain. This paper will not provide a comprehensive list of all issues pertaining to interoperability; rather, it will attempt to shed light on those issues most salient to the discussion.

Unlike the libraries we are familiar with today, medieval libraries worked in near total isolation. Most were maintained by monks in monasteries, and any regulations in cataloging practice were established by each religious order. One reason for their lack of regulations was that their collections were small by our standards; a monastic library had at most a few hundred volumes (a couple thousand in some very rare cases). The "armarius," or librarian, kept more of an inventory than an actual catalog, along with the inventories of all other valuable possessions of the monastery. There were no standard rules for this inventory-keeping, although the armarius usually wrote down the author and title, or incipit if there was no author or title. Some of these inventories also contained bibliographic descriptions, which most often described the physical book rather than its contents.
The inventories were usually taken according to the shelf organization, which was occasionally based on subject, like most librar- ies are today. These trends in medieval cataloging varied widely from library to library, and their inventories were entirely different from our modern OPACs. The inventory did not provide users access to the materials. Instead, the user consulted the armarius, who usually knew the col- lection by heart. This was a reasonable request given the small size of the collections.1 This type of nonstandardized cataloging remained relatively unchanged until the nineteenth century, when Charles C. Jewett introduced the idea of a union catalog. Jewett also proposed having stereotype plates for each bibliographic record, rather than a book catalog, because this could reduce costs, create uniformity, and organize records alphabetically. This was the precursor to the twentieth-century card catalog. While many of Jewett’s ideas were not actually practiced during his lifetime, they laid the foundation for later cataloging practices.2 The twentieth century brought a great revolution in cataloging standards, particularly in the United States. In 1914, the Library of Congress Subject Headings (LCSH) were first published and introduced a controlled vocabu- lary to American cataloging. The 1960s saw a wide array of advancements in standardization. The Library of Congress (LC) developed MARC, which became a national standard in 1973. It also was the time of the cre- ation of Anglo-American Cataloguing Rules (AACR), the Paris Principles, and International Standard Bibliographic Description (ISBD). While many of these standardization projects were uniquely American or British phenomena, they quickly spread to other parts of the world, often in translated versions.3 While the technology did not yet exist in the 1970s to provide widespread local online catalogs, technology did allow for union catalogs containing the records of many libraries in a single database. These union catalogs included the Research Libraries Information Network (RLIN), the OCLC Online Computer Library Center (OCLC), and the Western Library Network (WLN). In the 1980s the local online public access catalog (OPAC) emerged, and in the 1990s OPACs migrated to the Web (WebPACs).4 Currently, most libraries have OPACs and are members of OCLC, the largest union catalog, used by more than 71,000 libraries in 112 countries and ter- ritories.5 Now that most of the world’s libraries are on OCLC, librarians face the challenge and inconvenience of dis- crepancies in cataloging practice due to the differing stan- dards of diverse countries, languages, and alphabets. The fields of language engineering and linguistics are work- ing on various language translation and analysis tools. Some of these include machine translation; ontology, or the hierarchical organization of concepts; information extraction, which deciphers conceptual information from unorganized information, such as that on the Web; text summarization, in which computers create a short sum- mary from a long piece of text; and speech processing, which is the computer analysis of human speech.6 While these are all exciting advances in information technol- ogy, as of yet they are not intelligent enough to help us establish cataloging interoperability. 
It will be interesting to see whether language engineering tools will be capable of helping catalogers in the future, but for now they are best at making sense of unstructured information, such as the Web. The interoperability of library catalogs, which consist of highly structured information, must be tackled through software that innovative librarians of the future will produce.

In an ideal world, OCLC would be smoothly interoperable at a global level. A single thesaurus of subject headings would have translations in every language. There would be just one set of authority files. All manifestations of a single work would be grouped under the same title, translatable to all languages. There would be a single bibliographic record for a single work, rather than multiple bibliographic records in different languages for the same work. This single bibliographic record could be translatable into any language, so that when searching in WorldCat, one could change the settings to any language to retrieve records that would display in that chosen language. When catalogers contribute to OCLC, they would create the records in their respective languages, and once in the database the records would be translatable to any other language. Because records would be so fluidly translatable, an OPAC could be searched in any language. For example, the default settings for the University at Buffalo's OPAC could be English, but patrons could change those settings to accommodate the great variety of international students doing research. This vision is utopian to say the least, and it is doubtful that we will ever reach this point. But it is valuable to establish an ideal scenario to aim our innovation in the right direction.

One major obstacle in the way of global interoperability is the existence of different alphabets and the inherently imperfect nature of transliteration. There are essentially two types of transliteration schemes: those based on phonetic structure and those based on morphemic structure. The danger of phonetic transliteration, which mimics pronunciation, is that semantics often get lost. It fails to differentiate between homographs (words that are spelled and pronounced the same way but have different meanings). Complications also arise when there are differences between careful and casual styles of speech. Park asserts, "When catalogers transcribe words according to pronunciation, they can create inconsistent and arbitrary records."7 Morphemic transliteration, on the other hand, is based on the meanings of morphemes, and sometimes ends up being very different from the pronunciation in the source language. One advantage to this, however, is that it requires fewer diacritics than phonetic transliteration. Park, whose primary focus is on Korean-Roman transliteration, argues that the McCune-Reischauer phonetic transliteration that libraries use loses too much of the original meaning. In other alphabets, however, phonetic transliteration may be more beneficial, as in the LC's recent switch to Pinyin transliteration in Chinese. The LC found Pinyin to be more easily searchable than Wade-Giles or monosyllabic Pinyin, which are both morphemic.
However, another problem with translit- eration that neither phonetic nor morphemic schemes can solve is word segmentation—how a transliterated word is divided. This becomes problematic when there are no contextual clues, such as in a bibliographic record.8 Other obstacles that stand in the way of interoperabil- ity are the diverse systems of subject headings, author- ity headings, and titles found internationally. Resource Description and Access (RDA) will not deal with subject headings because it is such a hefty task, so it is unlikely that subject headings will become globally interoperable in the near future.9 Fortunately, twenty-four national libraries of English speaking countries use LCSH, and twelve non-English-speaking countries use a translated or modified version of LCSH. This still leaves many more countries that use their own systems of subject headings, which ultimately need to be made interoperable. Even within a single language, subject headings can be compli- cated and inconsistent because they can be expressed as a single noun, compound noun, noun phrase, or inverted phrase; the problem becomes even greater when trying to translate these to other languages. Bennett, Lavoie, and O’Neill note that catalogers often assign different subject headings (and classifications) to different manifestations of the same work.10 That is, the record for the novel Gone with the Wind might have different subject headings than the record for the movie. This problem could poten- tially be resolved by the Functional Requirements for Bibliographic Records (FRBR), which will be discussed below. Translation is a difficult task, particularly in the con- text of strict cataloging rules. It is especially complicated to translate among unrelated languages, where one might be syntactic and the other inflectional. This means that there are discrepancies in the use of prepositions, con- junctions, articles, and inflections. The ability to add or remove terms in translation creates endless variations. A single concept can be expressed in a morpheme, a word, a phrase, or a clause, depending on the language. There also are cultural differences that are reflected in differ- ent languages. Park gives the example of how Anglo- American culture often names buildings and brand names after people, reflecting our culture’s values of individualism, while in Korea this phenomenon does not exist at all. On the other hand, Korean’s use of formal and informal inflections reflects their collectivist hierarchical culture. Another concept that does not cross cultural lines is the Korean pumasi system in which family and friends help someone in a time of need with the understanding that the favor will be returned when they need it. This cannot be translated into a single English word, phrase, or subject heading. One way of resolving ambiguity in translations is through modifiers or scope notes, but this is only a partial solution.11 Because translation and transliteration are so difficult, 32 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 as well as labor-intensive, the current trend is to link already existing systems. Multilingual Access to Subjects (MACS) is one such linking project that aims to link subject headings in English, French, and German. It is a joint project under the Conference of European National Librarians among the Swiss National Library, the Bibliothèque nationale de France (BnF), the British Library (BL), and Die Deutsche Bibliothek (DDB). 
It aims to link the English LCSH, the French Répertoire d’autorité matière encyclopédique et alphabétique unifié (RAMEAU), and the German Schlagwortnormdatei/ Regeln für den Schlagwortkatalog (SWD/RSWK). This requires manually analyzing and matching the concepts in each heading. If there is no conceptual equivalent, then it simply stands alone. MACS can link between headings and strings or even create new headings for linking pur- poses. This is not as fruitful as it sounds, however, as there are fewer correspondences than one might expect. The MACS team experimented with finding correspondences by choosing two topics: sports, which was expected to have a particularly high number of correspondences, and theater, which was expected to have a particularly low number of correspondences. Of the 278 sports head- ings, 86 percent matched in all three languages, 8 percent matched in two, and 6 percent was unmatched. Of the 261 theater headings, 60 percent matched in three lan- guages, 18 percent matched in two, and 22 percent was unmatched.12 Even in the most cross-cultural subject of sports, 14 percent of terms did not correspond fully, mak- ing one wonder whether linking will work well enough to prevail. A similar project—the Virtual International Authority File (VIAF)—is being undertaken for authority headings, a joint project of the LC, the BnF, and DDB, and now including several other national libraries. VIAF aims to link (not consolidate) existing authority files, and its beta version (available at http://viaf.org) allows one to search by name, preferred name, or title. OCLC’s software mines these authority files and the titles associated with them for language, LC control number, LC classifica- tion, usage, title, publisher, place of publication, date of publication, material type, and authors. It then derives a new enhanced authority record, which facilitates map- ping among authority records in all of VIAF’s languages. These derived authority records are stored on OAI serv- ers, where they are maintained and can be accessed by users. Users can search VIAF by a single national library or broaden their possibilities by searching all participat- ing national libraries. As of 2006, between the LC’s and DDB’s authority files, there were 558,618 matches, includ- ing 70,797 complex matches (one-to-many), and 487,821 unique matches (one-to-one) out of 4,187,973 LC names and 2,659,276 DDB names. Ultimately, VIAF could be used for still more languages, including non-Roman alphabets.13 Recently the National Library of Israel has joined, and VIAF can link to the Hebrew alphabet. A similar project to VIAF that also aimed to link authority files was Linking and Exploring Authority Files (LEAF), which was under the auspices of the Information Society Technologies Programme of the Fifth Framework of the European Commission. The three-year project began in 2001 with dozens of libraries and organizations (many of which are national libraries), representing eight languages. Its website describes the project as follows: Information which is retrieved as a result of a query will be stored in a pan-European “Central Name Authority File.” This file will grow with each query and at the same time will reflect what data records are rel- evant to the LEAF users. Libraries and archives want- ing to improve authority information will thus be able to prioritise their editing work. 
Registered users will be able to post annotations to particular data records in the LEAF system, to search for annotations, and to download records in various formats.14 Park identifies two main problems with linking authority files. One is that name authorities still contain some language-specific features. The other is that disam- biguation can vary among name authority systems (e.g., birth/death dates, corporate qualifiers, and profession/ activity). These are the challenges that projects like LEAF and VIAF must overcome. While the linking of subject headings and name authorities is still experimental and imperfect, the FRBR model for linking titles is much more promising and will be incorporated in the soon-to-be-released RDA. According to Bennett, Lavoie, and O’Neill, there are three important benefits to FRBR: (1) it allows for different views of a bibliographic database, (2) it creates a hierarchy of bibliographic entities in the catalog such that all versions of the same work fall into a single collapsible entry point, (3) and the confluence of the first two benefits makes the cata- log more efficient. In the FRBR model, the bibliographic record consists of four entities: (1) the work, (2) the expres- sion, (3) the manifestation, and (4) the item. All manifesta- tions of a single work are grouped together, allowing for a more economical use of information because the title needs to be entered only once.15 That is, a “title authority file” will exist much like a name authority file. This means that all editions in all languages and in all formats would be grouped under the same title. For example, the Lord of the Rings title would include all novels, films, translations, and editions in one grouping. This would reduce the number of bibliographic records, and as Danskin notes, “The idea of creating more records at a time when publishing output threatens to outstrip the cataloguing capacity of national bibliographic agencies is alarming.”16 The FRBR model is particularly beneficial for com- plex canonical works like the Bible. There are a small number of complex canonical works, but they take up a tHe PAtH tOwARD GlOBAl iNteROPeRABilitY iN cAtAlOGiNG | tOlKOFF 33 disproportionate number of holdings in OCLC.17 Because this only applies to a small number of works, it would not be difficult to implement, and there would be a disproportionate benefit in the long run. There is some uncertainty, however, in what constitutes a complex work and whether certain items should be grouped under the same title.18 For instance, should Prokofiev’s Romeo and Juliet be grouped with Shakespeare’s? The advantage of the FRBR model for titles over subject headings or name authorities is that no such thing as a title authority file exists (as conceptualized by FRBR). We would be able to start from scratch, creating such title authority files at the international level. Subject headings and name authori- ties, on the other hand, already exist in many different forms and languages so that cross-linking projects like VIAF might be our only option. It is encouraging to see the strides being made to make subject headings, name authority headings, and titles globally interoperable, but what about other access points within a record’s bibliographic description? These are usually in only one language, or two if cataloged in a bilingual country. Should these elements (format, contents, and so on) be cross-linked as well, and is this even possible? What should reasonably be considered an access point? 
Most people search by subject, author, or title, so perhaps it is not worth making other types of access points interoperable for the few occasions when they are useful. Yet if 100 percent universal interoperabil- ity is our ultimate utopian goal, perhaps we should not settle for anything less than true international access to all fields in a record. Because translation and transliteration are such com- plex undertakings, linking of extant files is the future of the field. There are advantages and disadvantages to this. On the one hand, linking these files is certainly bet- ter than having them exist only for their own countries. They are easily executed projects that would not require a total overhaul of the way things currently stand. The disadvantages are not to be ignored, however. The fact that files do not correspond perfectly from language to language means that many files will remain in isolation in the national library that created them. Another problem is that cross-linking is potentially more confusing to the user; the search results on http://www.viaf.org are not always simple and straightforward. If cross-linking is where we are headed, then we need to focus on a more user-friendly interface. If the ultimate goal of interoper- ability is simplification, then we need to actually simplify the way query results are organized rather than make them more confusing. Very soon RDA will be released and will bring us to a new level of interoperability. AACR2 arrived in 1978, and though it has been revised several times, it is in many ways outdated and mainly applies to books. RDA will bring something completely new to the table. It will be flexible enough to be used in other metadata schemes besides MARC, and it can even be used by different industries such as publishers, museums, and archives.19 Its incorporation of the FRBR model is exciting as well. Still, there are some practical problems in implementing RDA and FRBR, one of which is that reeducating librar- ians about the new rules will be costly and take time. Also, FRBR in its ideal form would require a major over- haul of the way OCLC and integrated library systems currently operate, so it will be interesting to see to what extent RDA will actually incorporate FRBR and how it will be practically implemented. Danskin asks, “Will the benefits of international co-operation outweigh the costs of effecting changes? Is the USA prepared to change its own practices, if necessary, to conform to European or wider IFLA standards?”20 It seems that the United States is in fact ready and willing to adopt FRBR, but to what extent is yet to be determined. What I have discussed in this paper are some of the more prominent international standardization projects, although there are countless others, such as EuroWordNet, the Open Language Archives Community (OLAC), and International Cataloguing Code (ICC), to name but a few.21 In general, the current major projects consist of linking subject headings, name authority files, and titles in multiple languages. Linking may not have the best cor- respondence rates, we have still not begun to tackle the cross-linking of other bibliographic elements, and at this point search results may be more confusing than help- ful. But the existence of these linking projects means we are at least headed in the right direction. The emergent universality of OCLC was our most recent step toward interoperability, and it looks as if cross-linking is our next step. Only time will tell what steps will follow. References 1. 
Lawrence S. Guthrie II, "An Overview of Medieval Library Cataloging," Cataloging & Classification Quarterly 15, no. 3 (1992): 93–100.
2. Lois Mai Chan and Theodora Hodges, Cataloging and Classification: An Introduction, 3rd ed. (Lanham, Md.: Scarecrow, 2007): 48.
3. Ibid., 6–8.
4. Ibid., 7–9.
5. OCLC, "About OCLC," http://www.oclc.org/us/en/about/default.htm (accessed Dec. 9, 2009).
6. Jung-Ran Park, "Cross-Lingual Name and Subject Access: Mechanisms and Challenges," Library Resources & Technical Services 51, no. 3 (2007): 181.
7. Ibid., 185.
8. Ibid.
9. Julie Renee Moore, "RDA: New Cataloging Rules, Coming Soon to a Library Near You!" Library Hi Tech News 23, no. 9 (2006): 12.
10. Rick Bennett, Brian F. Lavoie, and Edward T. O'Neill, "The Concept of a Work in WorldCat: An Application of FRBR," Library Collections, Acquisitions, & Technical Services 27, no. 1 (2003): 56.
11. Park, "Cross-Lingual Name and Subject Access."
12. Ibid.
13. Thomas B. Hickey, "Virtual International Authority File" (Microsoft PowerPoint presentation, ALA Annual Conference, New Orleans, June 2006), http://www.oclc.org/research/projects/viaf/ala2006c.ppt (accessed Dec. 9, 2009).
14. LEAF, "LEAF Project Consortium," http://www.crxnet.com/leaf/index.html (accessed Dec. 9, 2009).
15. Bennett, Lavoie, and O'Neill, "The Concept of a Work in WorldCat."
16. Alan Danskin, "Mature Consideration: Developing Bibliographic Standards and Maintaining Values," New Library World 105, no. 3/4 (2004): 114.
17. Ibid.
18. Bennett, Lavoie, and O'Neill, "The Concept of a Work in WorldCat."
19. Moore, "RDA."
20. Danskin, "Mature Consideration," 116.
21. Ibid.; Park, "Cross-Lingual Name and Subject Access."

Dublin Core, DSpace, and a Brief Analysis of Three University Repositories
Mary Kurtz

Mary Kurtz (mhkurtz@gmail.com) is a June 2009 graduate of Drexel University's School of Information Technology. She also holds a BS in Secondary Education from the University of Scranton and an MA in English from the University of Illinois at Urbana–Champaign. Currently, Kurtz volunteers her time in technical services/cataloging at Simms Library at Albuquerque Academy and in corporate archives at Lovelace Respiratory Research Institute (www.lrri.org), where she is using DSpace to manage a diverse collection of historical photographs and scientific publications.

This paper provides an overview of Dublin Core (DC) and DSpace together with an examination of the institutional repositories of three public research universities. The universities all use DC and DSpace to create and manage their repositories. I drew a sampling of records from each repository and examined them for metadata quality using the criteria of completeness, accuracy, and consistency. I also examined the quality of records with reference to the methods of educating repository users. One repository used librarians to oversee the archiving process, while the other two employed two different strategies as part of the self-archiving process. The librarian-overseen archive had the most complete and accurate records for DSpace entries.

The last quarter of the twentieth century has seen the birth, evolution, and explosive proliferation of a bewildering variety of new data types and formats. Digital text and images, audio and video files, spreadsheets, websites, interactive databases, RSS feeds, streaming live video, computer programs, and macros are merely a few examples of the kinds of data that can now be found on the Web and elsewhere. These new data forms do not always conform to conventional cataloging formats. In an attempt to bring some sort of order from chaos, the concept of metadata (literally "data about data") arose.
Metadata is, according to ALA, "structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities."1

Metadata is an attempt to capture the contextual information surrounding a datum. The enriching contextual information assists the data user in understanding how to use the original datum. Metadata also attempts to bridge the semantic gap between machine users of data and human users of the same data.

Dublin Core

Dublin Core (DC) is a metadata schema that arose from an invitational workshop sponsored by the Online Computer Library Center (OCLC) in 1995. "Dublin" refers to the location of this original meeting in Dublin, Ohio, and "Core" refers to the fact that DC is a set of metadata elements that are basic, but expandable. DC draws upon concepts from many disciplines, including librarianship, computer science, and archival preservation.

The standards and definitions of the DC element sets have been developed and refined by the Dublin Core Metadata Initiative (DCMI) with an eye to interoperability. DCMI maintains a website (http://dublincore.org/documents/dces/) that hosts the current definitions of all the DC elements and their properties.

DC is a set of fifteen basic elements plus three additional elements. All elements are both optional and repeatable. The basic DC elements are:
1. Title
2. Creator
3. Subject
4. Description
5. Publisher
6. Contributor
7. Date
8. Type
9. Format
10. Identifier
11. Source
12. Language
13. Relation
14. Coverage
15. Rights

The additional DC elements are:
16. Audience
17. Provenance
18. Rights Holder

DC allows for element refinements (or subfields) that narrow the meaning of an element, making it more specific. The use of these refinements is not required. DC also allows for the addition of nonstandard elements for local use (a brief example record is sketched below).

DSpace

DSpace is an open-source software package that provides management tools for digital assets. It is frequently used to create and manage institutional repositories. First released in 2002, DSpace is a joint development effort of Hewlett Packard (HP) Labs and the Massachusetts Institute of Technology (MIT). Today, DSpace's future is guided by a loose grouping of interested developers called the DSpace Committers Group, whose members currently include HP Labs, MIT, OCLC, the University of Cambridge, the University of Edinburgh, the Australian National University, and Texas A&M University. DSpace version 1.3 was released in 2005, and the newest version, DSpace 1.5, was released in March 2008. More than one thousand institutions around the world use DSpace, including public and private colleges and universities and a variety of not-for-profit corporations.

DC is at the heart of DSpace.
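Before turning to how DSpace handles these elements, it may help to see what a bare DC description looks like. The sketch below is serialized in XML using the DCMI element namespace and describes a single invented item with a handful of the fifteen basic elements; the record wrapper and all of the values are illustrative assumptions, not output from any particular system.

```xml
<!-- A minimal, hypothetical Dublin Core description of one item.
     Only some of the fifteen basic elements are shown; all DC
     elements are optional and repeatable. -->
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Annual Report of the University Libraries, 2009</dc:title>
  <dc:creator>University Libraries</dc:creator>
  <dc:subject>Academic libraries</dc:subject>
  <dc:subject>Annual reports</dc:subject>
  <dc:description>Summary of collections, services, and budgets for the fiscal year.</dc:description>
  <dc:date>2009</dc:date>
  <dc:type>Text</dc:type>
  <dc:format>application/pdf</dc:format>
  <dc:language>en</dc:language>
  <dc:rights>Copyright held by the University Libraries.</dc:rights>
</record>
```

Any of these elements could be repeated, refined with a qualifier, or omitted entirely, which is precisely the flexibility that DSpace builds upon.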
Although DSpace can be customized to a limited extent, the basic and quali- fied elements of DC and their refinements form DSpace’s backbone.2 n How DSpace works: a contributor’s perspective DSpace is designed for use by “metadata naive” contribu- tors. This is a conscious design choice made by its devel- opers and in keeping with the philosophy of inclusion for institutional repositories. DSpace was developed for use by a wide variety of contributors with a wide range of metadata and bibliographic skills. DSpace simplifies the metadata markup process by using terminology that is different from DC standards and by automating the production of element fields and XML/HTML code. DSpace has four hierarchical levels of users: users, contributors, community administrators, and network/ systems administrators. The user is a member of the general public who will retrieve information from the repository via browsing the database or conducting structured searches for specific information. The contributor is an individual who wishes to add their own work to the database. To become a contributor, one must be approved by a DSpace community adminis- trator and receive a password. A contributor may create, upload, and (depending upon the privileges bestowed upon him by his community administrator), edit or remove informational records. Their editing and removal privileges are restricted to their own records. A community administrator has oversight within their specialized area of DSpace and accordingly has more privileges within the system than a contributor. A community administrator may create, upload, edit, and remove records, but also can edit and remove all records available within the community’s area of the database. Additionally, the community administrator has access to some metadata about the repository’s records that is not available to users and contributors and has the power to approve requests to become contributors and grant upload access to the database. Lastly, the commu- nity administrator sets the rights policy for all materials included in the database and writes the statement of rights that every contributor must agree to with every record upload. The network/systems administrator is not involved with database content, focusing rather on software main- tenance and code customization. When a DSpace contributor wishes to create a new record, the software walks them through the process. DSpace presents seven screens in sequence that ask for specific information to be entered via check buttons, fill- in textboxes, and sliders. At the end of this process, the contributor must electronically sign an acceptance of the statement of rights. Because DSpace’s software attempts to simplify the metadata-creation process for contributors, its terminol- ogy is different from DC’s. DSpace uses more common terms that are familiar to a wider variety of individu- als. For example, DSpace asks the contributor to list an “author” for the work, not a “creator” or a “contribu- tor.” In fact, those terms appear nowhere in any DSpace. Instead, DSpace takes the text entered in the author textbox and maps it to a DC element—something that has profound implications if the mapping does not follow expected DC definitions. Likewise, DSpace does not use “subject” when asking the contributor to describe their material. Instead, DSpace asks the contributor to list keywords. Text entered into the keyword field is then mapped into the subject ele- ment. 
While this seems like a reasonable path, it does have some interesting implications for how the subject element is interpreted and used by contributors. DC’s metadata elements are all optional. This is not true in DSpace. DSpace has both mandatory and auto- matic elements in its records. Because of this, data records created in DSpace look different than data records created in DC. These mandatory, automatic, and default fields affect the fill frequency of certain DC elements—with all of these elements having 100 percent participation. In DSpace, the title element is mandatory; that is, it is a required element. The software will not allow the contributor to proceed if the title text box is left empty. As a consequence, all DSpace records will have 100 percent participation in the title element. DSpace has seven automatic elements, that is, ele- ment fields that are created by the software without any need for contributor input. Three are date elements, two are format elements, one is an identifier, and one is provenance. DSpace automatically records the time of the each record’s creation in machine-readable form. When the record is uploaded into the database, this time- stamp is entered into three element fields: dc.date.avail- able, dc.date.accessioned, and dc.date.issued. Therefore DSpace records have 100 percent participation in the date element. For previously published materials, a separate screen asks for the original publication date, which is then 42 iNFORMAtiON tecHNOlOGY AND liBRARies | MARcH 2010 placed in the dc.date.issued element. Like title, the origi- nal date of publication is a mandatory field, and failure to enter a meaningful numerical date into the textbox will halt the creation of a record. In a similar manner, DSpace “reads” the kind of file the contributor is uploading to the database. DSpace automatically records the size and type (.doc, .jpg, .pdf, etc.) of the file or files. This data is automatically entered into dc.format.mimetype and dc.format.extent. Like date, all DSpace records will have 100 percent participation in the format element. Likewise, DSpace automatically assigns a location identifier when a record is uploaded to the database. This information is recorded as an URI and placed in the identifier element. All DSpace records have a dc.identifier.uri field. The final automatic element is provenance. At the time of record creation, DSpace records the identity of the contributor (derived from the sign-in identity and pass- word) and places this information into a dc.provenance element field. This information becomes a permanent part of the DSpace record; however, this field is a hidden to users. Typically only community and network/sys- tems administrators may view provenance information. Still, like date, format, and identifier elements, DSpace records have automatic 100 percent participation in prov- enance. Because of the design of DSpace’s software, all DSpace-created records will have a combination of both contributor-created and DSpace-created metadata. All DSpace records can be edited. During record cre- ation, the contributor may at any time move backward through his record to alter information. Once the record has been finished and the statement of rights signed, the completed record moves into the community administra- tor’s workflow. Once the record has entered the workflow, the community administrator is able to view the record with all the metadata tags attached and make changes using DSpace’s editing tools. 
However, depending on the local practices and the volume of records passing through the administrator’s workflow, the administrator may simply upload records without first reviewing them. A record may also be edited after it has been uploaded, with any changes being uploaded into the database at the end of editing process. In editing a record after it has been uploaded, the contributor, providing he has been granted the appropriate privileges, is able to see all the metadata elements that have attached to the record. Calling up the editing tools at this point allows the contributor or admin- istrator to make significant changes to the elements and their qualifiers, something that is not possible during the record’s creation. When using the editing tools, the simpli- fied contributor interface disappears, and the metadata elements fields are labeled with their DC names. The con- tributor or administrator may remove metadata tags and the information they contain and add new ones selecting the appropriate metadata element and qualifier from a slider. For example, during the editing process, the contrib- utor or administrator may choose to create dc.contributor. editor or dc.subject.lcsh options—something not possible during the record-creation process. In the examination of the DSpace records from our three repositories, DSpace’s shaping influence on element participation and metadata quality will be clearly seen. n The repositories DSpace is principally used by academic and corporate nonprofit agencies to create and manage their insti- tutional repositories. For this study, I selected three academic institutions that shared similar characteristics (large, public, research-based universities) but which had differing approaches to how they managed their metadata-quality issues. The University of New Mexico (UNM) DSpace reposi- tory (DSpaceUNM) holds a wide-ranging set of records, including materials from the university’s faculty and administration, the Law School, the Anderson School of Business Administration, and the Medical School, as well as materials from a number of tangentially related university entities like the Western Water Policy Review Advisory Commission, New Mexico Water Trust Board, and Governor Richardson’s Task Force on Ethic Reform. At the time of the initial research for this paper (spring 2008), DSpaceUNM provided little easily acces- sible on-site education for contributors about the DSpace record-creation process. What was offered—a set of eight general information files—was buried deep inside the library community. A contributor would have to know the files existed to find them. By summer 2009, this had changed. DSpaceUNM had a new homepage layout. There is now a link to “help sheets and promotional materials” at the top center of the homepage. This link leads to the previously difficult-to- find help files. The content of the help files, however, remains largely unchanged. They discuss community creation, copy- rights, administrative workflow for community creation, a list of supported formats, a statement of DSpaceUNM’s privacy policy, and a list of required, encouraged, and not required elements for each new record created. For the most part, DSpaceUNM help sheets do not attempt to educate the contributor in issues of metadata quality. There is no discussion of DC terminology, no attempts to refer the contributor to a thesaurus or controlled vocabu- lary list, nor any explanation of the record-creation or editing process. 
This lack of contributor education may be explained in part because DSpaceUNM requires all new records to be reviewed by a subject area librarian as part of the DSpace community workflow. Thus any contributor errors, in theory, ought to be caught and corrected before being uploaded to the database.

The University of Washington (UW) DSpace repository (ResearchWorks at the University of Washington) hosts a narrower set of records than DSpaceUNM, with the materials limited to those contributed by the university's faculty, students, and staff, plus materials from the UW's archives and UW's School of Public and Community Health. In 2008, ResearchWorks was self-archiving: most contributors were expected to use DSpace to create and upload their records. There was no indication in the publicly available information about the record-creation workflow as to whether record reviews were conducted before record upload. The help link on the ResearchWorks homepage brought contributors to a set of screen-by-screen instructions on how to use DSpace's software to create and upload a record. The step-through did not include instructions on how to edit a record once it had been created. No explanation of the meanings or definitions of the various DC elements was included in the help files. There also were no suggestions about the use of a controlled vocabulary or a thesaurus for subject headings. By 2009, this link had disappeared, and the associated contributor education materials with it.

The Knowledge Bank at Ohio State University (OSU) is the third repository examined for this paper. OSU's repository hosts more than thirty communities, all of which are associated with various academic departments or special university programs. Like ResearchWorks at UW, OSU's repository appears to be self-archiving, with no clear policy statement as to whether a record is reviewed before it is uploaded to the repository's database. OSU makes a strong effort to educate its contributors. On the upper left of the Knowledge Bank homepage is a slider link that brings the contributor (or any user) to several important and useful sources of repository information: About Knowledge Bank, FAQs, Policies, Video Upload Procedures, Community Set-Up Form, Describing Your Resources, and Knowledge Bank Licensing Agreement. The existence and use of metadata in Knowledge Bank are explicitly mentioned in the FAQ and Policies areas, together with an explanation of what metadata is and how metadata is used (FAQ) and a list of supported metadata elements (Policies). The Describe Your Resources section gives extended definitions of each DSpace-available DC metadata element and provides examples of appropriate metadata-element use. Knowledge Bank provides the most comprehensive contributor education information of any of the three repositories examined. It does not use a controlled vocabulary list for subject headings, and it does not offer a thesaurus.

Data and analysis

I chose twenty randomly selected full records from each repository. No more than one record was taken from any one collection, in order to gather a broad sampling from each repository. I examined each record for the quality of its metadata. Metadata quality is a semantically slippery term.
Park, in the spring 2009 special metadata issue of Cataloging and Classification Quarterly, suggested that the most commonly accepted criteria for metadata quality are completeness, accuracy, and consistency.3 Those criteria will be applied in this analysis.

For the purpose of this paper, I define completeness as the fill rate for key metadata elements. Because the purpose of metadata is to identify the record and to assist in the user's search process, the key elements are title, contributor/creator, subject, and description.abstract—all contributor-generated fields. I chose these elements because these are the fields that the DSpace software uses when someone conducts an unrestricted search.

Table 1 shows that the fill rate for the title element is 100 percent for all three repositories. This is to be expected because, as noted above, title is a mandatory field. The fill rate for contributor/creator is likewise high: 16 of 20 (80 percent) for UNM, 19 of 20 (95 percent) for UW, and 19 of 20 (95 percent) for OSU. (OSU's fill rates for creator and contributor were summed because OSU uses different definitions for the creator and contributor element fields than do UNM or UW. This discrepancy will be discussed in greater depth in the discussion of consistency of metadata terminology below.) The fill rate for subject was more variable. UNM's subject fill rate was 100 percent, while UW's was 55 percent and OSU's was 40 percent. The fill rate for the description.abstract subfield was 12 of 20 (60 percent) at UNM, 15 of 20 (75 percent) at UW, and 8 of 20 (40 percent) at OSU. (See appendix A for a complete list of metadata elements and subfields used by each of the three repositories.) The relatively low fill rate (below 50 percent) at the OSU Knowledge Bank in both subject and description.abstract suggests a lack of completeness in that repository's records.

Accuracy in metadata quality is the essential "correctness" of a record. Correctness issues in a record range from data-entry problems (typos, misspellings, and inconsistent date formats) to problems with the correct application of metadata definitions and with data overlaps.4 Accuracy is perhaps the most difficult of the metadata quality criteria to judge. Local practices vary widely, and DC allows for the creation of custom metadata tags for local use. Additionally, there is long-standing debate and confusion about the definitions of metadata elements even among librarians and information professionals.5 Because of this, only the most egregious of accuracy errors were considered for this paper.

All three repositories had at least one record that contained one or more inaccurate metadata fields; two of them had four or more inaccurate records. Inaccurate records exhibited a wide variety of accuracy errors, including poor subject information (no matter how loosely one defines a subject heading, "the" is not an accurate descriptor); mutually contradictory metadata (one record contained two different language tags, although only one applied to the content); and one record in which the abstract was significantly longer than, and only tangentially related to, the file it described. Additionally, records showed confusion over the contributor versus creator elements. In a few records, contributors entered duplicate information into both element fields.
This observation supports Park and Childress's findings that there is widespread confusion over these elements.6 Among the most problematic records in terms of accuracy were those contained in UW's Early Buddhist Manuscripts Project. This collection, which has been removed from public access since the original data was drawn for this paper, contained numerous ambiguous, contradictory, and inaccurate metadata elements.7

While contributor-generated subject headings were specifically not examined for this paper, it must be noted that there was a wide variation in the level of detail and vocabulary used to describe records. No community within any of the repositories had specific rules for the generation of keyword descriptors for records, and the lack of guidance shows.

Table 1. Metadata Fields and their Frequencies

Element        Univ. of N.M.   Univ. of Wash.   Ohio State Univ.
Title               20               20               20
Creator              0                0               16
Subject             20               11                8
Description         12               16               17
Publisher            4                4                8
Contributor         16               19                3
Date                20               20               20
Type                20               20               20
Identifier          20               20               20
Source               0                0                0
Language            20               20               20
Relation             3                1                6
Coverage             2                0                0
Rights               2                0                0
Provenance          **               **               **

** Provenance tags are not visible to public users.

Consistency can be defined as the homogeneity of formats, definitions, and use of DC elements within the records. This consistency, or uniformity, of data is important because it promotes basic semantic interoperability. Consistency both inside the repository itself and with other repositories makes the repository easier to use and provides the user with higher-quality information.

All three repositories showed 100 percent consistency in DSpace-generated elements. DSpace's automated creation of date and format fields provided reliably consistent records in those element fields. DSpace's automatic formatting of personal names in the dc.contributor.author and dc.creator fields also provided excellent internal consistency. However, the metadata elements were much less consistent for contributor-generated information.

Inconsistency within the subject element is where most problems occurred. Personal names used as subject headings and capitalization within subject headings both proved to be particular issues. DSpace alphabetizes subject headings according to the first letter of the free text entered in the keyword box. Thus the same name entered in different formats (first name first or last name first) generates different subject-heading listings. The same is true for capitalization. Any difference in capitalization of any word within the free-text entry generates a separate subject heading.

Another field where consistency was an issue was dc.description.sponsorship. Sponsorship is a problem because different communities, even different collections within the same community, use the field to hold different information. Some collections used the sponsorship field to hold the name of a thesis or dissertation advisor. Some collections used sponsorship to list the funding agency or underwriter for a project being documented inside the record. Some collections used sponsorship to acknowledge the donation of the physical materials documented by the record. While all of these are valid uses of the field, they are not the same thing and do not hold the same meaning for the user.

The largest consistency issue, however, came from a comparison of repository policies regarding element use and definition.
Unaltered DSpace software maps contributor-generated information entered into the author textbox during the record-creation process into the dc.contributor.author field. However, OSU's DSpace software has been altered so that the dc.contributor.author field does not exist. Instead, text entered into the author textbox during the record-creation process maps to dc.creator. Although both uses are correct, this choice does create a significant difference in element definitions. OSU's DSpace author fields are no longer congruent with other DSpace author fields.

Conclusions

DSpace was created as a repository management tool. By streamlining the record-creation workflow and partially automating the creation of metadata, DSpace's developers hoped to make institutional repositories more useful and functional while at the same time providing an improved experience for both users and contributors. In this, DSpace has been partially successful. DSpace has made it easier for the "metadata naive" contributor to create records. And, in some ways, DSpace has improved the quality of repository metadata. Its automatically generated fields ensure better consistency in those elements and subfields. Its mandatory fields guarantee 100 percent fill rates in some elements, and this contributes to an increase in metadata completeness.

However, DSpace still relies heavily on contributor-generated data to fill most of the DC elements, and it is in these contributor-generated fields that most of the metadata quality issues arise. Nonmandatory fields are skipped, leading to incomplete records. Data-entry errors, a lack of authority control over subject headings, and confusion over element definitions can lead to poor metadata accuracy. A lack of enforced, uniform naming and capitalization conventions leads to metadata inconsistency, as do localized and individual differences in the application of metadata element definitions.

While most of the records examined in this small survey could be characterized as "acceptable" to "good," some are abysmal. To address the inconsistency of the DSpace records, the three universities have tried differing approaches. Only UNM's required record review by a subject area librarian before upload seems to have made any significant impact on metadata quality. UNM has a 100 percent fill rate for subject elements in its records, while UW and OSU do not. This is not to say that UNM's process is perfect and that poor records do not get into the system—they do (see appendix B for an example). But it appears that, for now, the intermediary intervention of a librarian during the record-creation process is an improvement over self-archiving by contributors, even with education.

References and notes

1. Association of Library Collections & Technical Services, Committee on Cataloging: Description & Access, Task Force on Metadata, "Final Report," June 16, 2000, http://www.libraries.psu.edu/tas/jca/ccda/tf-meta6.html (accessed Mar. 10, 2007).

2. A voluntary (and therefore less-than-complete) list of current DSpace users can be found at http://www.dspace.org/index.php?option=com_content&task=view&id=596&Itemid=180. Further specific information about DSpace, including technical specifications, training materials, licensing, and a user wiki, can be found at http://www.dspace.org/index.php?option=com_content&task=blogcategory&id=44&Itemid=125.
3. Jung-Ran Park, "Metadata Quality in Digital Repositories: A Survey of the Current State of the Art," Cataloging & Classification Quarterly 47, no. 3 (2009): 213–28.

4. Sarah Currier et al., "Quality Assurance for Digital Learning Object Repositories: Issues for the Metadata Creation Process," ALT-J: Research in Learning Technology 12, no. 1 (2004): 5–20.

5. Jung-Ran Park and Eric Childress, "DC Metadata Semantics: An Analysis of the Perspectives of Information Professionals," Journal of Information Science 20, no. 10 (2009): 1–13.

6. Ibid.

7. For a fuller discussion of the collection's problems and challenges in using both DSpace and DC, see Kathleen Forsythe et al., University of Washington Early Buddhist Manuscripts Project in DSpace (paper presented at DC-2003, Seattle, Wash., Sept. 28–Oct. 2, 2003), http://dc2003.ischool.washington.edu/Archive-03/03forsythe.pdf (accessed Mar. 10, 2007).

Appendix A. A list of the most commonly used qualifiers in each repository

University of New Mexico
dc.date.issued (20)
dc.date.accessioned (20)
dc.date.available (20)
dc.format.mimetype (20)
dc.format.extent (20)
dc.identifier.uri (20)
dc.contributor.author (15)
dc.description.abstract (12)
dc.identifier.citation (6)
dc.description.sponsorship (4)
dc.subject.mesh (2)
dc.contributor.other (2)
dc.description.sponsor (1)
dc.date.created (1)
dc.relation.isbasedon (1)
dc.relation.ispartof (1)
dc.coverage.temporal (1)
dc.coverage.spatial (1)
dc.contributor.other (1)

University of Washington
dc.date.accessioned (20)
dc.date.available (20)
dc.date.issued (20)
dc.format.mimetype (20)
dc.format.extent (20)
dc.identifier.uri (20)
dc.contributor.author (18)
dc.description.abstract (15)
dc.identifier.citation (4)
dc.identifier.issn (4)
dc.description.sponsorship (1)
dc.contributor.corporateauthor (1)
dc.contributor.illustrator (1)
dc.relation.ispartof (1)

Ohio State University
dc.date.issued (20)
dc.date.available (20)
dc.date.accessioned (20)
dc.format.mimetype (20)
dc.format.extent (20)
dc.identifier.uri (20)
dc.description.abstract (8)
dc.identifier.citation (4)
dc.subject.lcsh (4)
dc.relation.ispartof (4)
dc.description.sponsorship (3)
dc.identifier.other (2)
dc.contributor.editor (2)
dc.contributor.advisor (1)
dc.identifier.issn (1)
dc.description.duration (1)
dc.relation.isformatof (1)
dc.description.statementofresponsibility (1)
dc.description.tableofcontents (1)

Appendix B. Sample Record

dc.identifier.uri         http://hdl.handle.net/1928/3571
dc.description.abstract   President Schmidly's charge for the creation of a North Golf Course Community Advisory Board.
dc.format.extent          17301 bytes
dc.format.mimetype        application/pdf
dc.language.iso           en_US
dc.subject                President
dc.subject                Schmidly
dc.subject                North
dc.subject                Golf
dc.subject                Course
dc.subject                Community
dc.subject                Advisory
dc.subject                Board
dc.subject                Charge
dc.title                  Community_Advisory_Board_Charge
dc.type                   Other

3158 ----

Geographic Information Systems: Tools for Displaying In-Library Use Data

Lauren H. Mandel
Lauren H. Mandel (lmandel@fsu.edu) is a doctoral candidate at the Florida State University College of Communication & Information, School of Library & Information Studies, and is Research Coordinator at the Information Use Management & Policy Institute.

In-library use data is crucial for modern libraries to understand the full spectrum of patron use, including patron self-service activities, circulation, and reference statistics. Rather than using tables and charts to display use data, a geographic information system (GIS) facilitates a more visually appealing graphical display of the data in the form of a map. GISs have been used by library and information science (LIS) researchers and practitioners to create maps that display analyses of service area populations and demographics, facilities space management issues, spatial distribution of in-library use of materials, planned branch consolidations, and so on. The "seating sweeps" method allows researchers and librarians to collect in-library use data regarding where patrons are locating themselves within the library and what they are doing at those locations, such as sitting and reading, studying in a group, or socializing. This paper proposes a GIS as a tool to visually display in-library use data collected via "seating sweeps" of a library. By using a GIS to store, manage, and display the data, researchers and librarians can create visually appealing maps that show areas of heavy use and evidence of the use and value of the library for a community. Example maps are included to facilitate the reader's understanding of the possibilities afforded by using GISs in LIS research.

The modern public library operates in a context of limited (and often continually reduced) funding where the librarians must justify the continued value of the library to funding and supervisory authorities. This is especially the case as more and more patrons access the library virtually, calling into question the relevance of the physical library. In this context, there is a great need for librarians and researchers to evaluate the use of library facility space to demonstrate that the physical library is still being used for important social and educational functions. Despite this need, no model of public library facility evaluation emphasizes the ways patrons use library facilities. The systematic collection of in-library use data must go beyond traditional circulation and reference transactions to include self-service activities, group study and collaboration, socializing, and more.

Geographic information systems (GISs) are beginning to be deployed in library and information science (LIS) research as a tool for graphically displaying data. An initial review of the literature has yielded studies where a GIS has been used in analyzing service area populations through U.S. Census data;1 siting facility locations;2 managing facilities, including spatial distribution of in-library book use and occupancy of library study space;3 and planning branch consolidations.4 These uses of GIS are not mutually exclusive; studies have combined multiple uses of GISs.5 Also, GISs have been proposed as viable tools for producing visual representations of measurements of library facility use.6 These studies show the capabilities of a GIS for storing, managing, analyzing, and displaying in-library use data and the value of GIS-produced maps for library facility evaluations, in-library use research, and library justification.

Research purpose

Observing and measuring the use of a library facility is a crucial step in the facility evaluation process.
The library needs to understand how the facility is currently being used in order to justify the continued financial support necessary to maintain and operate it. Understanding how the facility is used can also help librarians identify high-traffic areas of the library that are ideal locations to market library services and materials. This understanding cannot be reached by analyzing circulation and reference transaction data alone; it must include in-library use measures that account for all ways patrons are using the facility. The purpose of this paper is to suggest a method by which to observe and record all uses of a library facility during a sampling period, the so-called "seating sweep" performed by Given and Leckie, and then to use a GIS to store, manage, and display the collected data on a map or series of maps that graphically depict library use.7

Significance of facility evaluation

Facility evaluation is a topic of vital importance in all fields, but this is especially true of a field such as public librarianship, where funding is often a source of concern.8 In times of economic instability, libraries can benefit from the ability to identify uses of existing facilities and employ this information to justify the continued operation of the library facility. Also, knowing which areas of the library are more frequently used than others can help librarians determine where to place displays of library materials and advertisements of library services.

For a library to begin to evaluate patron use and how well the facility meets users' needs, there must be an understanding of what users need from the library facility.9 To determine those needs, it is vital that library staff observe the facility while it is being used. This observation can be applied to the facility evaluation plan to justify the continued operation of the facility to meet the needs of the library service population.

Understanding how people use the public library facility beyond traditional measures of circulation statistics and reference transactions can lead to new theories of library use, an area of significant research interest for LIS. Additionally, the importance of this work transcends LIS because it applies to other government-funded community service agencies as well. For example, recreation facilities and community centers could also benefit from a customer-use model that incorporates measures of the true use of those facilities.

Literature review

Although much has been written on the use of library facilities, little of the research includes studies of how patrons actually use existing public library facilities and whether facilities are designed to accommodate this use.10 Rather, much of the research in public library facility evaluation has focused on collection and equipment space needs,11 despite the user-oriented focus of public library accountability models.12 Recent research in library facility design is beginning to reflect this focus,13 but additional study would be useful to the field.

Use of GIS is on the rise in the modern technological world.
A GIS is a computer-based tool for compiling, storing, analyzing, and displaying data graphically.14 Usually this data is geospatial in nature, but a GIS also can incorporate descriptive or statistic data to provide a richer picture than figures and tables can. Although GIS has been around for half a century, it has become increas- ingly more affordable, allowing libraries and similar institutions to consider using a GIS as a measurement and analysis tool. GISs have started being used in LIS research as a tool for graphically displaying library data. One fruitful area has been the mapping of user demographics for facil- ity planning purposes,15 including studies that mapped library closures.16 Mapping also can include in-library use data,17 in which case a GIS is used to overlay collected in-library use data on library floor plans. This can offer a richer picture of how a facility is being used than tradi- tional charts and tables can provide. using a Gis to display library service area population data Adkins and Sturges suggest libraries use a GIS-based library service area assessment as a method to evaluate their service areas and plan library services to meet the unique demographic demands of their communities.18 They discuss the methods of using GIS, including down- loading U.S. Census TIGER (Topologically Integrated Geographic Encoding and Referencing) files, geocoding library locations, delineating service areas by multiple methods, and analyzing demographics. A key tenet of this approach is the concept that public libraries need to understand the needs of their patrons. This is a prevailing concept in the literature.19 Prieser and Wang, in reporting a method used to create a facilities master plan for the Public Library of Cincinnati and Hamilton County, Ohio, offer a convincing argument for combining GIS and building performance evaluation (BPE) methods to examine branch facility needs and offer individualized facilities recommendations.20 Like other LIS researchers,21 Preiser and Wang suggest a relation- ship between libraries and retail stores, noting the similar modern trends of destination libraries and destination bookstores. They also acknowledge the difficulty in com- pleting an accurate library performance assessment due to the multitude of activities and functions of a library. Their method is a combination of a GIS-based service area and population analysis with a BPE that includes staff and user interviews and surveys, direct observation, and photography. The described multimethod approach offers a more complete picture of a library facility’s per- formance than traditional circulation-based evaluations. Further use of GISs in library facility planning can be seen from a study comparing proposed branches by demographic data that has been analyzed and presented through a GIS. Hertel and Sprague describe research that used a GIS to conduct geospatial analysis of U.S. Census data to depict the demographics of populations that would be served by two proposed branch libraries for a public library system in Idaho.22 A primary purpose of this research is to demonstrate the possible ways public libraries can use GIS to present visual and quantitative demographic analyses of service area populations. 
Hertel and Sprague identify that public libraries are challenged to determine which public they are serving and the needs of that population, writing that “libraries are beginning to add customer-based satisfaction as a critical compo- nent of resource allocation decisions” and need the help of a GIS to provide hard-data evidence in support of staff observations.23 This evidence could take the form of demographic data, as discussed by Hertel and Sprague, and also could incorporate in-library use data to present a fuller picture of a facility’s use. GeOGRAPHic iNFORMAtiON sYsteMs: tOOls FOR DisPlAYiNG iN-liBRARY use DAtA | MANDel 49 using Gis to display in-library use data Xia conducted several studies in which he collected library- use data and mapped that data via a GIS. In one study designed to identify the importance of space management in academic libraries, Xia suggests applications of GISs in library space management, particularly his tool integrating library floor plans with feature data in a GIS.24 He explains that a GIS can overcome the constraints of drafting and computer automated design tools, such as those in use at Chico Meriam Library at California State University and at the Michigan State University Main Library. For example, GISs are not limited to space visualization manipulation, but can incorporate user perceptions, behavior, and daily activities, all of which are important data to library space management considerations and in-library use research. Xia also reviews the use of GIS tools that incorporate hos- pital and casino floor plans, noting that library facilities are as equally complex as hospitals and casinos; this is a com- pelling argument that academic libraries should consider the use of a GIS as a space management tool. In another study, Xia uses a GIS to visualize the spatial distribution of books in the library in an attempt to establish the relationship between the height of book- shelves and the in-library use of books.25 This study seeks to answer the question of how the location of books on shelves of different heights could influence user behav- ior (i.e., patrons may prefer to browse shelves at eye level rather than the top and bottom shelves). What is of interest here is Xia’s use of a GIS to spatially represent the collected data. Xia remarks that a GIS “is suitable for assisting in the research of in-library book use where library floor layouts can be drawn into maps on multiple- dimensional views.”26 In fact, Xia’s graphics depict the use of books by bookshelf height in a visual manner that could not be achieved without the use of a GIS. Similarly, a GIS can be used to spatially represent the collected data in an in-library use study by overlaying the data onto a representation of the library floor plan. In a third project, Xia measures study space use in academic libraries as a metric of user satisfaction with library services.27 He says that libraries need to evaluate space needs on case-by-case basis because every library is unique and serves a unique population. Therefore, to observe the occupancy of study areas in an academic library, Xia drew the library’s study facilities (including furniture) in a GIS. He then observed patrons’ use of the facilities and entered the observation data into the GIS to overlay on maps of the study areas. 
There are several advantages of using GIS in this way: spatial databases can store continuing data sets, the system is powerful and flexible for manipulating and analyzing the spatial dataset, there are enhanced data visualization capabilities, and maps and data become interactive.

Conclusions drawn from the literature

A GIS is a tool gaining momentum in the world of LIS research. GISs have been used to conduct and display service area population assessments,28 propose facility locations,29 and plan for and measure branch consolidation impacts and benefits.30 GISs also have been used to graphically represent in-library use for managing facility space allocation, mapping in-library book use, and visualizing the occupancy of library study space.31 Additionally, GISs have been used in combination studies that examine library service areas and facility location proposals.32 These uses of GISs are only the beginning; a GIS can be used to map any type of data a library can collect, including all measures of in-library use. Additionally, GIS-based data analysis and display complements the focus in library-use research on gathering data to show a richer picture of a facility's use and the focus in library facility design literature on building libraries on the basis of community needs.33

In-library use research that would benefit from spatial data displays

Unobtrusive observational research offers a rich method for identifying and recording the use of a public library facility. A researcher could obtain a copy of a library's floor plan, predetermine sampling times during which to "sweep" the library, and conduct the sweeps by marking all patrons observed on the floor plan.34 This data then could be entered into a GIS database for spatial analysis and display. Specific questions that could be addressed via such a method include the following:

• What are all the ways in which people are using the library facility?
• How many people are using traditional library resources, such as books and computers?
• How many people are using the facility for other reasons, such as relaxation, meeting friends, and so on?
• Do the ways in which patrons use the library vary by location within the facility (e.g., are the people using traditional library resources and the people using the library for other reasons using the same areas of the library or different areas)?
• Which area(s) of the library facility receive the highest level of use?

It is hoped that answers to these questions, in whole or in part, could begin to offer a picture of how a library facility is currently being used by library patrons. To better view this picture, the data recorded from the observational research could be entered into a GIS to overlay onto the library floor plan in a manner similar to Xia's use of a GIS to display occupancy of library study space.35 This spatial representation of the data should facilitate greater understanding of the actual use of the library facility. Instead of a library presenting tables and graphs of library use, it would be able to produce illustrative maps that would help explain patterns of use to funding and supervising authorities. These maps would not require expensive proprietary GIS packages; the examples provided in this paper were created using the free, open-source MapWindow GIS package.

Example using GIS to display in-library use data

For this paper, I produced example maps on the basis of fictional in-library use data.
These maps were created using MapWindow GIS software along with Microsoft Excel, Publisher, and Paint (see figure 1 for a diagram of this process). MapWindow is an open-source GIS package that is easy to learn and use, but its layout and graphic design features are limited compared to the more expensive and sophisticated proprietary GIS packages.36 MapWindow files are compatible with the proprietary packages, so they could be imported into other GIS packages for finishing. For this paper, however, the goal was to create simple maps that a novice could replicate. Therefore Publisher and Paint were used for finalizing the maps instead of a sophisticated GIS package.

Figure 1. Process diagram for creating the sample maps

It was relatively easy to create the maps. First, I drew a sample floor plan of a fictional library computer lab in Excel and imported it into MapWindow as a JPEG file. I then overlaid polygons (shapes that represent area units such as chairs and tables) onto the floor plan and saved two shapefiles, one for tables and one for computers. A shapefile is a basic storage file used in most GIS packages. For each of those shapefiles I created an attribute table (basically, a linked spreadsheet) using fictitious data representing use of the tables and computers at 9 and 11 a.m. and 1, 3, 5, and 7 p.m. on a sample day. The field calculator generated a final column summing the total use of each table and computer for the fictitious sample day. I then created maps depicting the use of both tables and computers at each of the sample time periods (see figure 2) and for the total use (see figure 3).

Figure 2. Example maps depicting use of tables and computers in a fictional library computer lab, by hour

Figure 3. Example map depicting total use of tables and computers in a fictional library computer lab for a sample day

Benefits of GIS-created displays for library managers

The maps presented here are not based on actual data, but are meant to demonstrate the capabilities of GISs for spatially representing the use of a library facility. This could be done on a grander scale using an entire library floor plan and data collected during a longer sample period (e.g., a full week). These maps can serve several purposes for library managers, specifically regarding the marketing of library services and the justification of library funding.

Mapping data obtained from library "sweeps" can help identify the popularity of different areas of the library at different times of the day, different days of the week, or different times of the year. Once the library has identified the most popular areas, this information can be used to market library materials and services. For example, a highly populated area would be an ideal location over which to install ceiling-mounted signs that the library could use for marketing services and programs. Or the library could purchase a book display table similar to those used in bookstores and install it in the middle of a frequently populated area. The library could stock the table with seasonally relevant books and other materials (e.g., tax guidebooks in March and April) and track the circulation of these materials to determine the degree to which placement on the display table resulted in increased borrowing of those materials.

In addition to helping the library market its materials and services, mapping in-library use can provide visual evidence of the library's value.
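For readers who would rather script this workflow than point and click, the map-creation steps described earlier in this section (polygon features for tables and computers, an attribute table of hourly counts, a summed total column, and a map shaded by use) can be approximated in code. The sketch below is only an illustration and was not part of the process used to produce the figures: it relies on the open-source GeoPandas, Shapely, and Matplotlib libraries rather than MapWindow, and the coordinates and counts are invented.

    import geopandas as gpd
    import matplotlib.pyplot as plt
    from shapely.geometry import box

    # Invented floor-plan footprints (in arbitrary floor-plan units) for two tables
    # and two computers, with invented hourly use counts.
    features = gpd.GeoDataFrame(
        {
            "label": ["table_1", "table_2", "computer_1", "computer_2"],
            "use_0900": [2, 0, 1, 1],
            "use_1100": [3, 1, 1, 0],
            "use_1300": [4, 2, 0, 1],
        },
        geometry=[box(0, 0, 2, 1), box(3, 0, 5, 1), box(0, 3, 1, 4), box(2, 3, 3, 4)],
    )

    # The equivalent of MapWindow's field-calculator step: sum hourly counts into a total.
    hourly_columns = [c for c in features.columns if c.startswith("use_")]
    features["use_total"] = features[hourly_columns].sum(axis=1)

    # Shade each footprint by total use, roughly as in the total-use example map.
    ax = features.plot(column="use_total", cmap="Reds", edgecolor="black", legend=True)
    ax.set_title("Total use of tables and computers (fictional data)")
    plt.show()

Either route ends in the same kind of shaded floor-plan map, which is what gives the "sweeps" data its visual force.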
Public libraries often rely on reference and circulation transaction data, gate counts, and programming attendance statistics to justify their existence. These measures, although valuable and important, do not include many other ways that patrons use libraries, such as sitting and reading, studying, group work, and socializing. During "seating sweeps," the observers can record any and all uses they observe, including any that may not have been anticipated. All of these uses could then be mapped, providing a richer picture of how a public library is used and stronger justification of the library's value. These maps may be easier for funding and supervising authorities to understand than textual explanations or graphs and charts of statistical analyses.

Conclusion

From a review of the literature, it is clear that GISs are increasingly being used in LIS research as data-analysis and display tools. GISs are being used to analyze patron and materials data, as well as in studies that combine multiple uses of GISs. Patron analysis has included service-area-population analysis and branch-consolidation planning. Analysis of library materials has been used for space management, visualizing the spatial distribution of in-library book use, and visual representation of facility-use measurements.

This paper has proposed collecting in-library use data according to Given and Leckie's "seating sweeps" method and visually displaying that data via a GIS. Examples of such visual displays were provided to facilitate the reader's understanding of the possibilities afforded by using a GIS in LIS research, as well as the scalable nature of the method. Librarians and library staff can produce maps similar to the examples in this paper with minimal GIS training and background. The literature review and example figures offered in this paper show the capabilities of GISs for analyzing and graphically presenting library-use data. GISs are tools that can facilitate library facility evaluations, in-library use research, and library valuation and justification.

References

1. Denice Adkins and Denyse K. Sturges, "Library Service Planning with GIS and Census Data," Public Libraries 43, no. 3 (2004): 165–70; Karen Hertel and Nancy Sprague, "GIS and Census Data: Tools for Library Planning," Library Hi Tech 25, no. 2 (2007): 246–59; Wolfgang F. E. Preiser and Xinhao Wang, "Assessing Library Performance with GIS and Building Evaluation Methods," New Library World 107, no. 1224–25 (2006): 193–217.

2. Hertel and Sprague, "GIS and Census Data"; Preiser and Wang, "Assessing Library Performance."

3. Jingfeng Xia, "Library Space Management: A GIS Proposal," Library Hi Tech 22, no. 4 (2004): 375–82; Xia, "Using GIS to Measure In-Library Book-Use Behavior," Information Technology & Libraries 23, no. 4 (2004): 184–91; Xia, "Visualizing Occupancy of Library Study Space with GIS Maps," New Library World 106, no. 1212–13 (2005): 219–33.

4. Preiser and Wang, "Assessing Library Performance."

5. Hertel and Sprague, "GIS and Census Data"; Preiser and Wang, "Assessing Library Performance."

6. Preiser and Wang, "Assessing Library Performance"; Xia, "Library Space Management"; Xia, "Using GIS to Measure"; Xia, "Visualizing Occupancy."

7. Lisa M. Given and Gloria J.
Leckie, “‘Sweeping’ the Library: Mapping the Social Activity Space of the Public Library,” Library & Information Science Research 25, no. 4 (2003): 365–85. 8. “Jackson Rejects Levy to Reopen Libraries,” American Libraries 38, no. 7 (2007): 24–25; “May Levy Set for Jackson County Libraries Closing in April,” American Libraries 38, no. 3 (2007): 14; “Tax Reform Has Florida Bracing for Major Budget Cuts,” American Libraries 38, no. 8 (2007): 21. 9. Anne Morris and Elizabeth Barron, “User Consultation in Public Library Services,” Library Management 19, no. 7 (1998): 404–15; Susan L. Silver and Lisa T. Nickel, Surveying User Activ- ity as a Tool for Space Planning in an Academic Library (Tampa: Univ. of South Florida Library, 2002); James Simon and Kurt Schlichting, “The College Connection: Using Academic Support to Conduct Public Library Services,” Public Libraries 42, no. 6 (2003): 375–78. 10. Given and Leckie, “‘Sweeping’ the Library”; Christie M. Koontz, Dean K. Jue, and Keith Curry Lance, “Collecting Detailed In-Library Usage Data in the U.S. Public Libraries: The Methodology, the Results and the Impact,” in Proceedings of the Third Northumbria International Conference on Performance Measurement in Libraries and Information Services (Newcastle, UK: University of Northumbria, 2001): 175–79; Koontz, Jue, and Lance, “Neighborhood-Based In-Library Use Performance Measures for Public Libraries: A Nationwide Study of Majority- Minority and Majority White/Low Income Markets Using Personal Digital Data Collectors,” Library & Information Science Research 27, no. 1 (2005): 28–50. 11. Cheryl Bryan, Managing Facilities for Results: Optimizing Space for Services (Chicago: ALA, 2007); Anders C. Dahlgren, Public Library Space Needs: A Planning Outline (Madison, Wis.: Department of Public Instruction, 1988); William W. Sannwald and Robert S. Smith, eds., Checklist of Library Building Design Considerations (Chicago: ALA, 1988). 12. Brenda Dervin, “Useful Theory for Librarianship: Com- munication, Not Information,” Drexel Library Quarterly 13, no. 3 (1977): 16–32; Morris and Barron, “User Consultation”; Pre- iser and Wang, “Assessing Library Performance”; Simon and Schlichting, “The College Connection”; Norman Walzer, Karen Stott, and Lori Sutton, “Changes in Public Library Services,” Illinois Libraries 83, no. 1 (2001): 47–52. 13. Bradley Wade Bishop, “Use of Geographic Information Systems in Marketing and Facility Site Location: A Case Study of Douglas County (Colo.) Libraries,” Public Libraries 47, no. 5: 65–69; David Jones, “People Places: Public Library Build- ings for the New Millennium,” Australasian Public Libraries & Information Services 14, no. 3 (2001): 81–89; Nolan Lushington, Libraries Designed for Users: A 21st Century Guide (New York: Neal-Schuman, 2001); Shannon Mattern, “Form for Function: the Architecture for New Libraries,” in The New Downtown Library: Designing with Communities (Minneapolis: Univ. of Minnesota Pr., 2007), 55–83. 14. United Nations, Department of Economic and Social Affairs, Statistics Division, Handbook on Geographical Information Systems and Mapping (New York: United Nations, 2000). 15. Adkins and Sturges, “Library Service Planning”; Bishop, “Use of Geographic Information Systems”; Hertel and Sprague, “GIS and Census Data”; Christie Koontz, “Using Geographic Information Systems for Estimating and Profiling Geographic Library Market Areas,” in Geographic Information Systems and Libraries: Patrons, Maps, and Spatial Information, ed. Linda C. 
Smith and Mike Gluck (Urbana–Champaign: Univ. of Illinois Pr., 1996): 181–93; Preiser and Wang, “Assessing Library Perfor- mance.” 16. Christie M. Koontz, Dean K. Jue, and Bradley Wade Bishop, “Public Library Facility Closure: An Investigation of Reasons for Closure and Effects on Geographic Market Areas,” Library & Information Science Research 31, no. 2 (2009): 84–91. 17. Xia, “Library Space Management”; Xia, “Using GIS to Measure”; Xia, “Visualizing Occupancy.” 18. Adkins and Sturges, “Library Service Planning.” 19. Bishop, “Use of Geographic Information Systems”; Jones, “People Places”; Koontz, Jue, and Lance, “Collecting Detailed In- Library Usage Data”; Koontz, Jue, and Lance, “Neighborhood- Based In-Library Use”; Morris and Barron, “User Consultation”; Simon and Schlichting, “The College Connection”; Walzer, Stott, and Sutton, “Changes in Public Library Services.” 20. Preiser and Wang, “Assessing Library Performance.” 21. Given and Leckie, “‘Sweeping’ the Library;” Christie M. Koontz, “Retail Interior Layout for Libraries,” Marketing Library Services 19, no. 1 (2005): 3–5. 22. Hertel and Sprague, “GIS and Census Data.” 23. Ibid., 247. 24. Xia, “Library Space Management.” 25. Xia, “Using GIS to Measure.” 26. Ibid., 186. 27. Xia, “Visualizing Occupancy.” 28. Adkins and Sturges, “Library Service Planning”; Her- tel and Sprague, “GIS and Census Data”; Preiser and Wang, “Assessing Library Performance.” 29. Hertel and Sprague, “GIS and Census Data”; Preiser and Wang, “Assessing Library Performance.” 30. Koontz, Jue, and Bishop, “Public Library Facility Clo- sure”; Preiser and Wang, “Assessing Library Performance.” 31. Xia, “Library Space Management”; Xia, “Using GIS to Measure”; Xia, “Visualizing Occupancy.” 32. Hertel and Sprague, “GIS and Census Data”; Preiser and Wang, “Assessing Library Performance.” 33. Given and Leckie, “‘Sweeping’ the Library”; Koontz, Jue, and Lance, “Collecting Detailed In-Library Usage Data”; Koontz, Jue, and Lance, “Neighborhood-Based In-Library Use”; Silver and Nickel, Surveying User Activity; Jones, “People Places”; Lushington, Libraries Designed for Users. 34. Given and Leckie, “‘Sweeping’ the Library.” 35. Xia, “Visualizing Occupancy.” 36. For more information or to download MapWindow GIS, see http://www.mapwindow.org/ 3166 ---- EDITORIAL | TRuITT 3 Marc TruittEditorial W elcome to 2009! It has been unseasonably cold in Edmonton, with daytime “highs”—I use the term loosely— averaging around -25°C (that’s -13°F, for those of you ITAL readers living in the States) for much of the last three weeks. Factor in wind chill (a given on the Canadian Prairies), and you can easily subtract another 10°C. As a result, we’ve had more than a few days and nights where the adjusted temperature has been much closer to -40°, which is the same in either Celsius or Fahrenheit. While my boss and chief librarian is fond of saying that “real Canadians don’t even button their shirts until it gets to minus forty,” I’ve yet to observe such a feat of derring-do by anyone at much less than twenty below <grin>. Even your editor’s two Labrador retrievers—who love cooler weather—are reluctant to go out in such cold, with the result that both humans and pets have all been coping with bouts of cabin fever since before Christmas. n So, when is it “too cold” for a server room? Why, you may reasonably ask, am I belaboring ITAL readers with the details of our weather? 
Over the week- end we experienced near-simultaneous failures of both cooling systems in our primary server room (SR1), which meant that nearly all of our library IT services, including our OPAC (which we host for a consortium of twenty area libraries), a separate OPAC for Edmonton Public Library, our website, and access to licensed e-resources, e-mail, files, and print servers had to be shut down. Temperature readings in the room soared from an average of 20–22°C (68–71.5°F) to as much as 37°C (98.6°F) before settling out at around 30°C (86°F). We spent much of the weekend and beginning of this week relocating servers to all man- ner of places while the cooling system gets fixed. I imag- ine that next we may move one into each staff person’s under-heated office, where they’ll be able to perform double duty as high-tech foot warmers! All of this happened, of course, while the temperature outside the building hovered between -20° and -25°C. This is not the first time we’ve experienced a failure of our cooling systems during extremely cold weather. Last winter we suffered a series of problems with both the systems in SR1 and in our secondary room a few feet away. The issues we had then were not the same as those we’re living through now, but they occurred, as now, at the coldest time of the year. This seeming dichotomy of an overheated server environment in the depths of winter is not a matter of accident or coincidence; indeed, while it may seem counterintuitive, the fact is that many, if not all, of our cooling woes can be traced to the cold outside. The simple explanation is that extreme cold weather stresses and breaks things, including HVAC systems. As we’ve tried to analyze this incident, it appears likely that our troubles began when the older of our two systems in SR1 developed a coolant leak at some point after its last preventive maintenance servicing in August. Fall was mild here, and we didn’t see the onset of really severe cold weather until early to mid-December. Since the older system is mainly intended for failover of the newer one, and since both systems last received routine service recently, it is possible that the leak could have developed at any time since, although my supposition is that it may be itself a result of the cold. In any case, all seemed well because the newer cool- ing system in SR1 was adequate to mask the failure of the older unit, until it suffered a controller board failure that took it offline last weekend. But, with the failure of the new system on Saturday, all IT services provided from this room had to be brought down. After a night spent try- ing to cool the room with fans and a portable cooling unit, we succeeded in bringing the two OPACs and other core services back online by Sunday, but the coolant leak in the old system was not repaired until midday Monday. Today is Friday, and we’ve limped along all week on about 60 percent of the cooling normally required in SR1. We hope to have the parts to repair the newer cooling system early next week (fingers crossed!). Some interesting lessons have emerged from this incident, and while probably not many of you regularly deal with -30°C winters, I think them worth sharing in the hope that they are more generally applicable than our winter extremes are: 1. Document your servers and the services that reside on them. We spent entirely too much time in the early hours of this event trying to relate servers and ser- vices. 
We in information technology (IT) may think of shutting down or powering up servers "Fred," "Wilma," "Betty," and "Barney," but, in a crisis, what we generally should be thinking of is whether or not we can shut down e-mail, file-and-print services, or the integrated library system (ILS) (and, if the latter, whether we shut down just the underlying database server or also the related staff and public services). Perhaps your servers have more obvious names than ours, in which case, count yourself fortunate. But ours are not so intuitively named—there is a perfectly good reason for this, by the way—and with distributed applications where the database may reside here, the application there, and the Web front end yet somewhere else, I'd be surprised if your situation isn't as complex as ours. And bear in mind that documentation of dependencies goes two ways: Not only do you want to know that "Barney" is hosting the ILS's Oracle database, but you also want to know all of the servers that should be brought up for you to offer ILS-related services.

2. Prioritize your services. If your cooling system (or other critical server-room utility) were suddenly only operating at 50 percent of your normal required capacity, how would you quickly decide which services to shut down and which to leave up? I wrote in this space recently that we've been thinking about prioritized services in the context of disaster recovery and business continuity, but this week's incident tells me that we're not really there yet. Optimally, I think that any senior member of my on-call staff should be empowered in a given critical situation to bring down services on the basis of a predefined set of service priorities.

3. Virtualize, virtualize, virtualize. If we are at all typical of large libraries in the Association of Research Libraries (and I think we are), then it will come as no surprise that we seem to add new services with alarming frequency. I suspect that, as with most places, we tend to try and keep things simple at the server end by hosting new services on separate, dedicated servers. The resulting proliferation of new servers has led to ever-greater strains on power, cooling, and network infrastructures in a facility that was significantly renovated less than two years ago. And I don't see any near-term likelihood that this will change. We are, consequently, in the very early days of investigating virtualization technology as a means of reducing the number of physical boxes and making much better use of the resources—especially processor and RAM—available to current-generation hardware. I'm hoping that someone among our readership is farther along this path than we are and will consider submitting to ITAL a "how we done it" on virtualization in the library server room very soon!

4. Sometimes low-tech solutions work . . . No one here has failed to observe the irony of an overheated server room when the temperature just steps away is 30° below. Our first thought was how simple and elegant a solution it would be to install ducting, an intake fan, and a damper to the outside of the building. Then, the next time our cooling failed in the depths of winter, voila!, we could solve the problem with a mere turn of the damper control.

5. . . . and sometimes they don't.
Not quite, it seems. When asked, our university facilities experts told us that an even greater irony than the one we currently have would be the requirement for Can$100,000 in equipment to heat that -30°C outside air to around freezing so that we wouldn't freeze pipes and other indoor essentials if we were to adopt the "low-tech" approach and rely on Mother Nature. Oh, well . . .

In memoriam

Most of the snail mail I receive as editor consists of advertisements and press releases from various firms providing IT and other services to libraries. But a few months ago a thin, hand-addressed envelope, postmarked Pittsburgh with no return address, landed on my desk. Inside were two slips of paper clipped from a recent issue of ITAL and taped together. On one was my name and address; the other was a mailing label for Jean A. Guasco of Pittsburgh, an ALA Life Member and ITAL subscriber. Beside her name, in red felt-tip pen, someone had written simply "deceased."

I wondered about this for some time. Who was Ms. Guasco? Where had she worked, and when? Had she published or otherwise been active professionally? If she was a Life Member of ALA, surely it would be easy to find out more. It turns out that such is not the case, the wonders of the Internet notwithstanding. My obvious first stop, Google, yielded little other than a brief notice of her death in a Pittsburgh-area newspaper and an entry from a digitized September 1967 issue of Special Libraries that identified her committee assignment in the Special Libraries Association and the fact that she was at the time the chief librarian at McGraw-Hill, then located in New York. As a result of checking WorldCat, where I found a listing for her master's thesis, I learned that she graduated from the now-closed School of Library Service at Columbia University in 1953. If she published further, there was no mention of it on Google. My subsequent searches under her name in the standard online LIS indexes drew blanks.

From there, the trail got even colder. McGraw-Hill long ago forsook New York for the wilds of Ohio, and it seems that we as a profession have not been very good at retaining for posterity our directories of those in the field. A friend managed to find listings in both the 1982–83 and 1984–85 volumes of Who's Who in Special Libraries, but all these did was confirm what I already knew: Ms. Guasco was an ALA Life Member who by then lived in Pittsburgh. I'm guessing that she was then retired, since her death notice gave her age as eighty-six years. Of her professional career before that, I'm sad to say I was able to learn no more.

Marc Truitt (marc.truitt@ualberta.ca) is Associate Director, Bibliographic and Information Technology Services, University of Alberta Libraries, Edmonton, Alberta, Canada, and Editor of ITAL.

3167 ----

Mathew J. Miles and Scott J. Bergstrom

Classification of Library Resources by Subject on the Library Website: Is There an Optimal Number of Subject Labels?

The number of labels used to organize resources by subject varies greatly among library websites. Some librarians choose very short lists of labels while others choose much longer lists. We conducted a study with 120 students and staff to try to answer the following question: What is the effect of the number of labels in a list on response time to research questions? What we found is that response time increases gradually as the number of items in the list grows until the list size reaches approximately fifty items. At that point, response time increases significantly. No association between response time and relevance was found.
I t is clear that academic librarians face a daunting task drawing users to their library’s Web presence. “Nearly three-quarters (73%) of college students say they use the Internet more than the library, while only 9% said they use the library more than the Internet for informa- tion searching.”1 Improving the usability of the library websites therefore should be a primary concern for librar- ians. One feature common to most library websites is a list of resources organized by subject. Libraries seem to use similar subject labels in their categorization of resources. However, the number of subject labels varies greatly. Some use as few as five subject labels while others use more than one hundred. In this study we address the following ques- tion: What is the effect of the number of subject labels in a list on response times to research questions? n Literature review McGillis and Toms conducted a performance test in which users were asked to find a database by navigating through a library website. They found that participants “had difficulties in choosing from the categories on the home page and, subsequently, in figuring out which data- base to select.”2 A review of relevant research literature yielded a number of theses and dissertations in which the authors compared the usability of different library websites. Jeng in particular analyzed a great deal of the usability testing published concerning the digital library. The following are some of the points she summarized that were highly relevant to our study: n User “lostness”: Users did not understand the structure of the digital library. n Ambiguity of terminology: Problems with wording accounted for 36 percent of usability problems. n Finding periodical articles and subject-specific databases was a challenge for users.3 A significant body of research not specific to libraries provides a useful context for the present research. Miller’s landmark study regarding the capacity of human short- term memory showed as a rule that the span of immedi- ate memory is about 7 ± 2 items.4 Sometimes this finding is misapplied to suggest that menus with more than nine subject labels should never be used on a webpage. Subsequent research has shown that “chunking,” which is the process of organizing items into “a collection of ele- ments having strong associations with one another, but weak associations with elements within other chunks,”5 allows human short-term memory to handle a far larger set of items at a time. Larson and Czerwinski provide important insights into menuing structures. For example, increasing the depth (the number of levels) of a menu harms search performance on the Web. They also state that “as you increase breadth and/or depth, reaction time, error rates, and perceived complexity will all increase.”6 However, they concluded that a “medium condition of breadth and depth outperformed the broadest, shallow web structure overall.”7 This finding is somewhat contrary to a previous study by Snowberry, Parkinson, and Sisson, who found that when testing structures of 26, 43, 82, 641 (26 means two menu items per level, six levels deep), the 641 structure grouped into categories proved to be advantageous in both speed and accuracy.8 Larson and Czerwinksi rec- ommended that “as a general principle, the depth of a tree structure should be minimized by providing broad menus of up to eight or nine items each.”9 Zaphiris also corroborated that previous research con- cerning depth and breadth of the tree structure was true for the Web. 
The deeper the tree structure, the slower the user performance.10 He also found that response times for expandable menus are on average 50 percent longer than sequential menus.11 Both the research and current practices are clear concerning the efficacy of hierarchical menu structures. Thus it was not a focus of our research. The focus instead was on a single-level menu and how the number and characteristics of subject labels would affect search response times. Mathew J. Miles (milesm@byui.edu) is Systems Librarian and Scott J. Bergstrom (bergstroms@byui.edu) is Director of Institutional Research at Brigham Young University–Idaho in Rexburg. n Background In preparation for this study, library subject lists were collected from a set of thirty library websites in the United States, Canada, and the United Kingdom. We selected twelve lists from these websites that were representative of the entire group and that varied in size from small to large. To render some of these lists more usable, we made slight modifications. There were many similarities between label names. n Research design Participants were randomly assigned to one of twelve experimental groups. Each experimental group would be shown one of the twelve lists selected for use in this study. Roughly 90 percent of the participants were students. The remaining 10 percent were full-time employees who worked in these same departments. The twelve lists ranged in number of labels from five to seventy-two:
n Group A: 5 subject labels
n Group B: 9 subject labels
n Group C: 9 subject labels
n Group D: 23 subject labels
n Group E: 6 subject labels
n Group F: 7 subject labels
n Group G: 12 subject labels
n Group H: 9 subject labels
n Group I: 35 subject labels
n Group J: 28 subject labels
n Group K: 49 subject labels
n Group L: 72 subject labels
Each participant was asked to select a subject label from a list in response to eleven different research questions. The questions are listed below:
1. Which category would most likely have information about modern graphical design?
2. Which category would most likely have information about the Aztec Empire of ancient Mexico?
3. Which category would most likely have information about the effects of standardized testing on high school classroom teaching?
4. Which category would most likely have information on skateboarding?
5. Which category would most likely have information on repetitive stress injuries?
6. Which category would most likely have information about the French Revolution?
7. Which category would most likely have information concerning Walmart's marketing strategy?
8. Which category would most likely have information on the reintroduction of wolves into Yellowstone Park?
9. Which category would most likely have information about the effects of increased use of nuclear power on the price of natural gas?
10. Which category would most likely have information on the Electoral College?
11. Which category would most likely have information on the philosopher Immanuel Kant?
The questions were designed to represent a variety of subject areas that library patrons might pursue. Each subject list was printed on a white sheet of paper in alphabetical order in a single column, or double columns when needed. We did not attempt to test the subject lists in the context of any Web design.
We were more interested in observing the effect of the number of labels in a list on response time inde- pendent of any Web design. Each participant was asked the same eleven questions in the same order. The order of ques- tions was fixed because we were not interested in testing for the effect of order and wanted a uniform treatment, thereby not introducing extraneous variance into the results. For each question, the participant was asked to select a label from the subject list under which they would expect to find a resource that would best provide information to answer the question. Participants were also instructed to select only a single label, even if they could think of more than one label as a possible answer. Participants were encour- aged to ask for clarification if they did not fully understand the question being asked. Recording of response times did not begin until clarification of the question had been given. Response times were recorded unbeknownst to the partici- pant. If the participant was simply unable to make a selec- tion, that was also recorded. Two people administered the exercise. One recorded response times; the other asked the questions and recorded label selections. Relevance rankings were calculated for each possible combination of labels within a subject list for each ques- tion. For example, if a subject list consisted of five labels, for each question there were five possible answers. Two library professionals—one with humanities expertise, the other with sciences expertise—assigned a relevance rank- ing to every possible combination of question and labels within a subject list. The rankings were then averaged for each question–label combination. n Results The analysis of the data was undertaken to determine whether the average response times of participants, adjusted by the different levels of relevance in the subject list labels that prevailed for a given question, were signifi- cantly different across the different lists. In other words, would the response times of participants using a particu- lar list, for whom the labels in the list were highly relevant 18 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 to the question, be different from students using the other lists for whom the labels in the list were also highly relevant to the question? A separate univariate general linear model analysis was conducted for each of the eleven questions. The analyses were conducted separately because each ques- tion represented a unique search domain. The univariate general linear model pro- vided a technique for testing whether the average response times associated with the different lists were significantly dif- ferent from each other. This technique also allowed for the inclusion of a cova- riate—relevance of the subject list labels to the question—to determine whether response times at an equivalent level of relevance was different across lists. In the analysis model, the depen- dent variable was response time, defined as the time needed to select a subject list label. The covariate was relevance, defined as the perceived match between a label and the question. For example, a label of “Economics” would be assessed as highly relevant to the question, what is the current unemployment rate? The same label would be assessed as not relevant for the question, what are the names of four moons of Saturn? The main factor in the model was the actual list being presented to the participant. There were twelve lists used in this study. 
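Written out formally (the notation here is ours, not the authors'), the model fitted separately for each question is an analysis of covariance on log-transformed response times:

\log_{10}(T_{ik}) = \mu + \alpha_i + \beta\, r_{ik} + \gamma_i\, r_{ik} + \varepsilon_{ik}

where T_{ik} is the response time of participant k using list i, r_{ik} is (presumably) the averaged expert relevance ranking of the label that participant selected, \alpha_i is the list effect, \beta and \gamma_i capture the relevance effect and the list-by-relevance interaction, and \varepsilon_{ik} is the error term. The common-logarithm transformation is the one the authors describe next.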
The statistical model can be summarized as follows: response time = list + relevance + (list × relevance) + error. The general linear model required that the following conditions be met: First, data must come from a random sample from a normal population. Second, all variances within each of the groupings are the same (i.e., they have homoscedasticity). An examination of whether these assumptions were met revealed problems both with normality and with homoscedasticity. A common technique—logarithmic transformation—was employed to resolve these problems. Accordingly, response-time data were all converted to common logarithms. An examination of assumptions with the transformed data showed that all questions but three met the required conditions. The three questions (5, 6, and 7) were excluded from subsequent analysis. n Conclusions The series of graphs in the appendix show the average response times, adjusted for relevance, for eight of the eleven questions for all twelve lists (i.e., experimental groups). Three of the eleven questions were excluded from the analysis because of heteroscedasticity. An inspection of these graphs shows no consistent pattern in response time as the number of items in the lists increases. Essentially, this means that, for any given level of relevance, the number of items in the list does not affect response time significantly. It seems that for a single question, characteristics of the categories themselves are more important than the quantity of categories in the list. The response times using a subject list with twenty-eight labels are similar to the response times using a list of six labels. A statistical comparison of the mean response time for each group with that of each of the other groups for each of the questions largely confirms this. Very few of these comparisons were statistically significant. The spikes and valleys of the graphs in the appendix are generally not significantly different. However, when the average response time associated with all lists is combined into an overall average from all eight questions, a somewhat clearer picture emerges (see figure 1). Response times increase gradually as the number of items in the list increases until the list size reaches approximately fifty items. At that point, response time increases significantly. No association was found between response time and relevance. A fast response time did not necessarily yield a relevant response, nor did a slow response time yield an irrelevant response. Figure 1. The overall average of average search times for the eight questions for all experimental groups (i.e., lists). [Chart: average log response time ("Avg Log Performance") for each list, with a trend line.] n Observations We observed that there were two basic patterns exhibited when participants made selections. The first pattern was the quick selection—participants easily made a selection after performing an initial scan of the available labels. Nevertheless, a quick selection did not always mean a relevant selection. The second pattern was the delayed selection. If participants were unable to make a selection after the initial scan of items, they would hesitate as they struggled to determine how the question might be reclassified to make one of the labels fit.
We did not have access to a high-tech lab, so we were unable to track eye move- ment, but it appeared that the participants began scan- ning up and down the list of available items in an attempt to make a selection. The delayed selection seemed to be a combination of two problems: First, none of the avail- able labels seemed to fit. Second, the delay in scanning increased as the list grew larger. It’s possible that once the list becomes large enough, scanning begins to slow the selection process. A delayed selection did not necessarily yield an irrelevant selection. The label names themselves did not seem to be a significant factor affecting user performance. We did test three lists, each with nine items and each having differ- ent labels, and response times were similar for the three lists. A future study might compare a more extensive number of lists with the same number of items with different labels to see if label names have an effect on response time. This is a particular challenge to librarians in classifying the digital library, since they must come up with a few labels to classify all possible subjects. Creating eleven questions to span a broad range of subjects is also a possible weakness of the study. We had to throw out three questions that violated the assump- tions of the statistical model. We tried our best to select questions that would represent the broad subject areas of science, arts, and general interest. We also attempted to vary the difficulty of the questions. A different set of questions may yield different results. References 1. Steve Jones, The Internet Goes to College, ed. Mary Madden (Washington, D.C.: Pew Internet and American Life Project, 2002): 3, www.pewinternet.org/pdfs/PIP_College_Report.pdf (accessed Mar. 20, 2007). 2. Louise McGillis and Elaine G. Toms, “Usability of the Academic Library Web Site: Implications for Design,” College & Research Libraries 62, no. 4 (2001): 361. 3. Judy H. Jeng, “Usability of the Digital Library: An Evalu- ation Model” (PhD diss., Rutgers University, New Brunswick, New Jersey): 38–42. 4. George A. Miller, “The Magical Number Seven Plus or Minus Two: Some Limits on Our Capacity for Processing Infor- mation,” Psychological Review 63, no. 2 (1956): 81–97. 5. Fernand Gobet et al., “Chunking Mechanisms in Human Learning,” Trends in Cognitive Sciences 5, no. 6 (2001): 236–43. 6. Kevin Larson and Mary Czerwinski, “Web Page Design: Implications of Memory, Structure and Scent for Informa- tion Retrieval” (Los Angeles: ACM/Addison-Wesley, 1998): 25, http://doi.acm.org/10.1145/274644.274649 (accessed Nov. 1, 2007). 7. Ibid. 8. Kathleen Snowberry, Mary Parkinson, and Norwood Sis- son, “Computer Display Menus,” Ergonomics 26, no 7 (1983): 705. 9. Larson and Czerwinski, “Web Page Design,” 26. 10. Panayiotis G. Zaphiris, “Depth vs. Breath in the Arrange- ment of Web Links,” www.soi.city.ac.uk/~zaphiri/Papers/hfes .pdf (accessed Nov. 1, 2007). 11. Panayiotis G. Zaphiris, Ben Shneiderman, and Kent L. Norman, “Expandable Indexes Versus Sequential Menus for Searching Hierarchies on the World Wide Web,” http:// citeseer.ist.psu.edu/rd/0%2C443461%2C1%2C0.25%2CDow nload/http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/ cache/papers/cs/22119/http:zSzzSzagrino.orgzSzpzaphiriz SzPaperszSzexpandableindexes.pdf/zaphiris99expandable.pdf (accessed Nov. 1, 2007). 20 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 APPENDIx. 
Response times by question by group. [Eight charts, labeled Questions 1-4 and 8-11, each plotting average response time for the twelve experimental groups, from GRP A (5 items) to GRP L (72 items).] 3165 ---- Andrew K. Pace President's Message: LITA Now Andrew K. Pace (pacea@oclc.org) is LITA President 2008/2009 and Executive Director, Networked Library Services at OCLC Inc. in Dublin, Ohio. At the time of this writing, my term as LITA president is half over; by the time of publication, I will be in the home stretch—a phrase that, to me, always connotes relief and satisfaction that is never truly realized. I hope that this time between ALA conferences is a time of reflection for the LITA board, committees, interest groups, and the membership at large. Various strategic planning sessions are, I hope, leading us down a path of renewal and regeneration of the division. Of course, the world around us will have its effect—in particular, a political and economic effect. First, the politics. I was asked recently to give my opinion about where the new administration should focus its attention regarding library technology.
I had very little time to think of a pithy answer to this ques- tion, so I answered with my gut that the United States needs to continue its investment in IT infrastructure so that we are on par with other industrialized nations while also lending its aid to countries that are lagging behind. Furthermore, I thought it an apt time to redress issues of data privacy and retention. The latter is often far from our minds in a world more connected, increasingly through wireless technology, and with a user base that, as one privacy expert put it, would happily trade a DNA sample for an Extra Value Meal. I will resist the urge to write at greater length a treatise on the Bill of Rights and its status in 2008. I will hope, however, that LITA’s Technology and Access and Legislation and Regulation committees will feel reinvigorated post–election and post–inauguration to look carefully at the issues of IT policy. Our penchant for new tools should always be guided and tempered by the implementation and support of policies that rational- ize their use. As for the economy, it is our new backdrop. One anecdotal view of this is the number of e-mails I’ve received from committee appointees apologizing that they will not be able to attend ALA conferences as planned because of the economic downturn and local cuts to library budgets. Libraries themselves are in a paradoxical situation—increasing demand for the free services that libraries offer while simultaneously facing massive budget cuts that support the very collections and programs people are demanding. What can we do? Well, I would suggest that we look at library technology through a lens of efficiency and cost savings, not just from a perspective of what is cool or trendy. When it comes to running systems, we need to keep our focus on end-user satisfaction while consider- ing total cost of ownership. And if I may be selfish for a moment, I hope that we will not abandon our profes- sional networks and volunteer activities. While we all make sacrifices of time, money, and talent to support our profession, it is often tempting when economic times are hard to isolate ourselves from the professional networks that sustain us in times of plenty. Politics and economics? Though I often enjoy being cynical, I also try to make lemonade from lemons when- ever I can. I think there are opportunities for libraries to get their own economic bailout in supporting public works and emphasizing our role in contributing to the public good. We should turn our “woe-are-we” tenden- cies that decry budget cuts and low salaries into champi- oned stories of “what libraries have done for you lately.” And we should go back to the roots of IT, no matter how mythical or anachronistic, and think about what we can do technically to improve systemwide efficiencies. I encourage the membership to stay involved and reengage, whether through direct participation in LITA activities or through a closer following of the activities in the ALA Office of Information Technology Policy (OITP, www.ala.org/ala/aboutala/offices/oitp) and the ALA Washington Office itself. There is much to follow in the world that affects our profession, and so many are doing the heavy lifting for us. All we need to do sometimes is pay attention. Make fun of me if you want for stealing a campaign phrase from Richard Nixon, but I kept coming back to it in my head. In short, Library Information Technology— now more than ever. 3170 ---- LANECONNEx | KETCHELL ET AL. 
31 LaneConnex: An Integrated Biomedical Digital Library Interface Debra S. Ketchell, Ryan Max Steinberg, Charles Yates, and Heidi A. Heilemann This paper describes one approach to creating a search application that unlocks heterogeneous content stores and incorporates integrative functionality of Web search engines. LaneConnex is a search interface that identifies journals, books, databases, calculators, bioinformatics tools, help information, and search hits from more than three hundred full-text heterogeneous clinical and biore- search sources. The user interface is a simple query box. Results are ranked by relevance with options for filtering by content type or expanding to the next most likely set. The system is built using component-oriented program- ming design. The underlying architecture is built on Apache Cocoon, Java Servlets, XML/XSLT, SQL, and JavaScript. The system has proven reliable in production, reduced user time spent finding information on the site, and maximized the institutional investment in licensed resources. M ost biomedical libraries separate searching for resources held locally from external database searching, requiring clinicians and researchers to know which interface to use to find a specific type of information. Google, Amazon, and other Web search engines have shaped user behavior and expectations.1 Users expect a simple query box with results returned from a broad array of content ranked or categorized appropriately with direct links to content, whether it is an HTML page, a PDF document, a streaming video, or an image. Biomedical libraries have transitioned to digital journals and reference sources, adopted OpenURL link resolvers, and created institutional repositories. However, students, clinicians, and researchers are hindered from maximizing this content because of proprietary and het- erogeneous systems. A strategic challenge for biomedical libraries is to create a unified search for a broad spectrum of licensed, open-access, and institutional content. n Background Studies show that students and researchers will use the search path of least cognitive resistance.2 Ease and speed are the most important factors for using a particular search engine. A University of California report found that academic users want one search tool to cover a wide information universe, multiple formats, full-text avail- ability to move seamlessly to the item itself, intelligent assistance and spelling correction, results sorted in order of relevance, help navigating large retrievals by logical subsetting and customization, and seamless access any- time, anywhere.3 Studies of clinicians in the patient-care environment have documented that effort is the most important factor in whether a patient-care question is pursued.4 For researchers, finding and using the best bio- informatics tool is an elusive problem.5 In 2005, the Lane Medical Library and Knowledge Management Center (Lane) at the Stanford University Medical Center provided access to an expansive array of licensed, institutional, and open-access digital content in support of research, patient care, and education. Like most of its peers, Lane users were required to use scores of different interfaces to search external databases and find digital resources. We created a local metasearch application for clinical reference content, but it did not integrate result sets from disparate resources. 
A review of federated-search software in the marketplace found that products were either slow or they limited retrieval when faced with a broad spectrum of biomedical content. We decided to build on our existing application architecture to create a fast and unified interface. A detailed analysis of Lane website-usage logs was conducted before embarking on the creation of the new search application. Key points of user failure in the existing search options were spelling errors that could easily be corrected to avoid zero results; lack of sufficient intuitive options to move forward from a zero-results search or change topics without backtracking; lack of use of existing genre or role searches; confusion about when to use the resource, OpenURL resolver, or PubMed search to find a known item; and results that were cognitively difficult to navigate. Studies of the Web search engine and the PubMed search log concurred with our usage-log analysis: A single term search is the most common, with three words maximum entered by typical users.6 A PubMed study found that 22 percent of user queries were for known items rather than for a general subject, confirming our own log analysis findings that the majority of searches were for a particular source item.7 Search-term analysis revealed that many of our users were entering partial article citations (e.g., author, date) in any query box expecting that article databases would be searched concurrently with the resource database. Our displayed results were sorted alphabetically, and each version of an item was displayed separately. For the user, this meant a cluttered list with redundant title information that increased their cognitive effort to find meaningful items. Overall, users were confronted with too many choices upfront and too few options after retrieving results. Focus groups of faculty and students were conducted in 2005. Attendees wanted local information integrated into the proposed single search. Local information included content such as how-to information, expertise, seminars, grand rounds, core lab resources, drug formulary, patient handouts, and clinical calculators. Most of this content is restricted to the Stanford user population. Users consistently described their need for a simple search interface that was fast and customized to the Stanford environment. In late 2005, we embarked on a project to design a search application that would address both existing points of failure in the current system and meet the expressed need for a comprehensive discovery-and-finding tool as described in focus groups. The result is an application called LaneConnex. Debra S. Ketchell (debra.ketchell@gmail.com) is the former Associate Dean for Knowledge Management and Library Director; Ryan Max Steinberg (ryan.max.steinberg@stanford.edu) is the Knowledge Integration Programmer/Architect; Charles Yates (charles.yates@stanford.edu) is the Systems Software Developer; and Heidi A. Heilemann (heidi.heilemann@stanford.edu) is the former Director for Research & Instruction and current Associate Dean for Knowledge Management and Library Director at the Lane Medical Library & Knowledge Management Center, Information Resources & Technology, Stanford University School of Medicine, Stanford, California.
n Design objectives The overall goal of LaneConnex is to create a simple, fast search across multiple licensed, open-access, and special-object local knowledge sources that depackages and reaggregates information on the basis of Stanford institutional roles. The content of Lane’s digital collec- tion includes forty-five hundred journal titles and forty- two thousand other digital resources, including video lectures, executable software, patient handouts, bioin- formatics tools, and a significant store of digitized his- torical materials as a result of the Google Books program. Media types include HTML pages, PDF documents, JPEG images, MP3 audio files, MPEG4 videos, and executable applications. More than three hundred reference titles have been licensed specifically for clinicians at the point of care (e.g., UpToDate, eMedicine, STAT-Ref, and Micromedex Clinical Evidence). Clinicians wanted their results to reflect subcomponents of a package (e.g., results from the Micromedex patient handouts). Other clinical content is institutionally managed (e.g., institutional formulary, lab test database, or patient handouts). More than 175 bio- medical research tools have been licensed or selected from open-access content. The needs of biomedical researchers include molecular biology tools and software, biomedi- cal literature databases, citation analysis, chemical and engineering databases, expertise-finding tools, laboratory tools and supplies, institutional-research resources, and upcoming seminars. The specific objectives of the search application are the following: n The user interface should be fast, simple, and intui- tive, with embedded suggestions for improving search results (e.g., Did you mean? Didn’t find it? Have you tried?). n Search results from disparate local and external systems should be integrated into a single display based on popular search-engine models familiar to the target population. n The query-retrieval and results display should be separated and reusable to allow customization by role or domain and future expansion into other institutional tools. n Resource results should be ranked by relevance and filtered by genre. n Metasearch results should be hit counts and fil- tered by category for speed and breadth. Results should be reusable for specific views by role. n Finding a known article or journal should be streamlined and directly link to the item or “get item” option. n The most popular search options (PubMed, Google, and Lane journals) should be ubiquitous. n Alternative pathways should be dynamic and interactive at the point of need to avoid backtrack- ing and dead ends. n User behavior should be tracked by search term, resource used, and user location to help the library make informed decisions about licensing, meta- data, and missing content. n Off-the-shelf software should be used when avail- able or appropriate with development focused on search integration. n The application should be built upon existing metadata-creation systems and trusted Web- development technologies. Based on these objectives, we designed an application that is an extension of existing systems and technolo- gies. Resources are acquired and metadata are provided using the Voyager integrated library system (ILS). The SFX OpenURL link resolver provides full-text article access and expands the title search beyond biomedicine to all online journals at Stanford. EZproxy provides seamless off-campus access. WebTrends provides usage tracking. Movable Type is used to create FAQ and help information. 
A locally developed metasearch application provides a cross search with hit results from more than three hundred external and internal full-text sources. The technologies used to build LaneConnex and integrate all of these systems include Extensible Stylesheet Language LANECONNEx | KETCHELL ET AL. 33 Transformations (XSLT), Java, JavaScript, the Apache Cocoon project, and Oracle. n Systems Description Architecture LaneConnex is built on a principle of separation of concerns. The Lane content owner can directly change the inclusion of search results, how they are displayed, and additional path-finding information. Application programmers use Java, JavaScript, XSLT, and Structured Query Language (SQL) to create components that generate and modify the search results. The merger of content design and search results occurs “just in time” in the user’s browser. We use component-oriented programming design whereby services provided within the application are defined by simple contracts. In LaneConnex, these com- ponents (called “transformers”) consume XML informa- tion and, after transforming it in some way, pass it on to some other component. A particular contract can be fulfilled in different ways for different purposes. This component architecture allows for easy extension of the underlying Apache Cocoon application. If LaneConnex needs to transform some XML data that is not possible with built-in Cocoon transformers, it is a simple matter to create a software component that does what is needed and fulfills the transformer contract. Apache Cocoon is the underlying architecture for LaneConnex, as illustrated in figure 1. This Java Servlet is an XML–publishing engine that is built upon a compo- nent framework and uses a pipeline-processing model. A declarative language uses pattern matching to associate sets of processing components with particular request URLs. Content can come from a variety of sources. We use content from the local file system, network file sys- tem, HTTP, and a relational database. The XSLT language is used extensively in the pipelines and gives fine control of individual parts of the documents being processed. The end of processing is usually an XHTML document but can be any common MIME type. We use Cocoon to separate areas of concern so things like content, look and feel, and processing can all be managed as separate entities by different groups of people with little effect on another area. This separation of concerns is manifested by template documents that contain most of the HTML content common to all pages and are then combined with content documents within a processing pipeline. The declarative nature of the sitemap language and XSLT facilitate rapid development with no need to redeploy the entire application to make changes in its behavior. The LaneConnex search is composed of several com- ponents integrated into a query-and-results interface: Oracle resource metadata, full-text metasearch application, Movable Type blogging software, “Did you mean?” spell checker, EZproxy remote access, and WebTrends tracking. n Full-text Metasearch Integration of results from Lane’s metasearch applica- tion illustrates Cocoon’s many strengths. When a user searches LaneConnex, Cocoon sends his or her query to the metasearch application, which then dispatches the request to multiple external, full-text search engines and content stores. Some examples of these external resources are UpToDate, Access Medicine, Micromedex, PubMed, and MD Consult. 
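The fan-out pattern described in the next paragraphs can be sketched in a few lines of Java. This is a minimal illustration under our own naming, not LaneConnex code: the HitCountSource and MetaSearchEngine types are invented for the sketch, whereas the production system does this work with Jakarta Commons HTTP clients, XPath, and Cocoon pipelines. The sketch only shows how a single query is dispatched to many sources at once and how partial hit counts can be returned without waiting for slow responders.

import java.util.List;
import java.util.Map;
import java.util.concurrent.*;

// Invented for illustration: one external full-text source that can report a hit count.
interface HitCountSource {
    String name();
    int countHits(String query) throws Exception; // e.g., fetch the source's result page and read its total
}

class MetaSearchEngine {
    private final List<HitCountSource> sources;
    private final ExecutorService pool = Executors.newFixedThreadPool(16);

    MetaSearchEngine(List<HitCountSource> sources) {
        this.sources = sources;
    }

    // Dispatch the query to every source; return whatever counts arrive within the deadline.
    Map<String, Integer> search(String query, long timeoutMillis) throws InterruptedException {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        CountDownLatch pending = new CountDownLatch(sources.size());
        for (HitCountSource source : sources) {
            pool.submit(() -> {
                try {
                    counts.put(source.name(), source.countHits(query));
                } catch (Exception e) {
                    // a slow or failing source simply contributes no count
                } finally {
                    pending.countDown();
                }
            });
        }
        pending.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return counts; // stragglers keep filling the shared map and can be picked up by later polling
    }
}

In the real application, the equivalent of countHits parses each source's response into a DOM and evaluates an XPath expression to extract the hit total, and the browser polls for counts that arrive after the page is first rendered, as the following paragraphs explain.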
The metasearch application interacts with these external resources through Jakarta Commons HTTP clients. Responses from external resources are turned into W3C Document Object Model (DOM) objects, and XPath expressions are used to resolve hit counts from the DOM objects. As result counts are returned, they are added to an XML–based result list and returned to Cocoon. The power of Cocoon becomes evident as the XML–based metasearch result list is combined with a separate display template. This template-based approach affords content curators the ability to directly add, group, and describe metasearch resources using the language and look that is most meaningful to their specific user communities. For example, there are currently eight metasearch templates curated by an informationist in partnership with a target community. Curating these templates requires little to no assistance from programmers. In Lane's 2005 interface, a user's request was sent to the metasearch application, and the application waited five seconds before responding to give external resources a chance to return a result. Hit counts in the user interface included a link to refresh and retrieve more results from external resources that had not yet responded. Usability studies showed this to be a significant user barrier, since the refresh link was rarely clicked. The initial five second delay also gave users the impression that the site was slow. The LaneConnex application makes heavy use of JavaScript to solve this problem. After a user makes her initial request, JavaScript is used to poll the metasearch application (through Cocoon) on the user's behalf, popping in result counts as external resources respond. This adds a level of interactivity previously unavailable and makes the metasearch piece of LaneConnex much more successful than its previous version. Figure 1. LaneConnex Architecture. Resource metadata LaneConnex replaces the catalog as the primary discovery interface. Metadata describing locally owned and licensed resources (journals, databases, books, videos, images, calculators, and software applications) are stored in the library's current system of record, an instance of the Voyager ILS. LaneConnex makes no attempt to replace Voyager's strengths as an application for the selection, acquisition, description, and management of access to library resources. It does, however, replace Voyager's discovery interface. To this end, metadata for about eight thousand digital resources is extracted from Voyager's Oracle database, converted into MARCXML, processed with XSLT, and stored in a simple relational database (six tables and twenty-nine attributes) to support fast retrieval speed and tight control over search syntax. This extraction process occurs nightly, with incremental updates every five minutes. The Oracle Text search engine provides functionality anticipated by our Internet-minded users. Key features are speed and relevance-ranked results. A highly refined results ranking insures that the logical title appears in the first few results. A user's query is parsed for wildcard, Boolean, proximity, and phrase operators, and then translated into an SQL query. Results are then transformed into a display version. Related services LaneConnex compares a user's query terms against a dictionary. Each query is sent to a Cocoon spell-checking component that returns suggestions where appropriate. This component currently uses the Simple Object
LANECONNEx | KETCHELL ET AL. 35 Access Protocol (SOAP)–based spell- ing service from Google. Google was chosen over the National Center for Biotechnology Information (NCBI) spelling service because of the breadth of terms entered by users; however, Cocoon’s component-oriented archi- tecture would make it trivial to change spell checkers in the future. Each query is also compared against Stanford’s OpenURL link resolver (FindIt@Stanford). Client-side JavaScript makes a Cocoon-mediated query of FindIt@Stanford. Using XSLT, FindIt@Stanford responses are turned into JavaScript Object Notation (JSON) objects and popped into the interface as appropriate. Although the vast majority of LaneConnex searches result in zero FindIt@Stanford results, the convenience of searching all of Lane’s systems in a single, unified interface far outweighs the effort of implementation. A commercial analytics tool called WebTrends is used to collect Web statis- tics for making data-centric decisions about interface changes. WebTrends uses client-side JavaScript to track specific user click events. Libraries need to track both on-site clicks (e.g., the user clicked on “Clinical Portal” from the home page) and off-site clicks (e.g., the user clicked on “Yamada’s Gastroenterology” after doing a search for “IBS”). To facilitate off-site click capture, WebTrends requires every external link to include a snippet of JavaScript. Requiring content creators to input this code by hand would be error prone and tedious. LaneConnex automatically supplies this code for every class of link (search or static). This specialized WebTrends method provides Lane with data to inform both interface design and licensing decisions. n Results LaneConnex version 1.0 was released to the Stanford biomedical community in July 2006. The current applica- tion can be experienced at http://lane.stanford.edu. The Figure 2. LaneConnex Resource Search Results. Resource results are ranked by rel- evance. Single word titles are given a higher weight in the ranking algorithm to insure they are displayed in the first five results. Uniform titles are used to co-locate versions (e.g., the three instances of Science from different producers). Journals titles are linked to their respective impact factor page in the ISI Web of Knowledge. Digital formats that require spe- cial players or restrictions are indicated. The metadata searched for eJournals, Databases, eBooks, Biotools, Video, and medCalcs are Lane’s digital resources extracted from the inte- grated library system into a searchable Oracle database. The first “All” tab is the combined results of these genres and the Lane Site help and information. Figure 3. LaneConnex Related Services Search Enhancements. LaneConnex includes a spell checker to avoid a common failure in user searches. AJAx services allow the inclusion of search results from other sources for common zero results failures. For example, the Stanford link resolver database is simultaneously searched to insure online journals outside the scope of biomedicine are presented as a linked result for the user. production version has proven reliable over two years. Incremental user focus groups have been employed to improve the interface as issues arose. A series of vignettes will be used to illustrate how the current version of 36 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 the “SUNetID login” is required. n User query: “new yokrer.” A faculty member is looking for an arti- cle in the New Yorker for a class reading assignment. 
He makes a typing error, which invokes the “Did you mean?” function (see figure 3). He clicks on the correct spelling. No results are found in the resource search, but a simul- taneous search of the link-resolver database finds an instance of this title licensed for the campus and displays a clickable link for the user. n User query: “pathway analy- sis.” A post–doc is looking for infor- mation on how to share an Ingenuity pathway. Figure 4 illustrates the inte- gration of the locally created Lane FAQs. FAQs comprise a broad spec- trum of help and how-to information as described by our focus groups. Help text is created in the Movable Type blog software, and made searchable through the LaneConnex application. The Movable Type interface lowers the barrier to HTML content creation by any staff member. More complex answers include embedded images and videos to enable the user to see exactly how to do a particular proce- dure. Cocoon allows for the syndica- tion of subsets of this FAQ content back into static HTML pages where it can be displayed as both category-specific lists or as the text for scroll-over help for a link. Having a single store of help information insures the content is updated once for all instances. n User query: “uterine cancer kapp.” A resident is looking for a known article. LaneConnex simultaneously searches PubMed to increase the likelihood of user success (see figure 5). Clicking on the PubMed tab retrieves the results in the native interface; however, the user sees the PubMed@Stanford ver- sion, which includes embedded links to the article based on our OpenURL link resolver. The ability to retrieve results from bibliographic databases that includes article resolution insures that our biomedical community is always using the correct URL to insure maximum full-text article access. User testing in 2007 found that adding the three most frequently used sources (PubMed, Google, and Lane Catalog) into our one-box LaneConnex search was a significant time saver. It addresses LaneConnex meets the design objectives from the user’s perspective. n User query: “science.” A graduate student is look- ing for the journal Science. The LaneConnex results are listed in relevance order (see figure 2). Single- word titles are given a higher weight in the rank- ing algorithm to insure they are displayed in the first five results. Results from local metadata are displayed by uniform title. For example, Lane has three instances of the journal Science, and each version is linked to the appropriate external store. Brief notes provide critical information for particu- lar resources. For example, restricted local patient education documents and video seminars note that Figure 4. Example of Integration of Local Content Stores. help information is managed in Moveable Type and integrated into LaneConnex search results. LANECONNEx | KETCHELL ET AL. 37 the expectation on the part of our users that they could search for an article or a journal title in a single search box without first selecting a database. n User query: “serotonin pul- monary hypertension.” A medical student is looking for the correlation of two topics. Clicking on the “Clinical” tab, the student sees the results of the clinical metasearch in fig- ure 6. Metasearch results are deep searches of sources within licensed packages (e.g., text- books in MD Consult or a spe- cific database in Micromedex), local content (e.g., Stanford’s lab-test database), and open- access content (e.g., NCBI databases). 
PubMed results are tailored strategies tiered by evidence. For example, the evidence-summaries strategy retrieves results from twelve clinical-evidence resources (e.g., BUJ, Clinical Evidence, and Cochrane Systematic Reviews) that link to the full-text licensed by Stanford. An example of the bioresearch metasearch is shown in figure 7. Content selected for this audience includes literature databases, funding sources, patents, structures, clinical trials, protocols, and Stanford expertise integrated with gene, protein, and phe- notype tools. User testing revealed that many users did not click on the “Clinical” tab. The clinical metasearch was originally developed for the Clinical portal page and focused on clinicians in practice; however, the results needed to be exposed more directly as part of the LaneConnex search. Figure 8 illustrates the “Have you tried?” feature that displays a few relevant clinical-content sources without requiring the user to select the “Clinical” tab. This fea- ture is managed by the SmartSearch component of the LaneConnex system. SmartSearch sends the user’s query terms to PubMed, extracts a subset of articles associated with those terms, extracts the MeSH headings for those articles, and computes the frequency of headings in the articles to determine the most likely MeSH terms associ- ated with the user’s query terms. These MeSH terms are mapped to MeSH terms associated with each metasearch resource. Preliminary evaluation indicates that the clini- cal content is now being discovered by more users. Figure 5. Example of Integration of Popular Search Engines into LaneConnex Results. Three of the most popular searches based on usage analysis are included at the top level. PubMed and google are mapped to Lane’s link resolver to retrieve the full article. Creating or editing metasearch templates is a curator- driven task. Programming is only required to add new sources to the metasearch engine. A curator may choose from more than three hundred sources to create a dis- cipline-based layout using general templates. Names, categories, and other description information are all at the curator ’s discretion. While developing new sub- specialty templates, we discovered that clinicians were confused by the difference in layout of their specialty portal and their metasearch results (e.g., the Cardiology portal used the generic clinical metasearch). To address this issue, we devised an approach that merges a portal and metasearch into a single entity as illustrated in figure 9. A combination of the component-oriented architecture of LaneConnex and JavaScript makes the integration of metasearch results into a new template patterned after a portal easy to implement. This strategy will enable the creation of templates contextually appropriate to knowl- edge requests originating from electronic medical-record systems in the future. Direct user feedback and usage statistics confirm that search is now the dominant mode of navigation. The amount of time each user spends on the website has dropped since the release of version 1.0. We speculate that the integrated search helps our users find relevant 38 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 information more efficiently. Focus groups with students are uniformly positive. Graduate students like the ability to find digital articles using a single search box. Medical students like the clinical metasearch as an easy way to look up new topics in texts and customized PubMed searches. 
Bioengineering students like the ability to easily look up patient care–related topics. Pediatrics residents and attend- ings have championed the develop- ment of their portal and metasearch focused on their patient population. Medical educators have commented on their ability to focus on the best information sources. n Discussion A review of websites in 2007 found that most biomedical libraries had sep- arate search interfaces for their digital resources, library catalog, and exter- nal databases. Biomedical libraries are implementing metasearch software to cross search proprietary data- bases. The University of California, Davis is using the MetaLib software to federate searching multiple bib- liographic databases.8 The University of South California and Florida State University are using WebFeat soft- ware to search clinical textbooks.9 The Health Sciences Library System at the University of Pittsburgh is using Vivisimo to search clinical textbooks and bioresearch tools.10 Academic libraries are introducing new “resource shopping” applications, such as the Endeca project at North Carolina State University, the Summa project at the University of Aarhus, and the VuFind project at Villanova University.11 These systems offer a single query box, faceted results, spell checking, recom- mendations based on user input, and Asynchronous JavaScript and XML (AJAX) for live status information. We believe our approach is a practi- cal integration for our biomedical com- munity that bridges finding a resource and finding a specific item through Figure 6. Integration of metasearch results into LaneConnex. Results from two general, role-based metasearches (Bioresearch and Clinical) are included in the LaneConnex interface. The first image shows a clinician searching LaneConnex for serotonin pulmonary hypertension. Selecting the Clinical tab presents the clinical content metasearch display (second image), and is placed deep inside the source by selecting a title (third image). LANECONNEx | KETCHELL ET AL. 39 a metasearch of multiple databases. The LaneConnex application searches across digital resources and external data stores simultaneously and pres- ents results in a unified display. The limitation to our approach is that the metasearch returns only hit counts rather than previews of the specific content. Standardization of results from external systems, particularly receipt of XML results, remains a chal- lenge. Federated search engines do integrate at this level, but are usually slow or limit the number of results. True integration awaits Health Level Seven (HL7) Clinical Decision Support standards and National Information Standards Organization (NISO) MetaSearch initiative for query and retrieval of specific content.12 One of the primary objectives of LaneConnex is speed and ease of use. Ranking and categorization of results has been very successful in the eyes of the user community. The integration of metasearch results has been par- ticularly successful with our pediatric specialty portal and search. However, general user understanding of how the clinical and biomedical tabs related to the genre tabs in LaneConnex has been problematic. We reviewed Web engines and found a similar challenge in presenting disparate format results (e.g., video or image search results) or lists of hits from different systems (e.g., NCBI’s Entrez search results).13 We are continuing to develop our new specialty portal-and-search model and our SmartSearch term-mapping com- ponent to further integrate results. 
n Conclusion LaneConnex is an effective and open- ended search infrastructure for inte- grating local resource metadata and full-text content used by clinicians and biomedical researchers. Its effective- ness comes from the recognition that users prefer a single query box with relevance or categorically organized results that lead them to the most likely Figure 7. Example of a Bioresearch Metasearch. Figure 8. The SmartSearch component embeds a set of the metasearch results into the LaneConnex interface as “have you tried?” clickable links. These links are the equivalent of selecting the title from a clinical metasearch result. The example search for atypical malig- nant rhabdoid tumor (a rare childhood cancer) invokes oncology and pediatric textbook results. These texts and PubMed provide quick access for a medical student or resident on the pediatric ward. Figure 9. Example of a Clinical Specialty Portal with Integrated Metasearch. Clinical portal pages are organized so metasearch hit counts can display next to content links if a user executes a search. This approach removes the dissonance clinicians felt existed between separate portal page and metasearch results in version 1.0. 40 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 answer to a question or prospects in their exploration. The application is based on separation of concerns and is easily extensible. New resources are constantly emerg- ing, and it is important that libraries take full advantage of existing and forthcoming content that is tailored to their user population regardless of the source. The next major step in the ongoing development of LaneConnex is becoming an invisible backend application to bring content directly into the user’s workflow. n Acknowledgements The authors would like to acknowledge the contribu- tions of the entire LaneConnex technical team, in par- ticular Pam Murnane, Olya Gary, Dick Miller, Rick Zwies, and Rikke Ogawa for their design contributions, Philip Constantinou for his architecture contribution, and Alain Boussard for his systems development contributions. References 1. Denise T. Covey, “The Need to Improve Remote Access to Online Library Resources: Filling the Gap between Com- mercial Vendor and Academic User Practice,” Portal Libraries and the Academy 3 no.4 (2003): 577–99; Nobert Lossau, “Search Engine Technology and Digital Libraries,” D-Lib Magazine 10 no. 6 (2004), www.dlib.org/dlib/june04/lossau/06lossau.html (accessed Mar. 1, 2008); OCLC, “College Students’ Perception of Libraries and Information Resource,” www.oclc.org/reports/ perceptionscollege.htm (accessed Mar 1, 2008); and Jim Hender- son, “Google Scholar: A Source for Clinicians,” Canadian Medical Association Journal 12 no. 172 (2005). 2. Covey, “The Need to Improve Remote Access to Online Library Resources”; Lossau, “Search Engine Technology and Digital Libraries”; OCLC, “College Students’ Perception of Libraries and Information Resource.” 3. Jane Lee, “UC Health Sciences Metasearch Exploration. Part 1: Graduate Student Gocus Group Findings,” UC Health Sciences Metasearch Team, www.cdlib.org/inside/assess/ evaluation_activities/docs/2006/draft_gradReport_march2006. pdf (accessed Mar. 1, 2008). 4. Karen K. Grandage, David C. Slawson, and Allen F. Shaughnessy, “When Less is More: a Practical Approach to Searching for Evidence-Based Answers,” Journal of the Medical Library Association 90 no. 3 (2002): 298–304. 5. Nicola Cannata, Emanuela Merelli, and Russ B. 
Altman, “Time to Organize the Bioinformatics Resourceome,” PLoS Computational Biology 1, no. 7 (2005): e76.

6. Craig Silverstein et al., “Analysis of a Very Large Web Search Engine Query Log,” www.cs.ucsb.edu/~almeroth/classes/tech-soc/2005-Winter/papers/analysis.pdf (accessed Mar. 1, 2008); Anne Aula, “Query Formulation in Web Information Search,” www.cs.uta.fi/~aula/questionnaire.pdf (accessed Mar. 1, 2008); Jorge R. Herskovic, Len Y. Tanaka, William Hersh, and Elmer V. Bernstam, “A Day in the Life of PubMed: Analysis of a Typical Day’s Query Log,” Journal of the American Medical Informatics Association 14, no. 2 (2007): 212–20.

7. Herskovic, “A Day in the Life of PubMed.”

8. University of California, Davis Libraries, “QuickSearch,” http://mysearchspace.lib.ucdavis.edu/ (accessed Mar. 1, 2008).

9. Eileen Eandi, “Health Sciences Multi-eBook Search,” Norris Medical Library Newsletter (Spring 2006), Norris Medical Library, University of Southern California, www.usc.edu/hsc/nml/lib-information/newsletters.html (accessed Mar. 1, 2008); Maguire Medical Library, Florida State University, “WebFeat Clinical Book Search,” http://med.fsu.edu/library/tutorials/webfeat2_viewlet_swf.html (accessed Mar. 1, 2008).

10. Jill E. Foust, Philip Bergen, Gretchen L. Maxeiner, and Peter N. Pawlowski, “Improving E-Book Access via a Library-Developed Full-Text Search Tool,” Journal of the Medical Library Association 95, no. 1 (2007): 40–45.

11. North Carolina State University Libraries, “Endeca at the NCSU Libraries,” www.lib.ncsu.edu/endeca (accessed Mar. 1, 2008); Hans Lund, Hans Lauridsen, and Jens Hofman Hansen, “Summa—Integrated Search,” www.statsbiblioteket.dk/publ/summaenglish.pdf (accessed Mar. 1, 2008); Falvey Memorial Library, Villanova University, “VuFind,” www.vufind.org (accessed Mar. 1, 2008).

12. See the Health Level Seven (HL7) Clinical Decision Support working committee activities, in particular the Infobutton Standard Proposal at www.hl7.org/Special/committees/dss/index.cfm and the NISO Metasearch Initiative documentation at www.niso.org/workrooms/mi (accessed Mar. 1, 2008).

13. National Center for Biotechnology Information (NCBI) Entrez cross-database search, www.ncbi.nlm.nih.gov/Entrez (accessed Mar. 1, 2008).

3168 ----

6 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009

Paul T. Jaeger and Zheng Yan

One Law with Two Outcomes: Comparing the Implementation of CIPA in Public Libraries and Schools

Though the Children’s Internet Protection Act (CIPA) established requirements for both public libraries and public schools to adopt filters on all of their computers when they receive certain federal funding, it has not attracted a great amount of research into the effects on libraries and schools and the users of these social institutions. This paper explores the implications of CIPA in terms of its effects on public libraries and public schools, individually and in tandem. Drawing from both library and education research, the paper examines the legal background and basis of CIPA, the current state of Internet access and levels of filtering in public libraries and public schools, the perceived value of CIPA, the perceived consequences of CIPA, the differences in levels of implementation of CIPA in public libraries and public schools, and the reasons for those dramatic differences.
After an analysis of these issues within the greater policy context, the paper suggests research questions to help provide more data about the challenges and questions revealed in this analysis. T he Children’s Internet Protection Act (CIPA) estab- lished requirements for both public libraries and public schools to—as a condition for receiving cer- tain federal funds—adopt filters on all of their computers to protect children from online content that was deemed potentially harmful.1 Passed in 2000, CIPA was initially implemented by public schools after its passage, but it was not widely implemented in public libraries until the 2003 Supreme Court decision (United States v. American Library Association) upholding the law’s constitutional- ity.2 Now that CIPA has been extensively implemented for five years in libraries and eight years in schools, it has had time to have significant effects on access to online information and services. While the goal of filter- ing requirements is to protect children from potentially inappropriate content, filtering also creates major edu- cational and social implications because filters also limit access to other kinds of information and create different perceptions about schools and libraries as social institu- tions. Curiously, CIPA and its requirements have not attracted a great amount of research into the effects on schools, libraries, and the users of these social institu- tions. Much of the literature about CIPA has focused on practical issues—either recommendations on implement- ing filters or stories of practical experiences with filtering. While those types of writing are valuable to practitioners who must deal with the consequences of filtering, there are major educational and societal issues raised by filter- ing that merit much greater exploration. While relatively small bodies of research have been generated about CIPA’s effects in public libraries and public schools,3 thus far these two strands of research have remained separate. But it is the contention of this paper that these two strands of research, when viewed together, have much more value for creating a broader understanding of the educational and societal implications. It would be impossible to see the real consequences of CIPA without the development of an integrative picture of its effects on both public schools and public libraries. In this paper, the implications of CIPA will be explored in terms of effects on public libraries and public schools, individually and in tandem. Public libraries and public schools are generally considered separate but related public sphere entities because both serve core educa- tional and information-provision functions in society. Furthermore, the fact that public schools also contain school library media centers highlights some very inter- esting points of intersection between public libraries and school libraries in terms of the consequences of CIPA: While CIPA requires filtering of computers throughout public libraries and public schools, the presence of school library media centers makes the connection between libraries and schools stronger, as do the teaching roles of public libraries (e.g., training classes, workshops, and evening classes). 
Paul T. Jaeger (pjaeger@umd.edu) is Assistant Professor at the College of Information Studies and Director of the Center for Information Policy and Electronic Government of the University of Maryland in College Park. Zheng Yan (zyan@uamail.albany.edu) is Associate Professor at the Department of Educational and Counseling Psychology in the School of Education of the State University of New York at Albany.

The legal road to CIPA

History

Under CIPA, public libraries and public schools receiving certain kinds of federal funds are required to use filtering programs to protect children under the age of seventeen from harmful visual depictions on the Internet and to provide public notices and hearings to increase public awareness of Internet safety. Senator John McCain (R-AZ) sponsored CIPA, and it was signed into law by President Bill Clinton on December 21, 2000. CIPA requires that filters at public libraries and public schools block three specific types of content: (1) obscene material (that which appeals to prurient interests only and is “offensive to community standards”); (2) child pornography (depictions of sexual conduct and/or lewd exhibitionism involving minors); and (3) material that is harmful to minors (depictions of nudity and sexual activity that lack artistic, literary, or scientific value). CIPA focused on “the recipients of Internet transmission,” rather than the senders, in an attempt to avoid the constitutional issues that undermined the previous attempts to regulate Internet content.4

Using congressional authority under the spending clause of Article I, section 8 of the U.S. Constitution, CIPA ties the direct or indirect receipt of certain types of federal funds to the installation of filters on library and school computers. Therefore, each public library and school that receives the applicable types of federal funding must implement filters on all computers in the library and school buildings, including computers that are exclusively for staff use. Libraries and schools had to address these issues very quickly because the Federal Communications Commission (FCC) mandated certification of compliance with CIPA by funding year 2004, which began in summer 2004.5

CIPA requires that filters on computers block three specific types of content, and each of the three categories of materials has a specific legal meaning. The first type—obscene materials—is statutorily defined as depicting sexual conduct that appeals only to prurient interests, is offensive to community standards, and lacks serious literary, artistic, political, or scientific value.6 Historically, obscene speech has been viewed as being bereft of any meaningful ideas or educational, social, or professional value to society.7 Statutes regulating speech as obscene have to do so very carefully and specifically, and speech can only be labeled obscene if the entire work is without merit.8 If speech has any educational, social, or professional importance, even for embodying controversial or unorthodox ideas, it is supposed to receive First Amendment protection.9 The second type of content—child pornography—is statutorily defined as depicting any form of sexual conduct or lewd exhibitionism involving minors.10 Both of these types of speech have a long history of being regulated and being considered as having no constitutional protections in the United States.
The third type of content that must be filtered— material that is harmful to minors—encompasses a range of otherwise protected forms of speech. CIPA defines “harmful to minors” as including any depiction of nudity, sexual activity, or simulated sexual activity that has no serious literary, artistic, political, or scientific value to minors.11 The material that falls into this third category is constitutionally protected speech that encompasses any depiction of nudity, sexual activity, or simulated sexual activity that has serious literary, artistic, political, or scientific value to adults. Along with possibly includ- ing a range of materials related to literature, art, science, and policy, this third category may involve materials on issues vital to personal well-being such as safe sexual practices, sexual identity issues, and even general health care issues such as breast cancer. In addition to the filtering requirements, section 1731 also prescribes an Internet awareness strategy that public libraries and schools must adopt to address five major Internet safety issues related to minors. It requires librar- ies and schools to provide reasonable public notice and to hold at least one public hearing or meeting to address these Internet safety issues. Requirements for schools and libraries CIPA includes sections specifying two major strategies for protecting children online (mainly in sections 1711, 1712, 1721, and 1732) as well as sections describing vari- ous definitions and procedural issues for implementing the strategies (mainly in sections 1701, 1703, 1731, 1732, 1733, and 1741). Section 1711 specifies the primary Internet protec- tion strategy—filtering—in public schools. Specifically, it amends the Elementary and Secondary Education Act of 1965 by limiting funding availability for schools under section 254 of the Communication Act of 1934. Through a compliance certification process within a school under supervision by the local educational agency, it requires schools to include the operation of a technology protec- tion measure that protects students against access to visual depictions that are obscene, are child pornography, or are harmful to minors under the age of seventeen. Likewise, section 1712 specifies the same filtering strategy in public libraries. Specifically, it amends section 224 of the Museum and Library Service Act of 1996/2003 by limiting funding availability for libraries under sec- tion 254 of the Communication Act of 1934. Through a compliance certification process within a library under supervision by the Institute of Museum and Library Services (IMLS), it requires libraries to include the opera- tion of a technology protection measure that protects stu- dents against access to visual depictions that are obscene, child pornography, or harmful to minors under the age of seventeen. Section 1721 is a requirement for both libraries and schools to enforce the Internet safety policy with the Internet safety policy strategy and the filtering technol- ogy strategy as a condition of universal service discounts. Specifically, it amends section 254 of the Communication Act of 1934 and requests both schools and libraries to monitor the online activities of minors, operate a tech- nical protection measure, provide reasonable public notice, and hold at least one public hearing or meeting to address the Internet safety policy. This is through the 8 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 certification process regulated by the FCC. 
Section 1732, titled the Neighborhood Children’s Internet Protection Act (NCIPA), amends section 254 of the Communication Act of 1934 and requires schools and libraries to adopt and implement an Internet safety policy. It specifies five types of Internet safety issues: (1) access by minors to inappropriate matter on the Internet; (2) safety and security of minors when using e-mail, chat rooms, and other online communications; (3) unauthor- ized access; (4) unauthorized disclosure, use, and dis- semination of personal information; and (5) measures to restrict access to harmful online materials. From the above summary, it is clear that (1) the two protection strategies of CIPA (the Internet filtering strat- egy and safety policy strategy) were equally enforced in both public schools and public libraries because they are two of the most important social institutions for children’s Internet safety; (2) the nature of the implementation mechanism is exactly the same, using the same federal funding mechanisms as the sole financial incentive (lim- iting funding availability for schools and libraries under section 254 of the Communication Act of 1934) through a compliance certification process to enforce the imple- mentation of CIPA; and (3) the actual implementation procedure differs in libraries and schools, with schools to be certified under the supervision of local educational agencies (such as school districts and state departments of education) and with libraries to be certified within a library under the supervision of the IMLS. Economics of CIPA The Universal Service program (commonly known as E–Rate) was established by the Telecommunications Act of 1996 to provide discounts, ranging from 20 to 90 percent, to libraries and schools for telecommunications services, Internet services, internal systems, and equip- ment.12 The program has been very successful, provid- ing approximately $2.25 billion dollars a year to public schools, public libraries, and public hospitals. The vast majority of E-Rate funding—about 90 percent—goes to public schools each year, with roughly 4 percent being awarded to public libraries and the remainder going to hospitals.13 The emphasis on funding schools results from the large number of public schools and the size- able computing needs of all of these schools. But even 4 percent of the E-Rate funding is quite substantial, with public libraries receiving more than $250 million between 2000 and 2003.14 Schools received about $12 billion in the same time period.15 Along with E-Rate funds, the Library Services and Technology Act (LSTA) program adminis- tered by the IMLS provides money to each state library agency to use on library programs and services in that state, though the amount of these funds is considerably lower than E-Rate funds. The American Library Association (ALA) has noted that the E-Rate program has been particularly significant in its role of expanding online access to students and to library patrons in both rural and underserved com- munities.16 In addition to the effect on libraries, E-Rate and LSTA funds have significantly affected the lives of individuals and communities. These programs have contributed to the increase in the availability of free public Internet access in schools and libraries. 
By 2001, more than 99 percent of public school libraries provided students with Internet access.17 By 2007, 99.7 percent of public library branches were connected to the Internet, and 99.1 percent of public library branches offered public Internet access.18 However, only a small portion of libraries and schools used filters prior to CIPA.19 Since the advent of computers in libraries, librarians typically had used informal monitoring practices for computer users to ensure that nothing age inappropriate or morally offensive was publicly visible.20 Some individual school and library systems, such as in Kansas and Indiana, even developed formal or informal statewide Internet safety strategies and approaches.21

Why were only libraries and schools chosen to protect children’s online safety?

While there are many social institutions that could have been the focus of CIPA, the law places the requirements specifically on public libraries and public schools. If Congress were so interested in protecting children from access to harmful Internet content, it seems that the law would be more expansive and focused on the content itself rather than filtering access to the content. However, earlier laws that attempted to regulate access to Internet content failed legal challenges specifically because they tried to regulate content.

Prior to the enactment of CIPA, there were a number of other proposed laws aimed at preventing minors from accessing inappropriate Internet content. The Communications Decency Act (CDA) of 1996 prohibited the sending or posting of obscene material through the Internet to individuals under the age of eighteen.22 However, the Supreme Court found the CDA to be unconstitutional, stating that the law violated free speech under the First Amendment. In 1998, Congress passed the Child Online Protection Act (COPA), which prohibited commercial websites from displaying material deemed harmful to minors and imposed criminal penalties on Internet violators.23 A three-judge panel of the District Court for the Eastern District of Pennsylvania ruled that COPA’s focus on “contemporary community standards” violated the First Amendment, and the panel subsequently imposed an injunction on COPA’s enforcement.

CIPA’s force comes from Congress’s power under the spending clause; that is, Congress can legally attach requirements to funds that it gives out. Since CIPA is based on economic persuasion—the potential loss of funds for technology—the law can only have an effect on recipients of those funds. While regulating Internet access in other venues like coffee shops, Internet cafés, bookstores, and even individual homes would provide a more comprehensive shield to limit children’s access to certain online content, these institutions could not be reached under the spending clause. As a result, the burdens of CIPA fall squarely on public libraries and public schools.

The current state of filtering

When did CIPA actually come into effect in libraries and schools?

After overcoming a series of legal challenges that were ultimately decided by the Supreme Court, CIPA came into effect in full force in 2003, though 96 percent of public schools were already in compliance with CIPA in 2001. When the Court upheld the constitutionality of CIPA, the legal challenge by public libraries centered on the way the statute was written.24 The Court’s decision states that the wording of the law does not place unconstitutional limitations on free speech in public libraries.
To continue receiving federal dollars directly or indirectly through certain federal programs, public libraries and schools were required to install filtering technologies on all computers. While the case decided by the Supreme Court focused on public libraries, the decision virtually precludes public schools from making the same or related challenges.25 Before that case was decided, however, most schools had already adopted filters to comply with CIPA.

As a result of CIPA, a public library or public school must install technology protection measures, better known as filters, on all of its computers if it receives (1) E-Rate discounts for Internet access costs, (2) E-Rate discounts for internal connections costs, (3) LSTA funding for direct Internet costs,26 or (4) LSTA funding for purchasing technology to access the Internet. The requirements of CIPA extend to public libraries, public schools, and any library institution that receives LSTA and E-Rate funds as part of a system, including state library agencies and library consortia. As a result of the financial incentives to comply, almost 100 percent of public schools in the United States have implemented the requirements of CIPA,27 and approximately half of public libraries have done so.28

How many public schools have implemented CIPA?

According to the latest report by the Department of Education (see table 1), by 2005, 100 percent of public schools had implemented both the Internet filtering strategy and the safety policy strategy. In fact, in 2001 (the first year CIPA was in effect), 96 percent of schools had implemented CIPA, with 99 percent filtering by 2002. When compared to the percentage of all public schools with Internet access from 1994 to 2005, Internet access became nearly universal in schools between 1999 and 2000 (95 to 98 percent), and one can see that the Internet access percentage in 2001 was almost the same as the CIPA implementation percentage. According to the Department of Education, these estimates are based on a survey of 1,205 elementary and secondary schools selected from 63,000 elementary schools and 21,000 secondary and combined schools.29 After reviewing the design and administration of the survey, it can be concluded that these estimates should be considered valid and reliable and that CIPA has been immediately and consistently implemented in the majority of public schools since 2001.30

Table 1. Implementation of CIPA in public schools

Year           1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2005
Access (%)       35    50    65    78    89    95    98    99    99   100   100
Filtering (%)                                              96    99    97   100

How many public libraries have implemented CIPA?

In 2002, 43.4 percent of public libraries were receiving E-Rate discounts, and 18.9 percent said they would not apply for E-Rate discounts if CIPA was upheld.31 Since the Supreme Court decision upholding CIPA, the number of libraries complying with CIPA has increased, as has the number of libraries not applying for E-Rate funds to avoid complying with CIPA. However, unlike schools, there is no exact count of how many libraries have filtered Internet access. In many cases, the libraries themselves do not filter, but a state library, library consortium, or local or state government system of which they are a part filters access from beyond the walls of the library. In some of these cases, the library staff may not even be aware that such filtering is occurring.
A number of state and local governments have also passed their own laws to encourage or require all libraries in the state to filter Internet access regardless of E-Rate or LSTA funds.32 In 2008, 38.2 percent of public libraries were filtering access within the library as a result of directly receiving E-Rate funding.33 Furthermore, 13.1 percent of libraries were receiving E-Rate funding as a part of another orga- nization, meaning that these libraries also would need to comply with CIPA’s requirements.34 As such, the number of public libraries filtering access is now at least 51.3 percent, but the number will likely be higher as a result of state and local laws requiring libraries to filter as well as other reasons libraries have implemented filters. In contrast, among libraries not receiving E-Rate funds, the number of libraries now not applying for E-Rate inten- tionally to avoid the CIPA requirements is 31.6 percent.35 While it is not possible to identify an exact number of public libraries that filter access, it is clear that libraries overall have far lower levels of filtering than the 100 per- cent of public schools that filter access. E-Rate and other program issues The administration of the E-Rate program has not occurred without controversy. Throughout the course of the program, many applicants for and recipients of the funding have found the program structure to be obtuse, the application process to be complicated and time con- suming, and the administration of the decision-making process to be slow.36 As a result, many schools and librar- ies find it difficult to plan ahead for budgeting purposes, not knowing how much funding they will receive or when they will receive it.37 There also have been larger difficulties for the program. Following revelations about the uses of some E-Rate awards, the FCC suspended the program from August to December 2004 to impose new accounting and spending rules for the funds, delaying the distribution of over $1 billion in funding to libraries and schools.38 News inves- tigations had discovered that certain school systems were using E-Rate funds to purchase more technology than they needed or could afford to maintain, and some school systems failed to ever use technology they had acquired.39 While the administration of the E-Rate program has been comparatively smooth since, the temporary suspension of the program caused serious short-term problems for, and left a sense of distrust of, the program among many recipients.40 Filtering issues During the 1990s, many types of software filtering prod- ucts became available to consumers, including server- side filtering products (using a list of server-selected blocked URLs that may or may not be disclosed to the user), client-side filtering (controlling the blocking of specific content with a user password), text-based content-analysis filtering (removing illicit content of a website using real-time analysis), monitoring and time- limiting technologies (tracking a child’s online activi- ties and limiting the amount of time he or she spends online), and age-verification systems (allowing access to webpages by passwords issued by a third party to an adult).41 But because filtering software companies make the decisions about how the products work, content and collection decisions for electronic resources in schools and public libraries have been taken out of the hands of librarians, teachers, and local communities and placed in the trust of proprietary software products.42 Some filtering programs also have 
specific political agendas, which many organizations that purchase them are not aware of.43 In a study of over one million pages, for every webpage blocked by a filter as advertised by the software vendor, one or more pages were blocked inappropriately, while many of the criteria used by the filtering products go beyond the criteria enumerated in CIPA.44 Filters have significant rates of inappropriately block- ing materials, meaning that filters misidentify harmless materials as suspect and prevent access to harmless items (e.g., one filter blocked access to the Declaration of Independence and the Constitution).45 Furthermore, when libraries install filters to comply with CIPA, in many instances the filters will frequently be blocking text as well as images, and (depending on the type of filter- ing product employed) filters may be blocking access to entire websites or even all the sites from certain Internet service providers. As such, the current state of filtering technology will create the practical effect of CIPA restrict- ing access to far more than just certain types of images in many schools and libraries.46 n Differences in the perceived value of CIPA and filtering Based on the available data, there clearly is a sizeable contrast in the levels of implementation of CIPA between ONE LAw wITH TwO OuTCOMES | JAEGER AND YAN 11 schools and libraries. This difference raises a number of questions: For what reasons has CIPA been much more widely implemented in schools? Is this issue mainly value driven, dollar driven, both, or neither in these two public institutions? Why are these two institutions so dif- ferent regarding CIPA implementation while they share many social and educational similarities? Reasons for nationwide full implementation in schools There are various reasons—from financial, population, social, and management issues to computer and Internet availability—that have driven the rapid and compre- hensive implementation of filters in public schools. First, public schools have to implement CIPA because of societal pressures and the lobbying of parents to ensure students’ Internet safety. Almost all users of computers in schools are minors, the most vulnerable groups for Internet crimes and child pornography. Public schools in America have been the focus of public attention and scru- tiny for years, and the political and social responsibility of public schools for children’s Internet safety is huge. As a result, society has decided these students should be most strongly protected, and CIPA was implemented immediately and most widely at schools. Second, in contrast to public libraries (which average slightly less than eleven computers per library outlet), the typical number of computers in public schools ranges from one hundred to five hundred, which are needed to meet the needs of students and teachers for daily learning and teaching. Since the number of computers is quite large, the financial incentives of E-Rate funding are substantial and critical to the operation of the schools. This situation provides administrators in schools and school districts with the incentive to make decisions to implement CIPA as quickly and extensively as possible. Furthermore, the amount of money that E-Rate provides for schools in terms of technology is astounding. As was noted earlier, schools received over $12 billion from 2000 to 2003 alone. Schools likely would not be able to provide the necessary computers for students and teachers with- out the E-Rate funds. 
Third, the actual implementation procedure differs in schools and libraries: Schools are certified under the supervision of the local educational agencies such as school districts and state departments of education; libraries are certified within a library organization under the supervision of the IMLS. In other words, the cer- tification process at schools is directly and effectively controlled by school districts and state departments of education, following the same fundamental values of protecting children. The resistance to CIPA in schools has been very small in comparison to libraries. The primary concern raised has been the issue of educational equality. Concerns have been raised that filters in schools may create two classes of students—ones with only filtered access at school and ones who also can get unfiltered access at home.47 Reasons for more limited implementation in libraries In public libraries, the reasons for implementing CIPA are similar to those of public schools in many ways. Public libraries provide an average of 10.7 computers in each of the approximately seven thousand public libraries in the United States, which is a lot of technology that needs to be supported. The E-Rate and LSTA funds are vital to many libraries in the provision of computers and the Internet. Furthermore, with limited alternative sources of funding, the E-Rate and LSTA funds are hard to replace if they are not available. Given that the public libraries have become the guarantor of public access to comput- ing and the Internet, libraries have to find ways to ensure that patrons can access the Internet.48 Libraries also have to be concerned about protect- ing and providing a safe environment for younger patrons. While libraries serve patrons of all ages, one of the key social expectations of libraries is the provision of educational materials for children and young adults. Children’s sections of libraries almost always have com- puters in them. Much of the content blocked by filters is of little or no education value. As such, “defending unfil- tered Internet access was quite different from defending Catcher in the Rye.”49 Nevertheless, many libraries have fought against the filtering requirements of CIPA because they believe that it violates the principles of librarianship or for a number of other reasons. In 2008, 31.6 percent of public libraries refused to apply for E-Rate or LSTA funds specifically to avoid CIPA requirements, a substantial increase from the 15.3 percent of libraries that did not apply for E-Rate because of CIPA in 2006.50 As a result of defending patron’s rights to free access, the libraries that are not applying for E-Rate funds because of the requirements of CIPA are being forced to turn down the chance for fund- ing to help pay for Internet access in order to preserve community access to the Internet. Because many librar- ies feel that they cannot apply for E-Rate funds, local and regional discrepancies are occurring in the levels of Internet access that are available to patrons of public libraries in different parts of the country.51 For adult patrons who wish to access material on computers with filters, CIPA states that the library has the option of disabling the filters for “bona fide research or other lawful purposes” when adult patrons request such disabling. The law does not require libraries to 12 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 disable the filters for adult patrons, and the criteria for disabling of filters do not have a set definition in the law. 
The potential problems in the process of having the filters disabled are many and significant, including librarians not allowing the filters to be turned off, librarians not knowing how to turn the filters off, the filtering software being too complicated to turn off without injuring the performance of the workstation in other applications, or the filtering software being unable to be turned off in a reasonable amount of time.52

It has been estimated that approximately 11 million low-income individuals rely on public libraries to access online information because they lack Internet access at home or work.53 The E-Rate and LSTA programs have helped to make public libraries a trusted community source of Internet access, with the public library being the only source of free public Internet access available to all community residents in nearly 75 percent of communities in the United States.54 Therefore, usage of computers and the Internet in public libraries has continued to grow at a very fast pace over the past ten years.55 Thus public libraries are torn between the values of providing safe access for younger patrons and broad access for adult patrons who may have no other means of accessing the Internet.

CIPA, public policy, and further research

While the diverse implementations, effects, and levels of acceptance of CIPA across schools and libraries demonstrate the wide range of potential ramifications of the law, surprisingly little consideration is given to major assumptions in the law, including the appropriateness of the requirements to different age groups and the nature of information on the Internet. CIPA treats all users as if they are at the same level of maturity and need the same level of protection as a small child, as evidenced by the requirement that all computers in a library or school have filters regardless of whether children use a particular computer. In reality, children and adults interact in different social, physical, and cognitive ways with computers because of different developmental processes.56

CIPA fails to recognize that children as individual users are active processors of information and that children of different ages are going to be affected in divergent ways by filtering programs.57 Younger children benefit from more restrictive filters, while older children benefit from less restrictive filters. Moreover, filtering can be complemented by encouragement of frequent positive Internet usage and informal instruction to encourage positive use. Finally, children of all ages need a better understanding of the structure of the Internet to encourage appropriate caution in terms of online safety. The Internet represents a new social and cultural environment in which users simultaneously are affected by the social environment and also construct that environment with other users.58

CIPA also is based on fundamental misconceptions about information on the Internet. The Supreme Court’s decision upholding CIPA represents several of these misconceptions, adopting an attitude that “we know what is best for you” in terms of the information that citizens should be allowed to access.59 It assumes that schools and libraries select printed materials out of a desire to protect and censor rather than recognizing the basic reality that only a small number of print materials can be afforded by any school or library. The Internet frees schools and libraries from many of these costs.
Furthermore, the Court assumes that libraries should censor the Internet as well, ultimately upholding the same level of access to information for adult patrons and librarians in public libraries as students in public schools. These two major unexamined assumptions in the law certainly have played a part in the difficulty of implementing CIPA and in the resistance to the law. And this does not even address the problems of assuming that public libraries and public schools can be treated interchangeably in crafting legislation. These problem- atic assumptions point to a significantly larger issue: In trying to deal with the new situations created by the Internet and related technology, the federal government has significantly increased the attention paid to informa- tion policy.60 Over the past few years, government laws and standards related to information have begun to more clearly relate to social aspects of information technolo- gies such as the filtering requirements of CIPA.61 But the social, economic, and political ramifications for decisions about information policy are often woefully underexam- ined in the development of legislation.62 This paper has documented that many of the reasons for and statistics about CIPA implementation are avail- able by bringing together information from different social institutions. The biggest questions about CIPA are about the societal effects of the policy decisions: n Has CIPA changed the education and information- provision roles of libraries and schools? n Has CIPA changed the social expectations for libraries and schools? n Have adult patron information behaviors changed in libraries? n Have minor patron information behaviors changed in libraries? n Have student information behaviors changed in school? n How has CIPA changed the management of librar- ies and schools? n Will Congress view CIPA as successful enough to merit using libraries and schools as the means of enforcing other legislation? ONE LAw wITH TwO OuTCOMES | JAEGER AND YAN 13 But these social and administrative concerns are not the only major research questions raised by the imple- mentation of CIPA. Future research about CIPA not only needs to focus on the individual, institutional, and social effects of the law. It must explore the lessons that CIPA can provide to the process of creating and implementing information policies with significant societal implications. The most significant research issues related to CIPA may be the ones that help illuminate how to improve the legislative process to better account for the potential consequences of regulating information while the legislation is still being developed. Such cross-disciplinary analyses would be of great value as information becomes the center of an increasing amount of legislation, and the effects of this legislation have continually wider consequences for the flow of information through society. It could also be of great benefit to public schools and libraries, which, if CIPA is any indication, may play a large role in future legislation about public Internet access. References 1. Children’s Internet Protection Act (CIPA), Public Law 106- 554. 2. United States v. American Library Association, 539 U.S. 154 (2003). 3. American Library Association, Libraries Connect Communi- ties: Public Library Funding & Technology Access Study 2007–2008 (Chicago: ALA, 2008); Paul T. Jaeger, John Carlo Bertot, and Charles R. 
McClure, “The Effects of the Children’s Internet Protection Act (CIPA) in Public Libraries and its Implications for Research: A Statistical, Policy, and Legal Analysis,” Journal of the American Society for Information Science and Technology 55, no. 13 (2004): 1131–39; Paul T. Jaeger et al., “Public Libraries and Internet Access Across the United States: A Comparison by State from 2004 to 2006,” Information Technology and Librar- ies 26, no. 2 (2007): 4–14; Paul T. Jaeger et al., “CIPA: Decisions, Implementation, and Impacts,” Public Libraries 44, no. 2 (2005): 105–9; Zheng Yan, “Limited Knowledge and Limited Resources: Children’s and Adolescents’ Understanding of the Internet,” Journal of Applied Developmental Psychology (forthcoming); Zheng Yan, “Differences in Basic Knowledge and Perceived Education of Internet Safety between High School and Undergraduate Students: Do High School Students Really Benefit from the Children’s Internet Protection Act?” Journal of Applied Develop- mental Psychology (forthcoming); Zheng Yan, “What Influences Children’s and Adolescents’ Understanding of the Complexity of the Internet?,” Developmental Psychology 42 (2006): 418–28. 4. Martha M. McCarthy, “Filtering the Internet: The Chil- dren’s Internet Protection Act,” Educational Horizons 82, no, 2 (Winter 2004): 108. 5. Federal Communications Commission, In the Matter of Federal–State Joint Board on Universal Service: Children’s Internet Protection Act, FCC order 03-188 (Washington, D.C.: 2003). 6. CIPA. 7. Roth v. United States, 354 U.S. 476 (1957). 8. Miller v. California, 413 U.S. 15 (1973). 9. Roth v. United States. 10. CIPA. 11. CIPA. 12. Telecommunications Act of 1996, Public Law 104-104 (Feb. 8, 1996). 13. Paul T. Jaeger, Charles R. McClure, and John Carlo Ber- tot, “The E-Rate Program and Libraries and Library Consortia, 2000–2004: Trends and Issues,” Information Technology & Libraries 24, no. 2 (2005): 57–67. 14. Ibid. 15. Ibid. 16. American Library Association, “U.S. Supreme Court Arguments on CIPA Expected in Late Winter or Early Spring,” press release, Nov. 13, 2002, www.ala.org/ala/aboutala/hqops/ pio/pressreleasesbucket/ussupremecourt.cfm (accessed May 19, 2008). 17. Kelly Rodden, “The Children’s Internet Protection Act in Public Schools: The Government Stepping on Parents’ Toes?” Fordham Law Review 71 (2003): 2141–75. 18. John Carlo Bertot, Paul T. Jaeger, and Charles R. McClure, “Public Libraries and the Internet 2007: Issues, Implications, and Expectations,” Library & Information Science Research 30 (2008): 175–184; Charles R. McClure, Paul T. Jaeger, and John Carlo Bertot, “The Looming Infrastructure Plateau?: Space, Funding, Connection Speed, and the Ability of Public Libraries to Meet the Demand for Free Internet Access,” First Monday 12, no. 12 (2007), www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/ article/view/2017/1907 (accessed May 19, 2008). 19. McCarthy, “Filtering the Internet.” 20. Leigh S. Estabrook and Edward Lakner, “Managing Inter- net Access: Results of a National Survey,” American Libraries 31, no. 8 (2000): 60–62. 21. Alberta Davis Comer, “Studying Indiana Public Librar- ies’ Usage of Internet Filters,” Computers in Libraries (June 2005): 10–15; Thomas M. Reddick, “Building and Running a Collabora- tive Internet Filter is Akin to a Kansas Barn Raising,” Computers in Libraries 20, no. 4 (2004): 10–14. 22. Communications Decency Act of 1996, Public Law 104-104 (Feb. 8, 1996). 23. Child Online Protection Act (COPA), Public Law 105-277 (Oct. 21, 1998). 24. United States v. 
American Library Association. 25. R. Trevor Hall and Ed Carter, “Examining the Constitu- tionality of Internet Filtering in Public Schools: A U.S. Perspec- tive,” Education & the Law 18, no. 4 (2006): 227–45; McCarthy “Filtering the Internet.” 26. Library Services and Technology Act, Public Law 104-208 (Sept. 30, 1996). 27. John Wells and Laurie Lewis, Internet Access in U.S. Public Schools and Classrooms: 1994–2005, special report prepared at the request of the National Center for Education Statistics, Nov. 2006. 28. American Library Association, Libraries Connect Commu- nities; John Carlo Bertot, Charles R. McClure, and Paul T. Jaeger, “The Impacts of Free Public Internet Access on Public Library Patrons and Communities,” Library Quarterly 78, no. 3 (2008): 285–301; Jaeger et al., “CIPA.” 29. Wells and Lewis, Internet Access in U.S. Public Schools and Classrooms. 14 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 30. Ibid. 31. Jaeger, McClure, and Bertot, “The E-Rate Program and Libraries and Library Consortia.” 32. Jaeger et al., “CIPA.” 33. American Library Association, Libraries Connect Commu- nities. 34. Ibid. 35. Ibid. 36. Jaeger, McClure, and Bertot, “The E-Rate Program and Libraries and Library Consortia.” 37. Ibid. 38. Norman Oder, “$40 Million in E-Rate Funds Suspended: Delays Caused as FCC Requires New Accounting Standards,” Library Journal 129, no. 18 (2004): 16; Debra Lau Whelan, “E-Rate Funding Still Up in the Air: Schools, Libraries Left in the Dark about Discounted Funds for Internet Services,” School Library Journal 50, no. 11 (2004): 16. 39. Ken Foskett and Paul Donsky, “Hard Eye on City Schools’ Hardware,” Atlanta Journal-Constitution, May 25, 2004; Ken Fos- kett and Jeff Nesmith, “Wired for Waste: Abuses Tarnish E-rate Program,” Atlanta Journal-Constitution, May 24, 2004. 40. Jaeger, McClure, and Bertot, “The E-Rate Program and Libraries and Library Consortia.” 41. Department of Commerce, National Telecommunication and Information Administration, Children’s Internet Protection Act: Study of Technology Protection Measures in Section 1703, report to Congress (Washington, D.C.: 2003). 42. McCarthy, “Filtering the Internet.” 43. Paul T. Jaeger and Charles R. McClure, “Potential Legal Challenges to the Application of the Children’s Internet Protec- tion Act (CIPA) in Public Libraries: Strategies and Issues,” First Monday 9, no. 2 (2004), www.firstmonday.org/issues/issue9_2/ jaeger/index.html (accessed May 19, 2008). 44. Electronic Frontier Foundation, Internet Blocking in Public Schools (Washington, D.C.: 2004), http://w2.eff.org/Censor ship/Censorware/net_block_report (accessed May 19, 2008). 45. Adam Horowitz, “The Constitutionality of the Children’s Internet Protection Act,” St. Thomas Law Review 13, no. 1 (2000): 425–44. 46. Tanessa Cabe, “Regulation of Speech on the Internet: Fourth Time’s the Charm?” Media Law and Policy 11 (2002): 50–61; Adam Goldstein, “Like a Sieve: The Child Internet Pro- tection Act and Ineffective Filters in Libraries,” Fordham Intel- lectual Property, Media, and Entertainment Law Journal 12 (2002): 1187–1202; Horowitz, “The Constitutionality of the Children’s Internet Protection Act”; Marilyn J. Maloney and Julia Morgan, “Rock and A Hard Place: The Public Library’s Dilemma in Pro- viding Access to Legal Materials on the Internet While Restrict- ing Access to Illegal Materials,” Hamline Law Review 24, no. 2 (2001): 199–222; Mary Minow, “Filters and the Public Library: A Legal and Policy Analysis,” First Monday 2, no. 
12 (1997), www .firstmonday.org/issues/issue2_12/minnow (accessed May 19, 2008); Richard J. Peltz, “Use ‘the Filter You Were Born with’: The Unconstitutionality of Mandatory Internet Filtering for Adult Patrons of Public Libraries,” Washington Law Review 77, no. 2 (2002): 397–479. 47. McCarthy, “Filtering the Internet.” 48. John Carlo Bertot et al., “Public Access Computing and Internet Access in Public Libraries: The Role of Public Libraries in E-Government and Emergency Situations,” First Monday 11, no. 9 (2006), www.firstmonday.org/issues/issue11_9/bertot (accessed May 19, 2008); John Carlo Bertot et al., “Drafted: I want You to Deliver E-Government,” Library Journal 131, no. 13 (2006): 34–39; Paul T. Jaeger and Kenneth R. Fleischmann, “Public Libraries, Values, Trust, and E-Government,” Informa- tion Technology and Libraries 26, no. 4 (2007): 35–43. 49. Doug Johnson, “Maintaining Intellectual Freedom in a Filtered World,” Learning & Leading with Technology 32, no. 8 (May 2005): 39. 50. Bertot, McClure, and Jaeger, “The Impacts of Free Public Internet Access on Public Library Patrons and Communities.” 51. Jaeger et al., “Public Libraries and Internet Access Across the United States.” 52. Paul T. Jaeger et al., “The Policy Implications of Internet Connectivity in Public Libraries,” Government Information Quar- terly 23, no. 1 (2006): 123–41. 53. Goldstein, “Like a Sieve.” 54. Bertot, McClure, and Jaeger, “The Impacts of Free Public Internet Access on Public Library Patrons and Communities”; Jaeger and Fleischmann, “Public Libraries, Values, Trust, and E-Government.“ 55. Bertot, Jaeger, and McClure, “Public Libraries and the Internet 2007”; Charles R. McClure et al., “Funding and Expen- ditures Related to Internet Access in Public Libraries,” Informa- tion Technology & Libraries (forthcoming). 56. Zheng Yan and Kurt W. Fischer, “How Children and Adults Learn to Use Computers: A Developmental Approach,” New Directions for Child and Adolescent Development 105 (2004): 41–61. 57. Zheng Yan, “Age Differences in Children’s Understand- ing of the Complexity of the Internet,” Journal of Applied Devel- opmental Psychology 26 (2005): 385–96; Yan, “Limited Knowledge and Limited Resources”; Yan, “Differences in Basic Knowledge and Perceived Education of Internet Safety”; Yan, “What Influ- ences Children’s and Adolescents’ Understanding of the Com- plexity of the Internet?” 58. Patricia Greenfield and Zheng Yan, “Children, Adoles- cents, and the Internet: A New Field of Inquiry in Developmen- tal Psychology,” Developmental Psychology 42 (2006): 391–93. 59. John N. Gathegi, “The Public Library as a Public Forum: The (De)Evolution of a Legal Doctrine,” Library Quarterly 75 (2005): 12. 60. Sandra Braman, “Where Has Media Policy Gone? Defin- ing the Field in the 21st Century,” Communication Law and Policy 9, no. 2 (2004): 153–82; Sandra Braman, Change of State: Informa- tion, Policy, & Power (Cambridge, Mass.: MIT Pr., 2007); Charles R. McClure and Paul T. Jaeger, “Government Information Policy Research: Importance, Approaches, and Realities,” Library & Information Science Research 30 (2008): 257–64; Milton Mueller, Christiane Page, and Brendan Kuerbis, “Civil Society and the Shaping of Communication-Information Policy: Four Decades of Advocacy,” Information Society 20, no. 3 (2004): 169–85. 61. Paul T. 
Jaeger, “Information Policy, Information Access, and Democratic Participation: The National and International Implications of the Bush Administration’s Information Politics,” Government Information Quarterly 24 (2007): 840–59. 62. McClure and Jaeger, “Government Information Policy Research.” 3169 ---- A SEMANTIC MODEL OF SELECTIvE DISSEMINATION OF INFORMATION | MORALES-DEL-CASTILLO ET AL. 21 A Semantic Model of Selective Dissemination of Information for Digital Libraries J. M. Morales-del-Castillo, R. Pedraza-Jiménez, A. A. Ruíz, E. Peis, and E. Herrera-Viedma In this paper we present the theoretical and methodo- logical foundations for the development of a multi-agent Selective Dissemination of Information (SDI) service model that applies Semantic Web technologies for spe- cialized digital libraries. These technologies make pos- sible achieving more efficient information management, improving agent–user communication processes, and facilitating accurate access to relevant resources. Other tools used are fuzzy linguistic modelling techniques (which make possible easing the interaction between users and system) and natural language processing (NLP) techniques for semiautomatic thesaurus genera- tion. Also, RSS feeds are used as “current awareness bul- letins” to generate personalized bibliographic alerts. N owadays, one of the main challenges faced by information systems at libraries or on the Web is to efficiently manage the large number of docu- ments they hold. Information systems make it easier to give users access to relevant resources that satisfy their information needs, but a problem emerges when the user has a high degree of specialization and requires very specific resources, as in the case of researchers.1 In “tra- ditional” physical libraries, several procedures have been proposed to try to mitigate this issue, including the selec- tive dissemination of information (SDI) service model that make it possible to offer users potentially interesting documents by accessing users’ personal profiles kept by the library. Nevertheless, the progressive incorporation of new information and communication technologies (ICTs) to information services, the widespread use of the Internet, and the diversification of resources that can be accessed through the Web has led libraries through a process of reinvention and transformation to become “digital” libraries.2 This reengineering process requires a deep revision of work techniques and methods so librarians can adapt to the new work environment and improve the services provided. In this paper we present a recommendation and SDI model, implemented as a service of a specialized digital library (in this case, specialized in library and informa- tion science), that can increase the accuracy of accessing information and the satisfaction of users’ information needs on the Web. This model is built on a multi-agent framework, similar to the one proposed by Herrera-Viedma, Peis, and Morales-del-Castillo,3 that applies Semantic Web technologies within the specific domain of special- ized digital libraries in order to achieve more efficient information management (by semantically enriching dif- ferent elements of the system) and improved agent–agent and user–agent communication processes. Furthermore, the model uses fuzzy linguistic model- ling techniques to facilitate the user–system interaction and to allow a higher grade of automation in certain procedures. 
To increase improved automation, some natural language processing (NLP) techniques are used to create a system thesaurus and other auxiliary tools for the definition of formal representations of information resources. In the next section, “Instrumental basis,” we briefly analyze SDI services and several techniques involved in the Semantic Web project, and we describe the prelimi- nary methodological and instrumental bases that we used for developing the model, such as fuzzy linguistic model- ling techniques and tools for NLP. In “Semantic SDI serv- ice model for digital libraries,” the bulk of this work, the application model that we propose is presented. Finally, to sum up, some conclusive data are highlighted. n Instrumental basis Filtering techniques for SDI services Filtering and recommendation services are based on the application of different process-management techniques that are oriented toward providing the user exactly the information that meets his or her needs or can be of his or her interest. In textual domains, these services are usu- ally developed using multi-agent systems, whose main aims are n to evaluate and filter resources normally repre- sented in XML or HTML format; and n to assist people in the process of searching for and retrieving resources.4 J. M. Morales-del-Castillo (josemdc@ugr.es) is Assistant Professor of Information Science, Library and Information Science Department, University of granada, Spain. R. Pedraza- Jiménez (rafael.pedraza@upf.edu) is Assistant Professor of Information Science, Journalism and Audiovisual Communication Department, Pompeu Fabra University, Barcelona, Spain. A. A. Ruíz (aangel@ugr.es) is Full Professor of Information Science, Library and Information Science Department, University of granada. E. Peis (epeis@ugr.es) is Full Professor of Information Science, Library and Information Science Department, University of granada. E. Herrera-viedma (viedma@decsai.ugr.es) is Senior Lecturer in Computer Science, Computer Science and Artificial Intelligence Department, University of granada. 22 INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2009 Traditionally, these systems are classified as either content-based recommendation systems or collaborative recommendation systems.5 Content-based recommen- dation systems filter information and generate recom- mendations by comparing a set of keywords defined by the user with the terms used to represent the content of documents, ignoring any information given by other users. By contrast, collaborative filtering systems use the information provided by several users to recommend documents to a given user, ignoring the representation of a document’s content. It is common to group users into different categories or stereotypes that are characterized by a series of rules and preferences, defined by default, that represent the information needs and common behav- ioural habits of a group of related users. The current trend is to develop hybrids that make the most of content-based and collaborative recommendation systems. 
In the field of libraries, these services usually adopt the form of SDI services that, depending on the profiles of subscribed users, periodically (or when required by the user) generate a series of information alerts describing the resources in the library that fit a user's interests.6 SDI services have been studied in different research areas, such as the multi-agent systems development domain7 and, of course, the digital libraries domain.8

Presently, many SDI services are implemented on Web platforms based on a multi-agent architecture in which a set of intermediate agents compares users' profiles with the documents, and input-output agents deal with subscriptions to the service and display the generated alerts to users.9 Usually, the information is structured according to a certain data model, and users' profiles are defined using a series of keywords that are compared to descriptors or to the full text of the documents. Despite their usefulness, these services have some deficiencies:

- The communication processes between agents, and between agents and users, are hindered by the different ways in which information is represented.
- This heterogeneity in the representation of information makes it impossible to reuse such information in other processes or applications.

A possible solution to these deficiencies consists of enriching the information representation using a common vocabulary and data model that are understandable by humans as well as by software agents. The Semantic Web project takes this idea and provides the means to develop a universal platform for the exchange of information.10

Semantic Web technologies

The Semantic Web project tries to extend the model of the present Web by using a series of standard languages that enable enriching the description of Web resources and making them semantically accessible.11 To do that, the project rests on two fundamental ideas: (1) resources should be tagged semantically so that information can be understood both by humans and by computers, and (2) intelligent agents should be developed that are capable of operating at a semantic level with those resources and of inferring new knowledge from them (shifting from searching for keywords in a text to retrieving concepts).12

The semantic backbone of the project is the Resource Description Framework (RDF) vocabulary, which provides a data model to represent, exchange, link, add, and reuse structured metadata from distributed information sources, thereby making them directly understandable by software agents.13 RDF structures information into individual assertions (triples of resource, property, and property value) and uniquely characterizes resources by means of Uniform Resource Identifiers (URIs), allowing agents to make inferences about them using Web ontologies or other, simpler semantic structures, such as conceptual schemes or thesauri.14

Even though the adoption of the Semantic Web and its application to systems like digital libraries is not free from trouble (because of the nature of the technologies involved in the project and because of the project's ambitious objectives,15 among other reasons), the way these technologies represent information significantly improves the quality of the resources retrieved by search engines, and it also preserves platform independence, thus favouring the exchange and reuse of content.16

As we can see, the Semantic Web works with information written in natural
language that is structured in a way that can be interpreted by machines. For this reason, it is usually difficult to deal with problems that require operating with linguistic information that has a certain degree of uncertainty (e.g., when quantifying a user's satisfaction with a product or service). A possible solution is the use of fuzzy linguistic modelling techniques as a tool for improving system–user communication.

Fuzzy linguistic modelling

Fuzzy linguistic modelling supplies a set of approximate techniques appropriate for dealing with the qualitative aspects of problems.17 The ordinal linguistic approach is defined according to a finite set of labels S, completely ordered and with odd cardinality (seven or nine labels):

S = {si | i ∈ H = {0, …, T}}

The central term has a value of approximately 0.5, and the rest of the terms are arranged symmetrically around it. The semantics of each linguistic term is given by the ordered structure of the set of terms, considering that each linguistic term of the pair (si, sT-i) is equally informative. Each label si is assigned a fuzzy value defined in the interval [0,1] that is described by a linear trapezoidal membership function represented by the 4-tuple (ai, bi, αi, βi). (The first two parameters indicate the interval in which the membership value is 1.0; the third and fourth parameters indicate the left and right widths of the distribution.) Additionally, we need to define the following properties:

1. The set is ordered: si ≥ sj if i ≥ j.
2. There is a negation operator: Neg(si) = sj, with j = T - i.
3. There is a maximization operator: MAX(si, sj) = si if si ≥ sj.
4. There is a minimization operator: MIN(si, sj) = si if si ≤ sj.

It is also necessary to define aggregation operators, such as Linguistic Weighted Averaging (LWA),18 capable of operating with and combining linguistic information (a minimal code sketch of such a label set and its operators is given below).

Besides facilitating the interaction between users and the system, the other initial objective is to develop and implement the proposed model in the most automated way possible. To do this, we use a basic auxiliary tool, a thesaurus, that, among other tasks, assists users in the creation of their profiles and enables automating the generation of alerts. That is why it is critical to define the way in which we create this tool, and in this work we propose a specific method for the semiautomatic development of thesauri using NLP techniques.

NLP techniques and other automating tools

NLP consists of a series of linguistic techniques, statistical approaches, and machine-learning algorithms (mainly clustering techniques) that can be used, for example, to summarize texts automatically, to develop automatic translators, and to create voice-recognition software. Another possible application of NLP is the semiautomatic construction of thesauri using different techniques. One of them consists of determining the lexical relations between the terms of a text (mainly synonymy, hyponymy, and hyperonymy)19 and extracting the terms that are most representative of the text's specific domain.20 It is possible to elicit these relations by using linguistic tools, like Princeton's WordNet (http://wordnet.princeton.edu), and clustering techniques.
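As a purely illustrative rendering of the ordinal fuzzy linguistic approach described above, the sketch below defines a seven-label set with linear trapezoidal membership functions and the Neg, MAX, and MIN operators. The label names and trapezoid parameters are assumptions chosen for the example; the model itself does not prescribe these particular values.

# Illustrative ordinal linguistic label set S = {s_0, ..., s_T} with linear
# trapezoidal membership functions and the Neg, MAX, and MIN operators.
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:
    index: int    # position i in the ordered set
    name: str
    a: float      # start of the plateau where membership is 1.0
    b: float      # end of the plateau
    alpha: float  # left width of the trapezoid
    beta: float   # right width of the trapezoid

    def membership(self, x):
        """Linear trapezoidal membership value of x in [0, 1]."""
        if self.a <= x <= self.b:
            return 1.0
        if self.a - self.alpha < x < self.a:
            return (x - (self.a - self.alpha)) / self.alpha
        if self.b < x < self.b + self.beta:
            return ((self.b + self.beta) - x) / self.beta
        return 0.0

# Seven evenly spaced labels over [0, 1]; with a == b the trapezoids reduce
# to triangles, and the central label sits at approximately 0.5.
NAMES = ["never", "almost_never", "rarely", "occasionally", "often", "almost_always", "always"]
T = len(NAMES) - 1
S = [Label(i, name, i / T, i / T, 1 / T, 1 / T) for i, name in enumerate(NAMES)]

def neg(s):             # Neg(s_i) = s_(T - i)
    return S[T - s.index]

def max_label(si, sj):  # MAX(s_i, s_j) = s_i if s_i >= s_j
    return si if si.index >= sj.index else sj

def min_label(si, sj):  # MIN(s_i, s_j) = s_i if s_i <= s_j
    return si if si.index <= sj.index else sj

if __name__ == "__main__":
    print(neg(S[1]).name)        # almost_always
    print(S[3].membership(0.5))  # 1.0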
WordNet is a powerful multilanguage lexical database where each of its entries is defined, among other elements, by its synonyms (synsets), hyponyms, and hyperonyms.21 As a consequence, once given the most important terms of a domain, WordNet can be used to create a thesaurus from them (after leaving out all terms that have not been identified as belonging or related to the domain of interest).22 This tool can also be used with clustering techniques, for example, to group the documents of a collection into a set of nodes or clusters depending on their similarity. Each of these clusters is described by the most representative terms of its documents. These terms make up the most specific level of a thesaurus and are used to search WordNet for their synonyms and more general terms, contributing (through the repetition of this procedure) to the bottom-up development of the thesaurus.23 (A short NLTK-based sketch of these WordNet lookups appears below.) Although there are many others, these are some of the best-known techniques of semiautomatic thesaurus generation (semiautomatic because, needless to say, the supervision of experts is necessary to determine the validity of the final result).

For specialized digital libraries, we propose developing, on a multi-agent platform and using all these tools, SDI services capable of generating alerts and recommendations for users according to their personal profiles. In particular, the model presented here is the result of merging several previous models, and its service is based on the definition of "current-awareness bulletins," where users can find a basic description of the resources recently acquired by the library or of those that might be of interest to them.24

The semantic SDI service model for digital libraries

The SDI service includes two agents (an interface agent and a task agent) distributed in a four-level hierarchical architecture: user level, interface level, task level, and resource level.

Its main components are a repository of full-text documents (which make up the stock of the digital library) and a series of elements described using different RDF-based vocabularies: one or several RSS feeds that play a role similar to that of current-awareness bulletins in traditional libraries; a repository of recommendation log files that store the recommendations made by users about the resources; and a thesaurus that lists and hierarchically relates the most relevant terms of the library's specialization domain.25 Also, the semantics of each element (that is, its characteristics and the relations the element establishes with other elements in the system) are defined in a Web ontology developed in Web Ontology Language (OWL).26

Next, we describe these main elements as well as the different functional modules that the system uses to carry out its activity.

Elements of the model

There are four basic elements that make up the system: the thesaurus, user profiles, RSS feeds, and recommendation log files.

Thesaurus

An essential element of this SDI service is the thesaurus, an extensible tool used in traditional libraries that enables organizing the most relevant concepts in a specific domain and defining the semantic relations established between them, such as equivalence, hierarchical, and associative relations. The functions defined for the thesaurus in our system include helping in the indexing of RSS feed items and in the generation of information alerts and recommendations.
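The WordNet lookups involved in the relation-extraction step described above can be approximated with NLTK's WordNet interface. The sketch below gathers synonyms, hypernyms (more general terms), and hyponyms (more specific terms) for a candidate domain term; it is a simplified stand-in for the semiautomatic procedure (which also relies on clustering and expert validation), and the example term is arbitrary.

# Sketch: retrieve synonyms, hypernyms, and hyponyms for a candidate
# thesaurus term using NLTK's WordNet interface.
# Requires the WordNet corpus: nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def wordnet_relations(term):
    """Collect lemma names of synonyms, hypernyms, and hyponyms for a term."""
    synonyms, hypernyms, hyponyms = set(), set(), set()
    for synset in wn.synsets(term):
        synonyms.update(lemma.name() for lemma in synset.lemmas())
        for hyper in synset.hypernyms():
            hypernyms.update(lemma.name() for lemma in hyper.lemmas())
        for hypo in synset.hyponyms():
            hyponyms.update(lemma.name() for lemma in hypo.lemmas())
    return {"synonyms": synonyms, "hypernyms": hypernyms, "hyponyms": hyponyms}

if __name__ == "__main__":
    relations = wordnet_relations("thesaurus")
    print(sorted(relations["hypernyms"]))

In a full workflow, terms unrelated to the library's specialization domain would then be discarded and the surviving relations reviewed by domain experts.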
To create the thesaurus, we followed the method suggested by Pedraza-Jiménez, Valverde-Albacete, and Navia-Vázquez.27 The learning technique used for the creation of a thesaurus includes four phases: preprocessing the documents, parameterizing the selected terms, conceptualizing their lexical stems, and generating a lattice or graph that shows the relations between the identified concepts.

Essentially, the aim of the preprocessing phase is to prepare the documents' parameterization by removing elements regarded as superfluous. We have developed this phase in three stages: eliminating tags (stripping), standardizing, and stemming. In the first stage, all the tags (HTML, XML, etc.) that can appear in the collection of documents are eliminated. The second stage is the standardization of the words in the documents in order to facilitate and improve the parameterization process. At this stage, the acronyms and N-grams (bigrams and trigrams) that appear in the documents are identified using lists created for that purpose. Once the acronyms and N-grams have been detected, the rest of the text is standardized: dates and numerical quantities are replaced with a marker that identifies them, all terms (except acronyms) are converted to lowercase, and punctuation marks are removed. Finally, a list of function words is used to eliminate from the texts articles, determiners, auxiliary verbs, conjunctions, prepositions, pronouns, interjections, contractions, and degree adverbs.

All the terms are stemmed to facilitate the search for final terms and to improve their quantification during parameterization. To carry out this task, we have used Morphy, the stemming algorithm used by WordNet. This algorithm implements a group of functions that check whether a term is an exception that does not need to be stemmed and then convert words that are not exceptions to their basic lexical form. Terms that appear in the documents but are not identified by Morphy are eliminated from our experiment.

The parameterization phase is of minimal complexity. Once identified, the final terms (roots or bases) are quantified by being assigned a weight. Such a weight is obtained by applying the term frequency-inverse document frequency (tf-idf) scheme, a statistical measure that quantifies the importance of a term or N-gram in a document depending on its frequency of appearance in that document and in the collection the document belongs to (a bare-bones sketch of this weighting appears below).

Finally, once the documents have been parameterized, the associated meanings of each term (lemma) are extracted by searching for them in WordNet (specifically, we use WordNet 2.1 for UNIX-like systems). Thus we get the group of synsets associated with each word. The groups of hyperonyms and hyponyms are also extracted from the vocabulary of the analyzed collection of documents.

The generation of our thesaurus, that is, the identification of the descriptors that best represent the content of the documents and of the underlying relations between them, is achieved using formal concept analysis techniques. This categorization technique uses the theory of lattices and ordered sets to find abstraction relations from the groups it generates. Furthermore, this technique enables clustering the documents depending on the terms (and synonyms) they contain.
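The parameterization step can be illustrated with a bare-bones tf-idf computation, using WordNet's morphy function for the stemming stage as described above. This sketch omits the acronym and N-gram lists, date and number normalization, and function-word filtering of the full procedure, so it is a simplified approximation rather than the authors' implementation.

# Simplified tf-idf weighting over already tokenized documents, using
# WordNet's morphy function for the stemming stage (terms that morphy cannot
# resolve are dropped, as in the procedure described above).
# Requires the WordNet corpus: nltk.download("wordnet")
import math
from collections import Counter
from nltk.corpus import wordnet as wn

def stem(tokens):
    """Reduce tokens to their base lexical form with morphy; drop unknown terms."""
    stems = (wn.morphy(token.lower()) for token in tokens)
    return [s for s in stems if s is not None]

def tf_idf(documents):
    """Return one {term: weight} dict per document (raw tf, idf = log(N / df))."""
    stemmed = [stem(doc) for doc in documents]
    n_docs = len(stemmed)
    doc_freq = Counter(term for doc in stemmed for term in set(doc))
    weights = []
    for doc in stemmed:
        tf = Counter(doc)
        weights.append({term: count * math.log(n_docs / doc_freq[term])
                        for term, count in tf.items()})
    return weights

if __name__ == "__main__":
    documents = [["libraries", "manage", "thesauri"],
                 ["thesauri", "organize", "concepts"]]
    print(tf_idf(documents))  # terms shared by every document receive weight 0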
Also, a lattice graph is generated according to the underlying relations between the terms of the collection, taking into account the hyperonyms and hyponyms extracted. In that graph, each node represents a descriptor (namely, a group of synonym terms) and clusters the set of documents that contain it, linking them to those with which it has any relation (of hyponymy or hyperonymy). Once the thesaurus is obtained by identifying its terms and the underlying relations between them, it is automatically represented using the Simple Knowledge Organization System (SKOS) vocabulary (see figure 1).28

User profiles

User profiles can be defined as structured representations that contain the personal data, interests, and preferences of users, with which agents can operate to customize the SDI service. In the model proposed here, these profiles are basically defined with Friend of a Friend (FOAF), a specific RDF/XML vocabulary for describing people (which favours profile interoperability, since it is a widespread vocabulary supported by an OWL ontology), and another nonstandard vocabulary of our own to define fields not included in FOAF (see figure 2).29

Profiles are generated the moment the user registers with the system, and they are structured in two parts: a public profile that includes data related to the user's identity and affiliation, and a private profile that includes the user's interests and preferences about the topics of the alerts he or she wishes to receive. To define their preferences, users must specify keywords and concepts that best define their information needs. Later, the system compares those concepts with the terms in the thesaurus, using the edit tree algorithm as a similarity measure.30 This function matches character strings and returns either the term introduced (if there is an exact match) or the lexically most similar term (if not). Consequently, if the suggested term satisfies the user's expectations, it is added to the user's profile together with its synonyms (if any). In those cases where the suggested term is not satisfactory, the system must provide a tool or application that enables users to browse the thesaurus and select the terms that better describe their needs. An example of this type of application is ThManager (http://thmanager.sourceforge.net), a project of the Universidad de Zaragoza, Spain, that enables editing, visualizing, and browsing structures defined in SKOS.

Each of the terms selected by the user to define his or her areas of interest has an associated linguistic frequency value (tagged as <freq>) that we call the "satisfaction frequency." It represents the regularity with which a particular preference value has been used in alerts positively evaluated by the user. This frequency measures the relative importance of the preferences stated by the user and allows the interface agent to generate a ranked list of results. The range of possible values for these frequencies is defined by a group of seven labels obtained from the fuzzy linguistic variable "Frequency," whose expression domain is defined by the linguistic term set S = {always, almost_always, often, occasionally, rarely, almost_never, never}, with one of these labels serving as the default value and "occasionally" being the central value.
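To make the profile-building step concrete, the following sketch matches a user-supplied keyword against the thesaurus terms, returning the exact term when present and otherwise the lexically closest candidate. It uses the standard library's difflib as a generic string-similarity stand-in for the edit tree measure cited above, so it illustrates the idea rather than reproducing the exact algorithm; the sample terms are invented.

# Sketch: match a user keyword to the closest thesaurus term, falling back
# to difflib string similarity when there is no exact match.
import difflib

def match_term(keyword, thesaurus_terms, cutoff=0.6):
    """Return the keyword's thesaurus term if present, else the most
    lexically similar term above the cutoff, else None."""
    key = keyword.strip().lower()
    terms = {term.lower(): term for term in thesaurus_terms}
    if key in terms:
        return terms[key]
    candidates = difflib.get_close_matches(key, terms.keys(), n=1, cutoff=cutoff)
    return terms[candidates[0]] if candidates else None

if __name__ == "__main__":
    thesaurus = ["Library management", "Information retrieval", "Metadata"]
    print(match_term("library managment", thesaurus))  # -> "Library management"

If no candidate clears the cutoff, the user would fall back to browsing the thesaurus (e.g., with a SKOS browser such as ThManager) to pick a better term.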
<skos:Concept rdf:about="7">
  <skos:inScheme rdf:resource="http://www.ugr.es/…/thes/"/>
  <skos:prefLabel xml:lang="es">Proceedings</skos:prefLabel>
  <skos:broader rdf:resource="http://www.ugr.es/…/thes/668"/>
  <skos:narrower rdf:resource="http://www.ugr.es/…/thes/286"/>
  <skos:narrower rdf:resource="http://www.ugr.es/…/thes/830"/>
</skos:Concept>

Figure 1. Sample entry of a SKOS Core thesaurus

<foaf:PersonalProfileDocument rdf:about="">
  <foaf:maker rdf:resource="#person"/>
  <foaf:primaryTopic rdf:resource="#person"/>
</foaf:PersonalProfileDocument>
<foaf:Person rdf:ID="user_09234">
  <foaf:name>Diego Allione</foaf:name>
  <foaf:title>Sr.</foaf:title>
  <foaf:mbox_sha1sum>af9fa7601df46e95566</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://allione.org"/>
  <foaf:depiction rdf:resource="allione.jpg"/>
  <foaf:phone rdf:resource="tel:555-432-432"/>
  <dfss:topic>
    <dfss:pref rdf:nodeID="pref_09234-1">
      <rdfs:label>Library management</rdfs:label>
      <dfss:relev>0.83</dfss:relev>
    </dfss:pref>
  </dfss:topic>
</foaf:Person>

Figure 2. User profile sample

RSS feeds

Thanks to the popularization of blogs, there has been widespread use of several vocabularies specifically designed for the syndication of content (that is, for making the content of a website accessible to other Internet users by means of hyperlink lists called "feeds"). To create our current-awareness bulletin we use RSS 1.0, a vocabulary that enables managing hyperlink lists in an easy and flexible way. It utilizes the RDF/XML syntax and data model and is easily extensible because of its use of modules, which enable extending the vocabulary without modifying its core each time new descriptive elements are added. In this model several modules are used: the Dublin Core (DC) module, to define the basic bibliographic information of the items using the elements established by the Dublin Core Metadata Initiative (http://dublincore.org); the syndication module, to help software agents synchronize and update RSS feeds; and the taxonomy module, to assign topics to feed items.

The structure of the feeds comprises two areas: one where the channel itself is described by a series of basic metadata, such as a title, a brief description of the content, and the updating frequency; and another where the descriptions of the items that make up the feed are defined (see figure 3), including elements such as title, author, summary, hyperlink to the primary resource, date of creation, and subjects.

Recommendation log file

Each document in the repository has an associated recommendation log file in RDF that lists the evaluations assigned to that resource by different users since the resource was added to the system. Each entry of a recommendation log file consists of a recommendation value, a URI that identifies the user who made the recommendation, and the date of the record (see figure 4). The expression domain of the recommendations is defined by the following set of five fuzzy linguistic labels, extracted from the linguistic variable "Quality of the resource": Q = {Very_low, Low, Medium, High, Very_high}.

These elements are the raw materials that enable the SDI service to carry out its activity through four processes or functional modules: the profiles updating process, the RSS feeds generation process, the alert generation process, and the collaborative recommendation process.
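As one plausible illustration of how the alert generation process could draw on these elements, the sketch below selects the RSS feed items whose subjects overlap with the preference terms stored in a user profile. The data structures and the simple subject-overlap rule are assumptions made for this example and are not taken from the model's specification.

# Illustrative alert generation: select feed items whose subjects overlap
# with the preference terms stored in a user profile (assumed structures).

def generate_alert(profile, feed_items):
    """Return the feed items matching at least one of the user's preference terms."""
    preferences = {term.lower() for term in profile["preferences"]}
    alert = []
    for item in feed_items:
        subjects = {subject.lower() for subject in item.get("subjects", [])}
        if preferences & subjects:
            alert.append(item)
    return alert

if __name__ == "__main__":
    profile = {"user": "user_09234", "preferences": ["Library management", "Thesauri"]}
    items = [
        {"title": "Broadcasting and the Internet", "subjects": ["Virtual communities"]},
        {"title": "Managing small libraries", "subjects": ["Library management"]},
    ]
    for item in generate_alert(profile, items):
        print(item["title"])  # prints the second item only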
<item rdf:about="http://www.ugr.es/…/doc-00000528">
  <dc:creator>Escudero Sánchez, Manuel</dc:creator>
  <dc:creator>Fernández Cáceres, José Luis</dc:creator>
  <title>Broadcasting and the Internet</title>
  [further elements: link to http://eprints.rclis.org/…/AudioVideo_good.pdf; description "This paper is about…"; date 2002; source REDOC, 8 (4), 2008; subject "Virtual communities"]
</item>

Figure 3. RSS feed item sample

<recomm-log rdf:ID="log-00528">
  <doc rdf:resource="http://doc.es/doc-0A15"/>
  <items_e>
    <item rdf:nodeID="item-000A901">
      <user rdf:resource="http://user.es/001"/>
      <date>14/03/2007</date>
      <recomm>High</recomm>
    </item>
  </items_e>
</recomm-log>

Figure 4. Recommendation log file sample

System processes

Profiles updating process

Since the SDI service's functions are based on generating passive searches to RSS feeds from the preferences stored in a user's profile, updating the profiles becomes a critical task. User profiles are meant to store long-term preferences, but the system must be able to detect any subtle change in these preferences over time in order to offer accurate recommendations.

In our model, user profiles are updated using a simple mechanism that enables finding users' implicit preferences by applying fuzzy linguistic techniques and taking into account the feedback users provide. Users are asked about their degree of satisfaction (e_j) with the information alert generated by the system (i.e., whether the items retrieved are interesting or not). This satisfaction degree is obtained from the linguistic variable "Satisfaction," whose expression domain is the set of linguistic labels S′ = {Total, Very_high, High, Medium, Low, Very_low, Null}.

This mechanism updates the satisfaction frequency associated with each user preference according to the satisfaction degree e_j. It requires the use of a matching function similar to those used to model threshold weights in weighted search queries.31 The function proposed here rewards the frequencies associated with the preference values present when the resources assessed are satisfactory, and it penalizes them when this assessment is negative. Let e_j ∈ S′ be the degree of satisfaction and f_i^l ∈ S the frequency of property i (in this case, i = "Preference") with value l; we then define the updating function g: S′ × S → S as g(e_j, f_i^l) =